New partnership with Clarivate to help oaDOI find even more Open Access

We’re excited to announce a new partnership with Clarivate Analytics! 

This partnership between Impactstory and Clarivate will help fund better coverage of Open Access in the oaDOI database. The improvements will grow our index of free-to-read fulltext copies, bringing the total to more than 18 million, within 86 million article records altogether. All this data will continue to be freely accessible to everyone via our open API.

The partnership with Clarivate Analytics will put oaDOI data in front of users at thousands of new institutions, by integrating our index into the popular Web of Science system.  The oaDOI API is already in use by more than 700 libraries via SFX, and delivers more than 500,000 fulltext articles to users worldwide every day.  It also powers the free Unpaywall browser extension, used by over seventy thousand people in 145 countries.  
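For a sense of how an open API like this gets used, here's a minimal sketch of a DOI lookup. The base URL and the response field names (`results`, `free_fulltext_url`) are assumptions for illustration, not the documented API contract, so check the live API docs before building on them:

```python
# Hypothetical sketch of looking up one DOI against the oaDOI API.
# The base URL and JSON field names are assumptions, not the real spec.

OADOI_BASE = "https://api.oadoi.org"  # assumed base URL

def request_url(doi):
    """Build the lookup URL for a single DOI."""
    return f"{OADOI_BASE}/{doi}"

def best_fulltext_url(response_json):
    """Pull a free-fulltext link out of a (mocked) response, if any."""
    for result in response_json.get("results", []):
        url = result.get("free_fulltext_url")
        if url:
            return url
    return None

# Example with a mocked response (values are illustrative):
sample = {"results": [{"doi": "10.1038/nature12373",
                       "free_fulltext_url": "https://europepmc.org/..."}]}
print(request_url("10.1038/nature12373"))
print(best_fulltext_url(sample))
```

A real client would fetch `request_url(doi)` over HTTP and feed the parsed JSON to `best_fulltext_url`.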

You can read more about the partnership in Clarivate’s press release.  We’ll be sharing more details about improvements in the coming months.  Exciting!

Introducing Unpaywall: unlock paywalled research papers as you browse

Last Friday night we tweeted about a new Chrome extension we’ve been working on. It’s called Unpaywall, and it links you to free fulltext as you browse research articles. Hit a paywall? No problem: click the green tab and read it free.

Unpaywall is powered by an index of over ten million legally-uploaded, open-access resources, and it delivers. For example, in a set of 11k recent cancer research articles covered in mainstream media, Unpaywall users were able to read around half of them for free–even without any subscription, and even though most of them were paywalled.

So far the response to Friday’s tweet has been amazing — 500 retweets, and in just a few days we’ve gotten more than 1500 installations. Hockey-stick growth!  🙂


And we’ve also gotten rave reviews, like this one from Sarah:

Why the excitement?  Finding free, legal, open access is now super easy — it happens automatically.  With the Unpaywall extension, links to open access are automatically available as you browse.

This is useful for researchers like Ethan.  It’s also really helpful for people outside academia, who don’t enjoy the expensive subscription benefits of institutional libraries. This is especially true for nonprofits:

…. and folks working to communicate scholarship to a broader audience:

Go give it a try and see what you think! The official release is April 4th, but you can already install it, learn more, and follow @unpaywall. We’d love your help spreading the word about Unpaywall to your friends and colleagues. Together we can accelerate toward a future of full #openaccess for all!




Behind the scenes: cleaning dirty data

Dirty Data.  It’s everywhere!  And that’s expected and ok and even frankly good imho — it happens when people are doing complicated things, in the real world, with lots of edge cases, and moving fast.  Perfect is the enemy of good.


Alas, it’s definitely behind-the-scenes work to find and fix dirty data problems, which means none of us learn from each other in the process.  So — here’s a quick post about a dirty data issue we recently dealt with 🙂  Hopefully it’ll help you feel camaraderie, and maybe help some people using the BASE data.

We traced some oaDOI bugs to dirty records from PMC in the BASE open access aggregation database.

Most PMC records in BASE are really helpful — they include the title, author, and link to the full text resource in PMC.  For example, this record lists valid PMC and PubMed URLs:

and this one lists the PMC and DOI URLs:

The vast majority of PMC records in BASE look like this.  So until last week, to find PMC article links for oaDOI we looked up article titles in BASE and used the URL listed there to point to the free resource.

But!  We learned!  There is sometimes a bug!  This record has a broken PMC URL — the link it lists has no PMC ID in it (see, look at the URL — there’s nothing about it that points to a specific article, right?).  To get the PMC link you’d have to follow the PubMed link and then click through to PMC from there.  (Which does exist — here’s the PMC page we wish the BASE record had pointed to.)

That’s some dirty data.  And it gets worse.  Sometimes there is no PubMed link at all, like this one (a correct PMC link exists):

and sometimes there is no valid URL, so there’s really no way to get there from here:

(Pretty cool that PMC lists this article from 1899, eh?  Edge cases for papers published more than 100 years ago seem fair, I’ve gotta admit 🙂 )

Anyway.  We found this dirty PMC data in BASE is infrequent, but common enough to cause more bugs than we’re comfortable with.  To work around the dirty data we’ve added a step — oaDOI now uses the DOI->PMCID lookup file offered by PMC to find PMC articles we might otherwise miss.  It adds a bit more complexity, but it’s worth it in this case.
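The workaround sketches out simply: load the DOI->PMCID mapping into a dict, then resolve DOIs directly to PMC article URLs instead of trusting the URLs in BASE records. The CSV column names below are assumptions based on PMC's published ID-mapping file, and the sample rows are made up:

```python
import csv
import io

# Sketch of resolving DOIs through a DOI->PMCID mapping file rather
# than trusting the URLs in BASE records. Column names are assumed.

sample_csv = """DOI,PMCID
10.1093/nar/gks1195,PMC3531190
10.1371/journal.pone.0000000,PMC1234567
"""

def load_doi_to_pmcid(fileobj):
    """Read the mapping file into a dict keyed by lowercased DOI."""
    return {row["DOI"].lower(): row["PMCID"]
            for row in csv.DictReader(fileobj) if row.get("DOI")}

def pmc_url(doi, lookup):
    """Return the PMC article URL for a DOI, or None if unmapped."""
    pmcid = lookup.get(doi.lower())
    if pmcid:
        return f"https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/"
    return None

lookup = load_doi_to_pmcid(io.StringIO(sample_csv))
print(pmc_url("10.1093/nar/gks1195", lookup))
```

The real mapping file is large, so in production you'd build the dict once and cache it rather than re-parsing per lookup.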



So, that’s This Week In Dirty Data from oaDOI!  🙂  Tune in next week for, um, something else 🙂

And don’t forget Open Data Day is Saturday March 4, 2017.   Perfect is the enemy of the good — make it open.

oaDOI integrated into the SFX link resolver

We’re thrilled to announce that oaDOI is now available for integration with the SFX link resolver. SFX, like other OpenURL link resolvers, makes sure that when library users click a link to a scholarly article, they are directed to a copy the library subscribes to, so they can read it.

But of course, sometimes the library doesn’t subscribe. This is where oaDOI comes to the rescue. We check our database of over 80 million articles to see if there’s a Green Open Access version of that article somewhere. If we find one, the user gets directed there so they can read. Adding oaDOI to SFX is like adding ten million open-access articles to a library’s holdings, and it results in a lot more happy users, and a lot more readers finding full text instead of paywalls. Which is kind of our thing.

The best part is, it’s super easy to set up, and of course completely free. Since SFX is used today by over 2000 institutions, we’re really excited about how big a difference this can make.

Edited March 28, 2017: There are now over 600 libraries worldwide using the oaDOI integration, and we’re handling over a million requests for fulltext every day.


Introducing oaDOI: resolve a DOI straight to OA

Most papers that are free-to-read are available thanks to “green OA” copies posted in institutional or subject repositories.  The fact that these copies are available for free is fantastic, because anyone can read the research. But it does present a major challenge: given the DOI of a paper, how can we find the open version among so many different repositories?

The obvious answer is “Google Scholar” 🙂  And yup, that works great, and given the resources of Google will probably always be the most comprehensive solution.  But Google’s interface requires an extra search step, and its data isn’t open for others to build tools on top of.

We made a thing to fix that.  Introducing oaDOI:

We look for open copies of articles using the following data sources:

  • The Directory of Open Access Journals to see if it’s in their index of OA journals.
  • CrossRef’s license metadata field, to see if the publisher has reported an open license.
  • Our own custom list of DOI prefixes, to see if it’s in a known preprint repository.
  • DataCite, to see if it’s an open dataset.
  • The wonderful BASE OA search engine to see if there’s a Green OA copy of the article. BASE indexes 90mil+ open documents in 4000+ repositories by harvesting OAI-PMH metadata.
  • Repository pages directly, in cases where BASE was unable to determine openness.
  • Journal article pages directly, to see if there’s a free PDF link (this is great for detecting hybrid OA).
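The list above amounts to a cascade: try each source in order and stop at the first one that reports an open copy. A toy sketch of that shape, where the detector functions are stand-ins for the real lookups (the names and the sample return value are made up):

```python
# Sketch of the source cascade described above. Each detector is a
# stand-in: it returns a fulltext URL if its source knows of an open
# copy, or None otherwise. Names and values here are illustrative.

def check_doaj(doi): return None               # journal in DOAJ?
def check_crossref_license(doi): return None   # open license reported?
def check_preprint_prefix(doi): return None    # known preprint prefix?
def check_datacite(doi): return None           # open dataset?
def check_base(doi):                           # Green OA copy via BASE?
    return "https://repository.example.edu/pdf" if doi == "10.1234/example" else None
def check_repository_page(doi): return None    # scrape repository page
def check_journal_page(doi): return None       # free PDF on journal page?

DETECTORS = [check_doaj, check_crossref_license, check_preprint_prefix,
             check_datacite, check_base, check_repository_page,
             check_journal_page]

def find_open_copy(doi):
    """Return the first open-copy URL any detector finds, else None."""
    for detect in DETECTORS:
        url = detect(doi)
        if url:
            return url
    return None

print(find_open_copy("10.1234/example"))
```

Ordering the cheap, authoritative sources first keeps most lookups fast; the page-scraping detectors only run when everything else comes up empty.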

oaDOI was inspired by the really cool DOAI.  oaDOI is a wrapper around the OA detection used by Impactstory. It’s open source of course, can be used as a lookup engine in Zotero, and has an easy and powerful API that returns license data and other good stuff.

Check it out, let us know what you think (@oadoi_org), and help us spread the word!

Data-driven decisions with Net Promoter Score

Today we’re releasing some changes in the way users sign up for Impactstory profiles, based on research we’ve done to learn more about our users. It’s a great opportunity to share a little about what we learned, and to describe the process we used to do this research–both to add some transparency around our decision making, and to maybe help folks looking to do the same sorts of things. There’s lots to share, so let’s get to it:

Meet the Net Promoter Score

As part of our journey to find product-market fit for the Impactstory webapp, we’ve become big fans of the Net Promoter Score (NPS), an increasingly popular way to assess how much value users are getting from one’s product. It’s appealingly simple: we ask users to rate how likely they’d be to recommend Impactstory to a colleague, on a scale of 1-10, and why. Answers of 9-10 are Promoters, and 1-6 are Detractors. You subtract %detractors from %promoters and there’s your score.
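In code, the calculation is just a couple of counts: scores of 9-10 count as promoters, scores of 6 and below as detractors, and scores of 7-8 (the "passives") are ignored:

```python
# Net Promoter Score: %promoters minus %detractors.
# Promoters score 9-10; detractors score 6 or below; 7-8 are passives.

def nps(scores):
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))

# 5 promoters, 3 detractors, 2 passives out of 10 responses -> 20
print(nps([10, 9, 9, 7, 8, 6, 3, 10, 5, 9]))
```

The score runs from -100 (all detractors) to +100 (all promoters), which is why a 26 reads as "good but not amazing" below.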

It’s a useful score. It doesn’t measure how much users like you. It doesn’t measure how much they generally support the idea of what you’re doing. It measures how much you are solving real problems for real users, right now. Solving those problems so well that users will put their own reputation on the line and sing your praises to their friends.

Until we’re doing that, we don’t have product-market fit, we aren’t truly making something people want, and we don’t have a sustainable business. Same as any startup.

As a nonprofit, we’ve got lots of people who support what we’re doing and (correctly!) see that we’re solving a huge problem for academia as a whole. So they’ve got lots of good things to say to us. Which: yay. That’s fuel and we love it. But it can disguise the fact that we may not be solving their personal problems. We need to get at that signal, to help us find that all-important product-market fit.

Getting the data

We used a dedicated tool to manage creating, sending, and collecting the email surveys; it just works and saved us a ton of time, and we recommend it.  Our response rate was 28%, which we figure is pretty good for asking for help via email from people who don’t know you or owe you anything, and without pestering them with any reminders. We sliced and diced users along many dimensions, but they all had about the same response rate, which improves the robustness of the findings. Since we assume users who have no altmetrics will hate the app, we only sent surveys to users with relatively complete profiles (at least three Impactstory badges).

Once we had responses, we followed up using Intercom, an app that nicely integrates most of our customer communication (feedback, support, etc). We got lots more qualitative feedback this way.

Once we had all our data, we exported the results into a spreadsheet and had us some Pivot Table Fun Time. Here’s the raw data in Google Docs (with identifying attributes removed to protect privacy) in case you’d like to dive into the data yourself.

Finally, we exported loads of user data from our Postgres app database hosted on Heroku. All that got added into the spreadsheet and pivot tables as well.

Here’s what we found

The overall NPS is 26, which is not amazing. But it is good. And encouragingly, it’s much better than we got when we surveyed users about our old, non-free version in March. Getting better is a good sign. We’ll take it.

Users who have made profiles in both versions (new and old) seem to agree. The overall NPS for these users was 58, which is quite strong. In fact, users of the old version were the group with the highest NPS overall in this survey. Since we made a lot of changes in the new app from the old, this wouldn’t have to have been true. It made us happy.

But we wanted more actionable results. So we sliced and diced everyone into subgroups along several dimensions, looking for features that can predict extra-high NPS in future sign-ups.

We found four of these predictive features. As it happens, each predictor changes the NPS of its group by the same amount: your NPS (on average) goes from 15 (ok) to 35 (good) if you

  1. have a Twitter account,
  2. have more than 20 online mentions of some kind (Tweets, Wikipedia, Pinterest, whatever) pointing to your publications,
  3. have made more than 55% of your publications green or gold open access, or
  4. have been awarded more than 6 Impactstory badges.

Of these, (4) is not super useful since it covaries a lot with numbers of mentions (2) and OA percentage (3); after all, we give out badges for both those things. A bit more surprisingly, users who have Twitter are likely to have more mentions per product, and less likely to have blank profiles, meaning Feature 1 accounts for some of the variance in Feature 2. So simply having a Twitter account is one of our best signals that you’ll love Impactstory.

Surprisingly, having a well-stocked ORCID profile with lots of your works in it doesn’t seem to predict a higher NPS score at all. This was unexpected because we figured the kind of scholcomm enthusiasts who keep their ORCID records scrupulously up-to-date would be more likely to dig the kind of thing we’re doing with Impactstory. Plus they’d have an easier and faster time setting up a profile since their data is super easy for us to import. Good to have the data.

About 60% of responses included qualitative feedback. Analysing these, we found four themes:

  • It should include citations. Makes sense users would want this, given that citations are the currency of academia and all. Alas, they ain’t gonna get it, not till someone comes out with an open and complete citation database. Our challenge is to help users be less bummed about this, hopefully by positioning Impactstory as a complement to indexes like Google Scholar rather than a competitor.
  • It’s pretty. That’s good to hear, especially since we want folks to share their profiles, make them part of their online identity. That’s way easier if you think it looks sharp.
  • It’s easy. Also great to hear, because the last version was not very easy, mostly as a result of feature bloat. It hurt to lose some features on this version, so it’s good to see the payoff was there.
  • It puts everything all in one place.  Presumably users were going to multiple places to gather all the altmetrics data that Impactstory puts in one spot. 

Here’s what we did

The most powerful takeaway from all this was that users who have Twitter get more out of Impactstory and like it more. And that makes sense…scholars with Twitter are more likely to be into this whole social media thing, and (in our experience talking with lots of researchers) more ready to believe altmetrics could be a useful tool.

So, we’ll redouble our focus on these users.

The way we’re doing that concretely right away is by changing the signup wizard to start with a “signup with Twitter” button. That’s a big deal because it means you’ll need a Twitter account to sign up, and therefore excludes some potential users. That’s a bummer.

But it’s excluding users who, statistically, are among the least likely to love the app. And it’s making it easier to sign up for the users that are loving Impactstory the most, and most keen to recommend us. That means better word of mouth, a better viral coefficient, and a chance to test a promising hypothesis for achieving product-market fit.

We’re also going to be looking at adding more Twitter-specific features like analysing users’ tweeted content and follower lists. More on that later.

To take advantage of our open-access predictor, we’ll be working hard to reach out to the open access community…we’re already having great informal talks with folks at SPARC and with the OA Button, and are reaching out in other ways as well. More on that later, too.

We’re excited about this approach to user-driven development. It’s something we’ve always valued, but often had a tough time implementing because it has seemed a bit daunting. And honestly, it is a bit daunting. It took a ton of time, and it takes a surprising amount of mental energy to be open-minded in a way that makes the feedback actionable. But overall we’re really pleased with the process, and we’re going to be doing it more, along with these kinds of blog posts to improve the transparency of our decision-making. Looking forward to hearing your thoughts!

Now, a better way to find and reward open access

There’s always been a wonderful connection between altmetrics and open science.

Altmetrics have helped to demonstrate the impact of open access publication. And since the beginning, altmetrics have excited and provoked ideas for new, open, and revolutionary science communication systems. In fact, the two communities have overlapped so much that altmetrics has been called a “school” of open science.

We’ve always seen it that way at Impactstory. We’re uninterested in bean-counting. We are interested in setting the stage for a second scientific revolution, one that will happen when two open networks intersect: a network of instantly-available diverse research products and a network of comprehensive, open, distributed significance indicators.

So along with promoting altmetrics, we’ve also been big on incentives for open access. And today we’re excited that we got a lot better at it.

We’re launching a new Open Access badge, backed by a really accurate new system for automatically detecting fulltext for online resources. It finds not just Gold OA, but also self-archived Green OA, hybrid OA, and born-open products like research datasets.

A lot of other projects have worked on this sticky problem before us, including the Open Article Gauge, OACensus, Dissemin, and the Open Access Button. Admirably, these have all been open-source projects, so we’ve been able to reuse lots of their great ideas.

Then we’ve added oodles of our own ideas and techniques, along with plenty of research and testing. The result? Impactstory is now the best, most accurate way to automatically assess openness of publications. We’re proud of that.

And we know this is just the beginning! Fork our code or send us a pull request if you want to make this even better. Here’s a list of where we check for OA to get you started:

  • The Directory of Open Access Journals to see if it’s in their index of OA journals,
  • CrossRef’s license metadata field, to see if the publisher has uploaded an open license.
  • Our own custom list of DOI prefixes, to see if it’s in a known preprint repo.
  • DataCite, to see if it’s an open dataset.
  • The wonderful BASE OA search engine to see if there’s a Green OA copy of the article.
  • Repository pages directly, in cases where BASE was unable to determine openness.
  • Journal article pages directly, to see if there’s a free PDF link (this is great for detecting hybrid OA).

What’s it mean for you? Well, Impactstory is now a powerful tool for spreading the word about open access. We’ve found that seeing that openness badge–or OH NOES lack of a badge!–on their new profile is powerful for a researcher who might otherwise not think much about OA.

So, if you care about OA: challenge your colleagues to go make a free profile and see how open they really are. Or you can use our API to learn about the openness of groups of scholars (great for librarians, or for a presentation to your department). Just hit the endpoint to find out the openness stats for anyone.

Hit us up with any thoughts or comments, and enjoy!

Why researchers are loving the new Impactstory

We put our heart and soul into the new Impactstory and have been on pins and needles to hear what you think.  Well it’s been a week and the verdict is in — we’re hearing that the new version is awesome, fantastic, and truly excellent, a home run and must-have–an academic profile that’s exciting and relevant.

And so much more. So much more, in fact, that we wanted to take a little break from the frenzied responding, bugfixing, and feature-launching we’ve been doing this week and summarize a bit of what we’ve heard.

What do you like?

A lot of users have appreciated that it now takes seconds and is super easy to set up a profile that’s blazing fast and smooth to use: it’s instant insights about your research.

Unlike speed, beauty is in the eye of the beholder–but our beholders seem delightfully agreed that our new look is great, great, great.  Whether users are calling it fresh or beautifully crafted, or sleek or smooth or snazzy, everyone seems to agree that the new version looks awesome, it looks pretty damn awesome. And we are pretty thrilled to hear that.

They’re enjoying that it’s got some fun 🙂 And, we’re not surprised to hear that people like the new price point of Free, making it easier to recommend to others.  

What’s it good for?

Impactstory helps researchers find impacts of their work beyond just citations. People have found mentions they didn’t know about on Wikipedia, discussion in cool blog posts, and reviews on Faculty of 1000. And not just numbers, but impact across the globe. Not just numbers but connecting with people: for instance user Peter van Heusden tweeted, “Using @Impactstory I discovered someone who is consistently promoting work I’m involved in, but who I had no idea existed!”

All this amounts to more than just a lovely ego boost (although it’s that too!). People are telling us that it’s motivating them to adopt more Open Science practices like uploading research slides to a proper repository, getting an ORCID, adding works to their ORCID profile, and celebrating their non-paper publications.

How are you using it?

People are already sending their Impactstory profiles to their funders, and their funders are loving them.  Researchers have added their new profile to their CV, and are planning on using Impactstory data to define innovative ‘pathway to impact’ for UK grants and in tenure and promotion packets.

Folks are including it in workshops.  And even better — building things with our open data! Check out the plugin; it rolled out Impactstory support this week and it’s really cool 🙂

What have we been doing?

We’ve made a bunch of changes this week in response to your feedback:

  • imports all your publications, not just DOIs.  Everything on your ORCID profile now displays in your Impactstory profile, and we’re working on getting more openness and altmetrics data
  • twitter integration
    • connecting twitter updates your profile pic so you don’t have to fight with gravatar
    • you don’t have to enter email manually–even faster signup
    • we’ll be using your twitter feed for achievements in the future
  • there’s a new Open Sesame achievement
  • we changed the scores at the top of the profile beside your picture; they are now counts of your achievements
  • the achievements and the import process are better documented
  • we rolled out dozens of smaller features, usability enhancements, and bugfixes.

What’s next?

We’re on our way to the FORCE16 conference this week.  We’ll be rolling the feedback from the conference along with your continued feedback into continued improvement to the app.

And you?  Join in with everyone showing off their profile, spread the word (this is how we will grow), and if you don’t have a profile, get one, and tell us what you think!

Finally, thanks.

Finally, we’d like to thank the hundreds of passionate people who have helped us with money and with moral support along the way, from our early days till now. It’s safe to say the new Impactstory is a big hit.  It’s our hit, together.


The new Impactstory: Better. Freer.

We are releasing a new version of Impactstory!

We baked what we’ve learned from hundreds of conversations with researchers into a sleeker, leaner, more useful Impactstory.

Our new Achievements showcase your meaningful accomplishments, not just counts. Our new three-part score helps you track your buzz, engagement, and openness. And next-generation notification emails are improved to tell you what you want to know reliably every week.

And of course we’ve got a slew of other new features as well, including Depsy integration, ORCID sync-on-demand, and full support for mobile.

What’s more, we’re simplifying and streamlining everywhere, eliminating little-used features and doubling down on what users have told us they love. Profile creation is now only via ORCID, we only deal in DOIs, and citation metrics are gone. As a result, creating a profile takes just seconds, our support for diverse research products (preprints, datasets, etc) is bulletproof, and metrics are now consistently clear and up-to-date. Along with a complete code rewrite, these changes make Impactstory faster and more reliable than it’s ever been.

Last but not least, not only are we making Impactstory better: we’re making it cheaper. As in, all the way cheaper. Free!

Why? We heard you love the idea, but not the price–largely because your disciplines or departments aren’t quite ready to use altmetrics for evaluation. We can see this is starting to change, and want to help that change happen as quickly as possible. That means letting as many researchers as possible engage with altmetrics, right now. Free helps that happen.

Alternative sustainability models (like freemium features and new grants) will allow us to continue to build and maintain tools like Impactstory and Depsy to help change how researchers think about understanding and measuring the influence of their work.

Sound good? It is. We think you’ll love it. Go make yourself a profile and see what you learn (and if you’re a current Impactstory subscriber, check your email for migration details).

We think this new Impactstory is the best thing we’ve ever done, and it’s a big step towards creating the open science, altmetrics-powered future we believe in. Thanks for building that future with us. We’re looking forward to hearing what you think!

Let’s value the software that powers science: Introducing Depsy

Today we’re proud to officially launch Depsy, an open-source webapp that tracks research software impact.

We made Depsy to solve a problem:  in modern science, research software is often as important as traditional research papers–but it’s not treated that way when it comes to funding and tenure. There, the traditional publish-or-perish, show-me-the-Impact-Factor system still rules.

We need to fix that. We need to provide meaningful incentives for the scientist-developers who make important research software, so that we can keep doing important, software-driven science.

Lots of things have to happen to support this change. Depsy is a shot at making one of those things happen: a system that tracks the impact of software in software-native ways.

That means not just counting up citations to a hastily-written paper about the software, but actual mentions of the software itself in the literature. It means looking at how software gets reused by other software, even when it’s not cited at all. And it means understanding the full complexity of software authorship, where one project can involve hundreds of contributors in multiple roles that don’t map to traditional paper authorship.

Ok, this sounds great, but how about some specifics? Check out these examples:

  • GDAL is a geoscience library. Depsy finds this cool NASA-funded ice map paper that mentions GDAL without formally citing it. Also check out key author Even Rouault: the project commit history demonstrates he deserves 27% credit for GDAL, even though he’s overlooked in more traditional credit systems.
  • lubridate improves date handling for R. It’s not highly-cited, but we can see it’s making a different kind of impact: it’s got a very high dependency PageRank, because it’s reused by over 1000 different R projects on GitHub and CRAN.
  • BradleyTerry2 implements a probability technique in R. It’s only directly reused by 8 projects—but Depsy shows that one of those projects is itself highly reused, leading to huge indirect impacts. This indirect reuse gives BradleyTerry2 a very high dependency PageRank score, even though its direct reuse is small, and that makes for a better reflection of real-world impact.
  • Michael Droettboom makes small (under 20%) contributions to other people’s research software — contributions that are easy to overlook. But the contributions are meaningful, and they’re to high-impact projects, so in Depsy’s transitive credit system he ends up as a highly-ranked contributor. Depsy can help unsung heroes like Michael get rewarded.
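The dependency PageRank idea in the lubridate and BradleyTerry2 examples can be sketched in a few lines: build a graph where an edge A -> B means project A depends on B, then let rank flow along the edges, so being reused by a heavily-reused project counts for more than the raw number of direct dependents. This is a plain power-iteration PageRank with made-up project names, not Depsy's actual implementation:

```python
# Toy "dependency PageRank": edge (A, B) means A depends on B, so rank
# flows from dependents to their dependencies. Project names are fake.

def pagerank(edges, damping=0.85, iters=50):
    nodes = {n for edge in edges for n in edge}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Dangling nodes (no dependencies) spread rank evenly.
            targets = out[src] or list(nodes)
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new[dst] += share
        rank = new
    return rank

deps = [("app1", "popular_lib"), ("app2", "popular_lib"),
        ("app3", "popular_lib"), ("popular_lib", "niche_lib")]
ranks = pagerank(deps)

# niche_lib has only one direct dependent, but that dependent is
# itself heavily reused, so niche_lib ends up ranked highest.
print(max(ranks, key=ranks.get))
```

That's the BradleyTerry2 effect in miniature: small direct reuse, large indirect impact.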

Depsy doesn’t do a perfect job of finding citations, tracking dependencies, or crediting authors (see our in-progress paper for more details on limitations). It’s not supposed to. Instead, Depsy is a proof-of-concept to show that we can do them at all. The data and tools are there. We can measure and reward software impact, like we measure and reward the impact of papers.

Embed impact badges in your GitHub README

Given that, it’s not a question of if research software becomes a first-class scientific product, but when and how. Let’s start having the conversations about when and how (here are some great places for that). Let’s improve Depsy, let’s build systems better than Depsy, and let’s (most importantly) start building the cultural and political structures that can use these systems.

For lots more details about Depsy, check out the paper we’re writing (and contribute!), and of course Depsy itself. We’re still in the early stages of this project, and we’re excited to hear your feedback: hit us up on twitter, in the comments below, or in the Hacker News thread about this post.

Depsy is made possible by a grant from the National Science Foundation.
Edit Nov 15, 2015: changed the embed image to match the new badge.