Podcast episode about Unpaywall

 

I recently had a fun conversation with @ORION_opensci for their just-launched podcast.

The episode is about half an hour long, and covers what @Unpaywall is, who uses it, how it came about, a bit about how it works, thoughts on the importance of #openinfrastructure, the sustainability model, how open jives with getting money from Elsevier, #PlanS, how to help the #openscience revolution…

Anyway, here’s where you can listen (you can either load it into your Podcast app, or just press “play” on the webpage player):

https://orionopenscience.podbean.com/e/scaling-the-paywall-how-unpaywall-improved-open-access/

(Or here’s the MP3.)

Thanks for having me @OOSP_ORIONPod, it was super fun!  And do check out the rest of the episodes as well, they are covering great topics:

 

Unpaywall extension adds 200,000th active user

We’re thrilled to announce that we’re now supporting over 200,000 active users of the Unpaywall extension for Chrome and Firefox!

The extension, which debuted nearly two years ago, helps users find legal, open access copies of paywalled scholarly articles. Since its release, the extension has been used more than 45 million times, finding an open access copy in about half of those. We’ve also been featured in The Chronicle of Higher Ed, TechCrunch, Lifehacker, Boing Boing, and Nature (twice).

However, although the extension gets the press, the database powering the extension is the real star. There are millions of people using the Unpaywall database every day:

  • We deliver nearly one million OA papers every day to users worldwide via our open API…that’s 10 papers every second!
  • Over 1,600 academic libraries use our SFX integration to automatically find and deliver OA copies of articles when they have no subscription access.
  • If you’re using an academic discovery tool, it probably includes Unpaywall data…we’re integrated into Web of Science, Europe PubMed Central, WorldCat, Scopus, Dimensions, and many others.
  • Our data is used to inform and monitor OA policy at organizations like the US NIH, UK Research and Innovation, the Swiss National Science Foundation, the Wellcome Trust, the European Open Science Monitor, and many others.

The Unpaywall database gets information from over 50,000 academic journals and 5000 scholarly repositories and archives, tracking OA status for more than 100 million articles. You can access this data for free using our open API, or user our free web-based query tool. Or if you prefer, you can just download the whole database for free.

Unpaywall is supported via subscriptions to the Unpaywall Data Feed, a high-throughput pipeline providing weekly updates to our free database dump. Thanks to Data Feed subscribers, Unpaywall is completely self-sustaining and uses no grant funding. That makes us real optimistic about our ability to stick around and provide open infrastructure for lots of other cool projects.

Thanks to everyone who has supported this project, and even more, thanks to everyone who has fought for open access. Without y’all, Unpaywall wouldn’t matter. With you: we’re changing the world. Together. Next stop 300k!

We’re building a search engine for academic literature–for everyone


Huzzah! Today we’re announcing an $850k grant from the Arcadia Fund to build a new way for folks to find, read, and understand the scholarly literature.

Wait, another search engine? Really?

Yep. But this one’s a little different: there are already a lot of ways for academic researchers to find academic literature…we’re building one for everyone else.

We’re aiming to meet the information needs of citizen scientists, patients, K-12 teachers, medical practitioners, social workers, community college students, policy makers, and millions more. What they all have in common: they’re folks who’d benefit from access to the scholarly record, but they’ve historically been locked out. They’ve had no access to the content or the context of the scholarly conversation.

Problem: it’s hard to access to content

Traditionaly, the scholarly literature was paywalled, cutting off access to the content. The Open Access movement is on the way to solving this: Half of new articles are now free to read somewhere, and that number is growing. The catch is that there are more than 50,000 different “somewheres” on web servers around the world, so we need a central index to find it. No one’s done a good job of this yet (Google Scholar gets close, but it’s aimed at specialists, not regular people. It’s also 100% proprietary, closed-source, closed-data, and subject to disappearing at Google’s whim.)

Problem: it’s hard to access to context

Context is the stuff that makes an article understandable for a specialist, but gobbledegook to the rest of us. So that includes everything from field-specific jargon, to strategies for on how to skim to the key findings, to knowledge of core concepts like p-values. Specialists have access to context. Regular folks don’t. This makes reading the scholarly literature like reading Shakespeare without notes: you get glimmers of beauty, but without some help it’s mostly just frustrating.

Solution: easy access to the content and context of research literature.

Our plan: provide access to both content and context, for free, in one place. To do that, we’re going to bring together an open a database of OA papers with a suite AI-powered support tools we’re calling an Explanation Engine.

We’ve already finished the database of OA papers. So that’s good. With the free Unpaywall database, we’ve now got 20 million OA articles from 50k sources, built on open source, available as open data, and with a working nonprofit sustainability model.

We’re building the “AI-powered support tools” now. What kind of tools? Well, let’s go back to the Hamlet example…today, publishers solve the context problem for readers of Shakespeare by adding notes to the text that define and explain difficult words and phrases. We’re gonna do the same thing for 20 million scholarly articles. And that’s just the start…we’re also working on concept maps, automated plain-language translations (think automatic Simple Wikipedia), structured abstracts, topic guides, and more. Thanks to recent progress in AI, all this can be automated, so we can do it at scale. That’s new. And it’s big.

The payoff

When Microsoft launched Altair BASIC for the new “personal computers,” there were already plenty of programming environments for experts. But here was one accessible to everyone else. That was new. And ultimately it launched the PC revolution, bringing computing the lives of regular folks. We think it’s time that same kind of movement happened in the world of knowledge.

From a business perspective, you might call this a blue ocean strategy. From a social perspective (ours), this is a chance to finally cash the cheques written by the Open Access movement. It’s a chance to truly open up access to the frontiers of human knowledge to all humans.

If that sounds like your jam, we’d love your support: tell your friends, sign up for early access, and follow us for updates. It’s gonna be quite an adventure.

Here’s the press release.

When will everything be Open Access?


OA continues to grow. But when will it be…done? When will everything be published as Open Access?

Using data from our recently-published PeerJ OA study, we took a crack at answering that question. This data we’re using comes from the Unpaywall database–now the largest open database of OA articles ever created, with comprehensive data on over 90 million articles. Check out the paper for more lots more details on how we assembled the data, along with assessments of accuracy and other goodies. But without further ado, here’s our projection of OA growth:

growth-over-time

In the study, we found that OA is increasingly likely for newer articles since around 1990. That’s the solid line part of the graph, and is based on hard data.

But since the curve is so regular, it was tempting to extend it so see what would happen at the current rate of increase. That’s the dotted line in the figure above. Of course it’s a pretty facile projection, in that no effort has been made to model the underlying processes. #limitations #futurework 😀. Moreover, the 2040 number is clearly too conservative since it doesn’t account for discontinuities–like the surge in OA we’ll see in 2020 when new European mandates take effect.

But while the dates can’t be known for certain, what the data makes very clear is that we are headed for an era of universal OA. It’s not a question of if, but when. And that’s great news.

Open Access, coming to a workflow near you: welcome to the year of Ubiquitous OA


Thanks to 20 years of OA innovation and advocacy, today you can legally access around half the recent research literature for free. However, in practice, much of this free literature is not as open as we’d like it to be, because it’s hard for readers to find the OA version.

OA lives on repositories and publisher websites. But very few people visit these sources directly to find a given article. Instead, people rely on the search tools that are already part of their existing workflows. Historically, these haven’t done a great job surfacing OA resources. Google, for instance, often fails to index OA versions, in addition to indexing content of dubious provenance. OA aggregators like BASE, CORE, and OpenAIRE aim to solve this by emphasizing OA coverage, but they require researchers to add a second or third search step to their existing workflows–something researchers have been reluctant to do.

So in addition to the well-known access problem, we also have a discovery problem. On the one there’s a healthy, efficient OA infrastructure in journals and repositories. On the other we have millions of individual readers doing their own thing. We need to connect these. We need to cover this last mile between the infrastructure and the individual user, and we need to make that connection easy and seamless and ubiquitous. Until we do, OA is writing a check it can’t fully cash.

But the news is good: over the last year, several efforts are emerging to cover that last mile. Our contribution was Unpaywall: an extension that shows a green tab in your browser on articles where there’s an OA version available.  Unpaywall has enjoyed lots of success, adding over 100,000 active users in under than a year. Moreover, the backend database of Unpaywall (formerly called oaDOI) can be integrated into any number of existing tools, making it easier to spread OA content all over the place. For instance, we’re already seeing over a million uses every day from library link resolvers.

Our most recent integration takes this to a new level, and we’re so excited about it: thanks to a new partnership between Impactstory and Clarivate Analytics, data from Impactstory’s Unpaywall database is now live in the Web of Science, making it the first editorially-curated and publisher-neutral resource to implement this technology. Web of Science has been able to use Unpaywall data to discover and link to millions more OA records amongst their existing content.  This enables millions of Web of Science users around the world to link straight from their search results to a trusted, legal, peer-reviewed OA version—and they can also filter search results by the different versions of OA.

This is a big deal because article and indexing (A&I) systems like Web of Science are currently the most important way researchers access literature.  And though it’s by no means the only A&I system out there, Web of Science is the most respected and most prevalent. Every month, millions of users access literature through Web of Science—and now, each and every one of them will see more OA options for articles they might not otherwise discover, right alongside subscribed content.  Every day. What a huge change from the days we had to convince folks that OA was legitimate at all! It’s a new era.

A new era: that’s not just a hyperbolic phrase. We think this year marks the turning of a new moment in the OA narrative. We’re moving out of the author-focused, advocacy-focused initial phase, and into a more mature era of ubiquitous Open Access, characterized by deep integration of OA into researcher workflows and value-add services built on top of the immense OA corpus. This is the era of user-focused OA.

As OA becomes the default state for published research, tools that centralize, mine, index, search, organize, and extract knowledge from papers suddenly become massively more powerful.  Integrations between Unpaywall and commercial services aren’t generating this new era, but they are one of the hallmarks of it. We’re not making new OA, but rather starting to leverage the massive OA corpus now available. In the last year, many others have begun to do this as well. Many, many more will follow

For years, we in the OA advocate community have been arguing that a critical mass of OA would not just improve scholarly communication, it would transform it. This is finally beginning to happen, and we think this partnership with Web of Science is an early part of that transformation. Now, a subscription to Web of Science—something most academic libraries globally already have—is also a subscription to a database of millions of free-to-read OA articles.

We’ve never been more excited about the future of OA–or more thankful for all the work the OA community as a whole has done to get here. And we can’t wait to keep working together with the community to help make the vision of ubiquitous open access a reality.

Green Open Access comes of age


This morning David Prosser, executive director of Research Libraries UK, tweeted, “So we have @unpaywall, @oaDOI_org, PubMed icons – is the green #OA infrastructure reaching maturity?(link).

We love this observation, and not just because two of the three projects he mentioned are from us at Impactstory 😀. We love it because we agree: Green OA infrastructure is at a tipping point where two decades of investment, a slew of new tools, and a flurry of new government mandates is about to make Green OA the scholarly publishing game-changer.

A lot of folks have suggested that Sci-Hub is scholarly publishing’s “Napster moment,” where the internet finally disrupts a very resilient, profitable niche market. That’s probably true. But like music industry shut down Napster, Elsevier will likely be able shut down Sci-Hub. They’ve got both the money and the legal (though not moral) high ground and that’s a tough combo to beat.

But the future is what comes after Napster. It’s in the iTunes and the Spotifys of scholarly communication. We’ve built something to help to create this future. It’s Unpaywall, a browser extension that instantly finds free, legal Green OA copies of paywalled research papers as you browse–like a master key to the research literature. If you haven’t tried it yet, install Unpaywall for free and give it a try.

Unpaywall has reached 5,000 active users in our first ten days of pre-release.

But Unpaywall is far from the only indication that we’re reaching a Green OA inflection point. Today is a great day to appreciate this, as there’s amazing Green OA news everywhere you look:

  • Unpaywall reached the 5000 Active Users milestone. We’re now delivering tens of thousands of OA articles to users in over 100 countries, and growing fast.
  • PubMed announced Institutional Repository LinkOut, which links every PubMed article to a free Green copy in institutional repositories where available. This is huge, since PubMed is one of the world’s most important portals to the research literature.
  • The Open Access Button announced a new integration with interlibrary loan that will make it even more useful for researchers looking for open content. Along with the interlibrary loan request, they send instructions to authors to help them self-archive closed publications.

Over the next few years, we’re going to see an explosion in the amount of research available openly, as government mandates in the US, UK, Europe, and beyond take force. As that happens, the raw material is there to build completely new ways of searching, sharing, and accessing the research literature.
We think Unpaywall is a really powerful example: When there’s a big Get It Free button next to the Pay Money button on publisher pages, it starts to look like the game is changing. And it is changing. Unpaywall is just the beginning of the amazing open-access future we’re going to see. We can’t wait!