We’re building a search engine for academic literature – for everyone

Huzzah! Today we’re announcing an $850k grant from the Arcadia Fund to build a new way for folks to find, read, and understand the scholarly literature.

Wait, another search engine? Really?

Yep. But this one’s a little different: there are already a lot of ways for academic researchers to find academic literature…we’re building one for everyone else.

We’re aiming to meet the information needs of citizen scientists, patients, K-12 teachers, medical practitioners, social workers, community college students, policy makers, and millions more. What they all have in common: they’re folks who’d benefit from access to the scholarly record, but they’ve historically been locked out. They’ve had no access to the content or the context of the scholarly conversation.

Problem: it’s hard to access content

Traditionally, the scholarly literature was paywalled, cutting off access to the content. The Open Access movement is on the way to solving this: half of new articles are now free to read somewhere, and that number is growing. The catch is that there are more than 50,000 different “somewheres” on web servers around the world, so we need a central index to find them. No one’s done a good job of this yet. (Google Scholar gets close, but it’s aimed at specialists, not regular people. It’s also 100% proprietary, closed-source, closed-data, and subject to disappearing at Google’s whim.)

Problem: it’s hard to access context

Context is the stuff that makes an article understandable to a specialist; without it, the same article is gobbledegook to the rest of us. That includes everything from field-specific jargon, to strategies for skimming to the key findings, to knowledge of core concepts like p-values. Specialists have access to context. Regular folks don’t. This makes reading the scholarly literature like reading Shakespeare without notes: you get glimmers of beauty, but without some help it’s mostly just frustrating.

Solution: easy access to the content and context of research literature

Our plan: provide access to both content and context, for free, in one place. To do that, we’re going to bring together an open database of OA papers with a suite of AI-powered support tools we’re calling an Explanation Engine.

We’ve already finished the database of OA papers. So that’s good. With the free Unpaywall database, we’ve now got 20 million OA articles from 50k sources, built on open source, available as open data, and with a working nonprofit sustainability model.

We’re building the “AI-powered support tools” now. What kind of tools? Well, let’s go back to the Shakespeare example. Today, publishers solve the context problem for readers of Shakespeare by adding notes to the text that define and explain difficult words and phrases. We’re gonna do the same thing for 20 million scholarly articles. And that’s just the start: we’re also working on concept maps, automated plain-language translations (think automatic Simple Wikipedia), structured abstracts, topic guides, and more. Thanks to recent progress in AI, all this can be automated, so we can do it at scale. That’s new. And it’s big.

The payoff

When Microsoft launched Altair BASIC for the new “personal computers,” there were already plenty of programming environments for experts. But here was one accessible to everyone else. That was new. And ultimately it launched the PC revolution, bringing computing into the lives of regular folks. We think it’s time that same kind of movement happened in the world of knowledge.

From a business perspective, you might call this a blue ocean strategy. From a social perspective (ours), this is a chance to finally cash the cheques written by the Open Access movement. It’s a chance to truly open up access to the frontiers of human knowledge to all humans.

If that sounds like your jam, we’d love your support: tell your friends, sign up for early access, and follow us for updates. It’s gonna be quite an adventure.

Here’s the press release.
