OpenAlex rewrite (“Walden”) launch!

Posted on November 3, 2025 (updated December 13, 2025) by Jason

Today, OpenAlex gets a new engine.

After a year of rebuilding, refactoring, and retesting, the Walden rewrite is now live — powering all of OpenAlex. It’s the same dataset shape you know, but faster, cleaner, and more complete.

You’ll notice better references, better OA detection, better language and license coverage, better everything. We’ve added 190 million new works, including datasets, software, and other research objects from DataCite and thousands of repositories. And thanks to our new foundation, fixes and improvements now roll out in days, not months.

Want to see exactly what changed? Check out OREO — the OpenAlex Rewrite Evaluation Overview — to compare old vs. new data in detail. [edit Dec 13, 2025: OREO is no longer up because the legacy OpenAlex data is no longer being updated…it’s all Walden now, so there’s no comparator].

And if you’d like to dig into the full list of updates, the Walden release notes have you covered.

For the next few weeks, you can still access the old dataset with data-version=1, and starting tomorrow, you can download full snapshots of both the legacy and Walden datasets in the usual way.

The rebuild is done. The road ahead is wide open.

Onward.

A Better Way to Detect Language in OpenAlex—and a Better Way to Collaborate

Posted on October 20, 2025 by Kyle Demes

As part of the recent Walden system launch, we’ve improved how OpenAlex detects the language of scholarly works. The results are immediately visible in the data: many more works are now correctly recognized as non-English, new languages appear that weren’t represented at all before, and previously unclassified works now have accurate language assignments.

The chart below (source) shows the number of works attributed to each language in Classic vs. Walden OpenAlex. Most languages fall above the diagonal line, meaning that Walden classifies more works in that language than Classic did. The cluster of languages along the y-axis had no works in Classic OpenAlex but now have works in Walden.

We’re excited about this improvement. But the story behind this improvement is just as important as the technical result—it’s a model for how the research community and open infrastructures like OpenAlex can collaborate to make real, shared progress.

From helpful critique to a true collaboration

Last year, a group of researchers published a preprint evaluating OpenAlex’s language-classification system using a large multilingual gold standard (Céspedes et al., arXiv:2409.10633v2, now published as https://doi.org/10.1002/asi.24979). We were excited to see that an international research collaborative had undertaken such a significant project using OpenAlex with the aim of improving its usefulness for the global research community. Their study was rigorous and thoughtful, and it confirmed something we already knew: our approach to language detection could be improved.

However, the paper stopped short of evaluating and recommending the concrete next steps we could take to improve language detection in OpenAlex. We hadn’t been involved at the beginning of the study to provide the authors with the kinds of metrics or performance comparisons that would actually let us deploy a better model in production. But after publication, we met with some of the authors to discuss what we needed to be able to turn their work into improvements in OpenAlex.

We needed precision and recall metrics for multiple competing candidate algorithms (with a bias towards precision); and
We needed analysis that considered cost and runtime, given that any model we deploy must scale to 400 million+ records.
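
To make the first requirement concrete, here is a minimal sketch of per-language precision and recall computed against a gold standard. It is an illustration only, not the code used in the study or in OpenAlex; the function name and the toy labels are made up for the example.

```python
from collections import Counter

def per_language_precision_recall(gold, predicted):
    """Return {language: (precision, recall)} from two parallel label lists.

    precision = correct predictions of a language / all predictions of it
    recall    = correct predictions of a language / all gold labels of it
    A value is None when its denominator is zero.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    gold_count = Counter(gold)
    pred_count = Counter(predicted)
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    scores = {}
    for lang in set(gold_count) | set(pred_count):
        tp = true_pos[lang]
        precision = tp / pred_count[lang] if pred_count[lang] else None
        recall = tp / gold_count[lang] if gold_count[lang] else None
        scores[lang] = (precision, recall)
    return scores

# Toy data (hypothetical): one Danish work mislabeled as Norwegian,
# one French work mislabeled as English.
gold = ["en", "en", "da", "no", "da", "fr"]
predicted = ["en", "en", "no", "no", "da", "en"]
scores = per_language_precision_recall(gold, predicted)
```

A bias towards precision means preferring a model that is rarely wrong when it does assign a language, even if that leaves a few more works unlabeled.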

The researchers enthusiastically took on the additional work, checking in with us throughout the process to make sure they were on the right track. The result was a preprint from their follow-on study (Sainte-Marie et al., arXiv:2502.03627) that provided exactly the applied, scalable insight we needed.

Turning research into real-world impact

As part of the Walden rewrite, we implemented one of the top-recommended approaches from their study. The improvement has been dramatic:

More works are now correctly classified as non-English languages, instead of being incorrectly labeled as English.
New languages, previously absent from OpenAlex, are now detected for the first time.
Previously “null” records now have reliable language tags.

Before deploying the new model in production, we already knew from the researchers’ analyses and their multilingual gold-standard sets that it would yield a strong overall improvement across the corpus. But we wanted to confirm that in practice. So we manually reviewed a random sample of works whose language classification differed between the old and new systems—and in the vast majority of those cases, the new system was correct.
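
The spot check described above can be sketched roughly as follows. This is an assumed reconstruction for illustration, not our actual review tooling; the function names, work IDs, and labels are hypothetical.

```python
import random

def sample_disagreements(old_labels, new_labels, sample_size, seed=0):
    """Pick a reproducible random sample of work IDs whose label changed.

    old_labels and new_labels map work ID -> language code (or None).
    """
    changed = sorted(
        work_id
        for work_id, old in old_labels.items()
        if work_id in new_labels and new_labels[work_id] != old
    )
    rng = random.Random(seed)  # fixed seed so the sample can be re-drawn
    return rng.sample(changed, min(sample_size, len(changed)))

def share_new_correct(reviewed):
    """reviewed maps work ID -> True if a reviewer judged the new label correct."""
    return sum(reviewed.values()) / len(reviewed) if reviewed else None

# Hypothetical labels from the two systems.
old = {"W1": "en", "W2": "en", "W3": None, "W4": "da", "W5": "en"}
new = {"W1": "en", "W2": "pt", "W3": "id", "W4": "no", "W5": "en"}
to_review = sample_disagreements(old, new, sample_size=2)

# Hypothetical outcome of a manual review of all three changed works.
share = share_new_correct({"W2": True, "W3": True, "W4": False})
```

Sampling only from the works where the two systems disagree keeps the manual effort focused on the cases that can actually tell the systems apart.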

We also validated against real-world feedback. For instance, the NORA team at Research Portal Denmark had previously submitted support tickets detailing mix-ups between Danish and Norwegian, two languages that are notoriously similar in writing. In ~75% of those cases, the new system now gets it right.

A model for future collaboration

To be clear: we value and learn from every independent evaluation of OpenAlex. One-way critiques from researchers are a vital part of the open-infrastructure ecosystem, and we deeply appreciate the time and expertise the global research community is investing in making OpenAlex better.

What made this case stand out was the second step: turning that critique into a direct collaboration that produced immediately deployable improvements. By working together, we created a fast-tracked feedback loop—from identifying issues in OpenAlex, to developing and testing solutions, to rolling out fixes across hundreds of millions of records. It’s a model we’d love to repeat.

And this is only the beginning. In the next few weeks, we’ll be launching a new community curation system letting researchers and metadata experts around the world submit corrections directly to OpenAlex—creating an even faster, more transparent, and more collaborative way to improve research metadata at scale.

Stay tuned—and thank you to everyone helping make open research information better, one contribution (and one collaboration) at a time.

OpenAlex rewrite enters beta! 🎉

Posted on October 1, 2025 by Jason

It’s a big week at OpenAlex. On Monday, we announced that OpenAlex is now our top-level brand (and retired the “OurResearch” name). Yesterday we unveiled our new logo. And today, we’re thrilled to launch the beta release of our fully-rewritten codebase (codenamed Walden)!

Walden is faster, bigger, and more maintainable. That means quicker bug fixes, more content, easier feature development, and a smoother experience all around.

Throughout October, we’ll be running Walden and the old system (Classic) side by side, with Classic remaining the default. On November 1, 2025, Walden becomes the default, and we’ll publish the last data snapshot from the old system (more info on timelines here).

How to test-drive Walden

Walden beta is already live in the API and UI so you can start exploring it right away!

In the UI: click the little 🧪 test-tube icon in the top right (or click here).
In the API: just add data-version=2 to your request, like this: https://api.openalex.org/works?data-version=2.
In OREO: Compare Classic to Walden using the OpenAlex Rewrite Evaluation Overview (OREO, yum). Using OREO you can see exactly what’s changed (good and bad), view known issues, and track our continuous improvements throughout our October beta.
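
If you want to compare the two versions from a script, a small helper like the one below can build the request URLs. The endpoint and the data-version parameter are the ones mentioned above; the helper name is made up for this sketch, and it only builds URLs, so you can fetch them with whatever HTTP client you prefer.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"

def works_url(data_version=None):
    """Build a works URL, optionally pinned to a data version.

    data_version=2 selects Walden and data_version=1 selects Classic;
    None leaves the choice to the API's current default.
    """
    if data_version is None:
        return BASE_URL
    return f"{BASE_URL}?{urlencode({'data-version': str(data_version)})}"

walden_url = works_url(data_version=2)
classic_url = works_url(data_version=1)
```

Requesting the same query from both URLs is a quick way to spot differences for the records you care about.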

Just remember that it’s still in beta: there are lots of known issues and it’s changing every day. If you notice an issue that’s not already in OREO tests or known issues, report it here.

Key improvements

When you check it out, what should you expect to see? The best way to view a list of improvements is to check out the tests in OREO, especially work tests. But here’s a high-level overview:

150M+ new works: Newly indexed articles, books, datasets, software, dissertations, and more! You can explore just the newly added works here.
Better consistency: Unpaywall and OpenAlex will now always agree.
Better metadata: more citations, more language and retraction coverage, better keywords, more OA data.

Looking Ahead

The last year of rewriting OpenAlex was tough. We couldn’t move as fast as we wanted on new features, and support often lagged. But now we’re equipped to move fast without breaking things. Expect faster improvements, better support, and more ambitious features dropping in Q4, including:

Community curation: fix mistakes (like in Wikipedia) and see them reflected in days.
Vector search endpoint: find relevant works and other entities based on semantic similarity of free-form text
Download endpoint: Access PDF text from DOI or OpenAlex ID
Better funding metadata: New grants entity with better coverage of grant objects and linkages to research outputs and funders

This is a turning point for OpenAlex—and we’re excited to build the future of research infrastructure together with you. The engine’s rebuilt. The road ahead is wide open. Let’s go.

PS want to learn more about Walden? Come to our webinar Oct 7th at 10am Eastern. You can register to attend here.

A New Logo for OpenAlex

Posted on September 30, 2025 by Jason

This is a big week for OpenAlex: yesterday we reorganized under the OpenAlex brand, and tomorrow we launch our completely rewritten codebase (beta). Today we launch our new logo!

The old logo was unique and conveyed the idea of building, which we loved. But it was also visually complex, almost Escheresque; consequently, it didn’t scale down well, and it failed to convey the directness, boldness, and simplicity of our vision: to create a universal, open library of scholarly information.

So as we start a new chapter in OpenAlex, it’s a great time to also launch a new look.

You Bring the Color

We’re doubling down on black and white. That’s not just a design choice, it’s a statement of philosophy. OpenAlex is infrastructure. We’re the pipes under the city, not the flashy towers above it. We want to stay out of your way and let your projects, your creativity, your insights provide the color. You’ll see this commitment carry through in our website, which now leans harder into that clean, monochrome look.

A New Typeface

We’ve switched to Inter. Of course it’s open, just like us. Inter is modern and businesslike, but still human, readable, and approachable. Compared to Dosis (our old font), Inter is sharper, more confident, and more professional—while keeping the sense of openness that’s core to who we are. You’ll see Inter across our site from now on.

The Icon

The new icon is simple: a single, continuously curved outline forms three joined dots. Individually, dots are just dots—but when you connect them, something new emerges:

It’s an A for Alex—but sans crossbar, offering an open doorway in.
It’s a connected network—could be works and citations, coauthorships, or any of the billions of nodes and edges in the OpenAlex graph.
It’s a simplified water molecule.

Why Water?

Water has increasingly become part of our story of what OpenAlex is. Water’s simple but essential. We count on shared infrastructure to deliver it, quietly and reliably, cheaply, but not for free.

At OpenAlex we want to be the pipes under the scholarly city: infrastructure that delivers research information wherever it’s needed, at scale, for cheap. We’re here to support all the amazing things the research ecosystem is doing—quietly, reliably, and everywhere.

A Modern Library of Alexandria

The original Library of Alexandria aimed to collect all scholarly knowledge. OpenAlex means to carry that spirit forward in the digital age: building an open, connected, comprehensive graph of the world’s research.

Our new logo—an open A, a network, a molecule of water—is our way of putting that vision front and center. And if you see a pyramid in this logo, it’s not an accident, it’s homage.

The new logo is a reminder of what we’re building: a simple, open, essential infrastructure for the world’s research information: cheap, reliable, everywhere. That’s OpenAlex.

PS for logo nerds, other inspo: Vercel, the Banner of Peace, and iconic Paul Rand banger Westinghouse.

We’re now OpenAlex

Posted on September 29, 2025 by Jason

For years, we’ve been working under the name OurResearch. That name sat at the top of our org chart, with three child projects under it: OpenAlex, Unpaywall, and Unsub.

Starting today, things are simpler: that org chart now has just one parent—OpenAlex—with Unpaywall and Unsub beneath it.

Why the change? Three reasons:

1. Fewer brands is clearer

We’re a tiny team, and having so many brands has always been confusing. People wondered: are we OurResearch (or Our Research), or OpenAlex, or Unpaywall, or something else? From now on, the answer is simple: we’re OpenAlex.

2. OpenAlex is what we do

More and more, OpenAlex is the center of our work. It’s our biggest project and the one that takes most of our time. And it’s also the data engine behind our other projects: Unpaywall and Unsub both run on OpenAlex data. In fact, with the launch of our fully rewritten OpenAlex codebase (codenamed Walden) this week, Unpaywall runs as a subroutine of the OpenAlex codebase.

So in a real sense, Unpaywall and Unsub are just friendly wrappers around OpenAlex. Improving OpenAlex improves them automatically.

And the name OpenAlex, with its homage to the ancient Library of Alexandria, captures our long-term vision to gather, organize, and make open all scholarly information.

3. New name, new start

Legally, nothing dramatic is happening—our official name has always been Impactstory, Inc., and “OurResearch” was just a DBA. But this moment is more than just a bookkeeping change.

This is a new chapter for us. The past year has been tough: not much visible progress, a lot of repaying technical debt, and a long slog to rewrite our entire codebase. But that rewrite launches (in beta) this week. And with a fresh codebase comes a fresh start: we get to focus harder, move faster, and pour our energy into making OpenAlex as comprehensive, accurate, and open as possible.

So yes, the name change simplifies things. But more importantly, it marks a new focus and a renewed commitment to our vision: building a universal library of scholarship.

And while we’ll continue to support Unpaywall and Unsub for now, we want to be transparent: OpenAlex is the future. As its functionality grows over the next year or two, Unpaywall and Unsub users will be able to meet their use-cases directly via OpenAlex. The rising tide of OpenAlex lifts all boats.

This week is about OpenAlex

This post is the first of three announcements:

Monday: our name change to OpenAlex (that’s today).
Tuesday: our new logo.
Wednesday: the beta launch of our fully rewritten OpenAlex codebase.

When we say we’re focusing on OpenAlex, it’s not just words—we’re shipping, this week. And there’s more coming in Q4:

A new API endpoint to directly download PDFs and parsed PDFs.
A self-serve curation portal (think Wikipedia editing, but for scholarly metadata), where your changes go live in a day or two.
A new vector search API.
Improved funder coverage, thanks to our new Wellcome Trust grant.

After a year of rebuilding, we’ve finally got the tools and the focus we need to start delivering more substantively on our vision: a universal, open library of scholarly information. We’re energized. We’re ready. We’re OpenAlex.

Unpaywall improvements: more gold, better green

Posted on August 28, 2025 by Jason

We recently announced that we’d completely rewritten Unpaywall to make it faster, more accurate, and (most importantly) easier to fix and improve. We wanted to move Unpaywall from product to process, something we could continuously improve along with the community.

Well, we’ve been working hard on that over the last few months and here’s an update!

Better Gold coverage

By far the most common OA color is gold. In fact, based on our manual sampling, 25% of Crossref DOIs are gold OA, which is much higher than I’d expected and much higher than it used to be. (note: in this and all following stats we exclude component DOIs, which aren’t indexed in Unpaywall).

Coverage of gold is very tricky, because it’s all about the status of the work’s source, not the work itself. So we need very comprehensive coverage of sources, which is as hard as it sounds.

Of course there’s DOAJ, which is fantastic, but it only covers a small subset of gold OA journals. And even for those journals, DOAJ often only tells us that a given journal is fully OA since a certain date; we still need to figure out if the back catalog is open or not.

In recent weeks, we’ve finished several projects to add the “this is gold OA” flag to new journals:

We crawled 50k OJS journals, adding gold status to 17,000 of them (many thanks to Juan Pablo Alperin and Diego Chavarro for their help in getting a list of OJS journals!).
We marked 1,200 new journals gold using data from J-STAGE.
We marked 100 new journals gold using data from SciELO.
We added gold status to several dozen journals from fully-OA publishers including MDPI, Academic Journals, and Edorium.

We also modified our algorithm to assign gold instead of bronze when we know an article is OA, but we can’t figure out its source. Since gold is 2.5x more common than bronze, this will result in fewer errors overall.
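
Here’s the back-of-the-envelope arithmetic behind that choice, as a small sketch. It assumes (and this is only an assumption for illustration) that OA articles with an unknown source are either gold or bronze, and that they split in the same 2.5:1 ratio as everything else.

```python
GOLD_TO_BRONZE_RATIO = 2.5  # gold is 2.5x more common than bronze

def default_error_rates(ratio):
    """Expected error rate for each possible default label.

    Assumes unknown-source OA articles are gold or bronze only, in the
    given gold:bronze ratio.
    """
    gold_share = ratio / (ratio + 1)
    bronze_share = 1 / (ratio + 1)
    return {
        "default_gold": bronze_share,   # wrong whenever the article is really bronze
        "default_bronze": gold_share,   # wrong whenever the article is really gold
    }

rates = default_error_rates(GOLD_TO_BRONZE_RATIO)
```

Under those assumptions, defaulting to gold is wrong about 29% of the time (1 / 3.5), while defaulting to bronze is wrong about 71% of the time (2.5 / 3.5).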

Overall, this has made a big change in our gold coverage: now 19% of Unpaywall is gold, compared to 14% in May.

Green OA

We’ve made several changes in our green OA approach. These have not increased our total green percentage, but they have made our assignment of colors more consistent.

The rule for green has always been that if the best OA location is in a repository, it’s green. But, like gold, this is very dependent on us correctly describing the source as a repository. We’re very good at this for institutional repositories, but we’ve not been so good for preprint and data repositories, which are both much more common today than they were when we started Unpaywall.

Other changes

We fixed a bug causing us to list works published under the Elsevier User License as Hybrid. Since we don’t consider that to be an OA license, we moved these to bronze.

We marked SSRN as an open repository…it’s on the bubble but since all works are available free right away, for us it counts.
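
Putting the rules from this post together, the color assignment can be sketched roughly like this. It is a simplified illustration, not the actual Unpaywall code: the class, field names, and license identifier are hypothetical, and the hybrid/bronze split (open license vs. free to read without one on a publisher site) is our summary of the usual definitions rather than something spelled out above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifier; the point is only that the Elsevier User License
# is not treated as an OA license.
NON_OA_LICENSES = {"elsevier-user-license"}

@dataclass
class BestLocation:
    in_repository: bool                  # is the best OA copy hosted in a repository?
    journal_is_fully_oa: Optional[bool]  # None when we can't figure out the source
    license: Optional[str] = None

def oa_status(best: Optional[BestLocation]) -> str:
    """Assign an OA color from the best OA location (None means no OA copy)."""
    if best is None:
        return "closed"
    if best.in_repository:
        return "green"
    # Fully-OA source, or unknown source (now defaulted to gold, not bronze).
    if best.journal_is_fully_oa or best.journal_is_fully_oa is None:
        return "gold"
    if best.license and best.license not in NON_OA_LICENSES:
        return "hybrid"
    return "bronze"

statuses = [
    oa_status(None),
    oa_status(BestLocation(in_repository=True, journal_is_fully_oa=None)),
    oa_status(BestLocation(in_repository=False, journal_is_fully_oa=True)),
    oa_status(BestLocation(in_repository=False, journal_is_fully_oa=None)),
    oa_status(BestLocation(False, False, "cc-by")),
    oa_status(BestLocation(False, False, "elsevier-user-license")),
]
```

The sketch makes the dependency on source metadata visible: green and gold both hinge on describing the source correctly, which is why most of the work above went into sources rather than individual articles.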

Results

The “ground truth” dataset is a random sample of 500 DOIs from Crossref. It excludes component DOIs and DOIs that don’t resolve. Each DOI is manually annotated by our team, which often includes doing lots of research on the journals and repositories that host the content. The definitions of oa_status colors come from here, which is in turn based on the original 2018 Unpaywall paper in PeerJ.

As you can see, we’re moving in the correct direction when it comes to gold and hybrid, green isn’t changing, and bronze coverage is going backwards a bit, although it’s still pretty close to the ground truth number. Our roadmap will prioritize green and gold for the next few months at least.

The future

The most important change for Unpaywall moving forward is the upcoming rewrite of OpenAlex, which will be gradually rolled out October-November of this year. That’s because when this rewrite is deployed, OpenAlex and Unpaywall will finally share the exact same codebase. Of course this will eliminate those pesky, embarrassing bugs where Unpaywall and OpenAlex disagree. But more importantly, it’ll link the large Unpaywall and OpenAlex communities, allowing everyone to improve both products together.

Even before that, though, we’ll be unveiling another exciting change: a new and improved curation portal. This will make it easier to fix article-level bugs in Unpaywall, including bugs that our current curation solution doesn’t address (like missing PDF URLs and incorrect licenses). Even cooler, it’ll allow users to fix source-level bugs, particularly journals that should be marked gold but aren’t. Although someday AI might let us automate this, for now we think that active community curation is the only viable way to keep that data accurate and up to date. The unification of the OpenAlex and Unpaywall codebases means that all these changes will propagate to both systems within days.

Ok, that’s all for now! Thanks for your support and as always, please get in touch with any suggestions or feedback!

We’re Rebuilding OpenAlex While It’s Running — Here’s What’s Changing

Posted on August 9, 2025 by Jason

TLDR: Over the next five months we’re migrating OpenAlex to a new, better codebase; our schema won’t change, but some data (5%) will, and we’ll add over 50 million new works.

Why the change

OpenAlex was written in a big hurry, to fill the gap left when Microsoft Academic Graph disappeared. The code was rushed and hacky, and it shows:

Unpaywall and OpenAlex are awkwardly integrated and sometimes disagree.
Fixing bugs and adding features takes forever.
Adding new sources (eg DataCite) and entity types (grants) is nigh impossible.

The solution

We’re merging the codebases of Unpaywall and OpenAlex, and rebuilding everything atop Apache Spark hosted on Databricks. This stack is more modern, maintainable, and much much faster.

What’s changing

Our goal is to fix the code, not change the functionality or data. That said, you’ll inevitably notice some changes, intended and otherwise. It’s like swapping out the engine of your car—while you’re driving. Here’s what will change:

50+ million new works: we’re adding oodles of content from DataCite and institutional repositories, with more coming soon.
Unpaywall and OpenAlex will always agree (though they’ll stay separate apps).
You can edit our data: users will be able to correct mistakes and see those curations applied within days.
Lots of small data changes across the whole dataset—for example, some works’ citation counts may grow or shrink, some works will get new OA links, etc. This is impossible to avoid, but our goal is to make sure nothing changes by more than 5%.
New topics algo: works created after the migration will use an updated algorithm but deliver similar results using the same taxonomy.
New keywords: works will get new keywords from a new algorithm based on our concepts algo.
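
One way to picture the 5% goal is as a check on aggregate numbers before and after the migration. The sketch below is only an illustration of that reading, not our actual test suite; the metric names and values are invented.

```python
def relative_change(old, new):
    """Absolute relative change from old to new (0.05 means 5%)."""
    if old == 0:
        return 0.0 if new == 0 else float("inf")
    return abs(new - old) / abs(old)

def metrics_over_threshold(old_metrics, new_metrics, threshold=0.05):
    """Return {metric: change} for metrics that moved by more than threshold."""
    flagged = {}
    for name, old in old_metrics.items():
        if name not in new_metrics:
            continue  # only compare metrics present in both versions
        change = relative_change(old, new_metrics[name])
        if change > threshold:
            flagged[name] = change
    return flagged

# Hypothetical aggregate counts from the old and new codebases.
old_metrics = {"total_citations": 1000, "oa_works": 400, "works_with_language": 900}
new_metrics = {"total_citations": 1030, "oa_works": 430, "works_with_language": 910}
flagged = metrics_over_threshold(old_metrics, new_metrics)
```

In this toy example only oa_works is flagged, because it moved by 7.5% while the other two metrics stayed within the 5% budget.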

What’s not changing

IDs will stay stable, so if you request a work/author/etc by OpenAlex ID you’ll get the same thing before and after the migration.
Functionality will stay the same in the API, web UI, and snapshot. It’ll all work like before.
The data schema won’t change.

Timeline

~~June 1: Unpaywall~~
- Unpaywall migration done
Oct 1: Beta launch
- Preliminary data from the new codebase can be used in the API by adding the data_version=2 param.
- Web-based comparison tool launches
- Beta snapshot of new data is published; you can explore this one.
Nov 1: Launch
- The API and UI serve data from the new codebase by default.
- Data from the old codebase is deprecated but still available in the API by adding the data_version=1 param.
- Prod snapshot of new data is published; you should use this one
Dec 1: Completion
- Data from the old codebase is no longer available in API.
- Web-based comparison tool retired
- One last snapshot of the old data is published.

Stay up to date!

The rewrite is nearing release, but it’s still in very active development and we’re learning as we go. Some things might go worse than expected, some better. We’ll be making regular updates via the openalex-users Google Group, so sign up there if you want to stay up to date on everything.

Major Update to Unpaywall Database

Posted on July 29, 2025 by Kyle Demes

We recently announced major changes to Unpaywall on our Unpaywall google group (https://groups.google.com/g/unpaywall) and via email to Unpaywall Premium Subscribers. A lot of folks aren’t on the group so we’re announcing here as well.

TL;DR
Unpaywall has migrated to a new codebase that helps us address data quality issues faster, and you may notice some changes.

The API is way faster → 10× faster API responses (avg 500 ms → 50 ms).
Some data has changed → About 23% of works saw data change, with about 10% seeing changes in oa_status (green, gold, etc) and 5% in is_oa (closed or open).
Overall accuracy is similar → Overall, precision remains constant. We have better recall of some Gold articles and worse detection of some Green articles.
Tiny schema changes → Your scripts, API calls, and data feeds keep running, but two fields are now deprecated (oa_locations.evidence & oa_locations.updated).
Community curation → Users can now report and fix errors at unpaywall.org/fix.
Action required only if you host the full dataset locally (details below).

Why rewrite a perfectly good tool?

A decade ago we developed Unpaywall to:

make open access research in institutional repositories discoverable by users globally,
track open access behaviours and generate evidence for effective open access policies, and
raise the bar for open infrastructures by ensuring that the industry standard for determining open access status was itself completely open.

We’re happy to report it has been very effective at achieving those goals:

Our Chrome and Firefox extensions are used by 800k monthly active users around the world,
Unpaywall sees an average of 200 API calls per second every second of the year,
Unpaywall now underpins every major open access monitoring and tracking initiative globally, and
Unpaywall has demonstrated an effective model for operating open research infrastructure.

Over the years, Open Access has become increasingly important to researchers, institutions, funders, and publishers. And steady changes over the years brought us to a publishing system that looks different from the one we started in. At first, it was exceedingly rare for a publication that was open access to later become closed access. It was rare for publishers to make closed access works openly available for short times (like during COVID). And with the exception of embargo periods, it was rare for closed journals to later be made completely open.

All of these are common now, and at the scale of millions of publications. And publication landing pages aren’t just about providing the user with access to information; they also now collect information on users. As scholarly communication has evolved, it was clear that Unpaywall needed to evolve from a product into a process. And unfortunately, the code base that supported Unpaywall was struggling to adapt. With every change, we introduced new bugs, and fixing each new bug kept creating more bugs. To continue delivering high quality open access metadata in an efficient way, we needed to start from scratch.

We spent the last year completely re-writing the code base for Unpaywall to make it:

faster;
easier to fix when it breaks; and
easier for users and publishers to curate.

On May 20, 2025 we launched the update. We have been working with our premium subscribers to implement the changes in their locally hosted databases that rely on Unpaywall. Most of our users switched to the new code base without even noticing, and that was intentional. Still, we think it is important for our users, especially those whose work depends on the Unpaywall database, to understand these changes.

What didn’t change

Stable as ever	Details
Data format & schema	All keys stay the same (only the fields: oa_locations.evidence and oa_locations.updated are now marked “deprecated”).
API & data feed URLs	Zero downtime, same endpoints.
Aggregate metrics	10% of records saw a change in oa_status (i.e., color) and 5% saw a change in is_oa (open access vs. closed). Some changes were improvements and some were degradations, but overall precision remains the same
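
If your scripts read the two deprecated fields, one low-risk way to prepare is to stop depending on them now. The sketch below strips them from a record before further processing; it assumes oa_locations is a list of location objects, and the sample record (DOI, URL, and field values) is invented for the example.

```python
import copy

DEPRECATED_LOCATION_FIELDS = ("evidence", "updated")

def without_deprecated_fields(record):
    """Return a copy of a record with deprecated oa_locations fields removed."""
    cleaned = copy.deepcopy(record)  # leave the caller's record untouched
    for location in cleaned.get("oa_locations") or []:
        for field in DEPRECATED_LOCATION_FIELDS:
            location.pop(field, None)
    return cleaned

# Hypothetical record for illustration only.
record = {
    "doi": "10.1234/example",
    "is_oa": True,
    "oa_status": "gold",
    "oa_locations": [
        {
            "url": "https://example.org/paper.pdf",
            "evidence": "example evidence string",
            "updated": "2025-05-20T00:00:00",
        }
    ],
}
cleaned = without_deprecated_fields(record)
```

Running your pipeline on the cleaned records is a quick way to find any code that still relies on the deprecated fields.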

What did change

Better than before	How it helps you
Speed	The average API response now returns in 50 ms, compared with 500 ms before: a 10x speedup! ⚡
Accuracy	We detect more Gold OA, licenses, fresh OA URLs, and works that were once open access but are now closed. We detect less Green OA (but we’ll be able to improve that soon).
Curation UI	Users around the world can submit fixes via a web form; they go live in days.
Bulk Curation	Publishers can now directly submit to us bulk changes when their journals change from closed to open (or vice versa); they go live within 2 weeks.
Bug-fix velocity	Cleaner code = faster bug fixes.

Do you need to do anything?

Your setup	Required action
API-only	Nothing. You’re already on the new code and likely didn’t even notice
Data-feed mirror	Download our one-time “May 20 Snapshot” and overwrite your current database—too many small tweaks for a changefile.

Meet the new Curation Portal

We heard loud and clear from our users that they need to be able to fix open access metadata errors when they find them. And that’s why we developed a community curation pipeline for Unpaywall.

Found a record that still looks off? Head to unpaywall.org/fix, flag the issue, and we’ll merge your correction shortly (typically within 3 business days). Your expertise powers continual data quality improvements.

If you have ideas on how to improve the functionality of the curation user interface, please send them to brett@ourresearch.org.

Looking ahead

Community curation of Unpaywall will become increasingly important for overall database accuracy, and a fix in Unpaywall will carry through to all downstream users (Web of Science, Scopus, Dimensions, and more).
We will collaborate more closely with publishers directly to make large-scale changes associated with journal policy changes more quickly and accurately.
We will continue refining specific parts of our pipelines to increase their overall reliability, including better detection of OA status, journal OA status, license information, and fulltext links.
Users will see faster patch cycles for reported issues.
We will increase repository coverage and enhance linkage between publisher and repository versions.
Later this summer, we’ll be launching a full re-write of OpenAlex to bring the databases into closer alignment where they overlap (i.e., OA status metadata for publications with Crossref DOIs)

Thank you

We heard loud and clear from our communities of users that timely fixes of data quality issues are critical for them to be able to rely on Unpaywall. And we know that our response times slipped while we tackled this rewrite. Thanks for sticking with us!

If you spot an error in the Unpaywall database that you would like to see fixed, the fastest way to do that is at unpaywall.org/fix. If you have other questions, send a note to support@unpaywall.org.

Here’s to a faster, cleaner, and ever-more-useful Unpaywall!

— The OurResearch Team

OpenAlex: 2024 in Review

Posted on December 24, 2024 by Kyle Demes

As 2024 comes to a close, we’re taking the opportunity to reflect on the year behind us. And what a year it has been for OpenAlex!

It’s hard to believe that it was only one year ago when we launched the Beta of our web interface and the first University, The Sorbonne, announced that they were replacing their proprietary database with OpenAlex.

Since then the team has worked hard to meet the evolving needs of our communities of users. Below are some of the highlights of 2024.

Organization:

We received a 5-year grant from Arcadia totalling $7.5M to establish OpenAlex as a sustainably open index of the global research ecosystem
We received a 2-year grant from the Navigation Fund totaling $688k to enhance the OpenAlex user interface
We hired a Chief Operating Officer (Kyle Demes) and Senior Frontend Developer (Brett Lockspeiser)
Our Premium subscriptions exceeded our first year’s sustainability target by 25%

Data:

We started parsing fulltext PDFs to add more affiliation and reference metadata
We started matching references without DOIs (we now have 2.5B citations)
We added HAL as a primary source for new works
We started ingesting DataCite as a primary source. We now have 6.4M DataCite records (we’ll have them all in a few months)
We enhanced metadata accuracy for work type, publication year, author, institution, source, open access status, and more
Our data was adopted by three major University rankings:
Our data was featured in a Science News article examining the sustainability of APC fees paid by researchers

Product:

We launched our Beta User Interface
We launched a new aboutness classification system (topics → subfields → fields → domains)
We launched new normalized citation metrics (field-weighted citation impact and citation percentiles) to facilitate comparison across fields and years
We introduced user curation for affiliation, author, source, and work-level metadata and have already received more than 10k requests
We expanded our offerings of paid services to help us get to sustainability faster
We laid the foundation for an exciting new analytics product we’re looking forward to showing off early next year

Community (you):

Monthly users of OpenAlex.org have grown from 28k at the beginning of the year to 78k, now representing 440k visits per month!
Our first OpenAlex User Meeting was a huge success with 27 presentations from OpenAlex users in diverse organizations around the world
We attended 9 conferences to promote OpenAlex and engage with our user community globally (Research Analytics Summit, CARA, BRIC, ICSSI, Make Data Count, LIS, STI, SRAI, and The Charleston Conference) and were truly humbled to see presenters and vendors at every conference using OpenAlex data!
We launched a YouTube channel which now has 49 videos, 736 subscribers, and almost 25,000 views!
Over 500 publications mention or reference OpenAlex and that number grows daily!
We hosted an open call for our first Community Advisory Board where 50+ stellar nominees received almost 1,400 votes from the community. Stay tuned for an announcement of results in early 2025.

None of this would have been possible without all of you. So thank you for your continued support, ideas, engagement, criticism, cheerleading, and collaboration! We’re looking forward to continuing to work together to build on these successes in 2025. Until then, Happy Holidays to you and yours.

Sincerely,

The OpenAlex Team

OurResearch receives $688k grant from Navigation Fund to enhance the OpenAlex User Interface

Posted on November 25, 2024 by Kyle Demes

OurResearch is proud to announce a grant of $688,800 from The Navigation Fund to develop and launch an open, sustainable, web-based research intelligence (RI) module for the OpenAlex website. Our goal is to support expert finding, trend detection, and knowledge gap identification for researchers and research users. The RI module will serve as a map of the research landscape that’s easy to use for non-technical users, powerful for technical users, and supportive in helping all users increase their technical skills.

OpenAlex is the world’s first completely open and comprehensive index of the world’s research ecosystem. For the first time, everyone in the world has unrestricted access to a graph of the research ecosystem connecting hundreds of millions of scholarly outputs from thousands of fields across the globe to 100+ million authors from 100,000+ institutions. Hundreds of academic studies have already used OpenAlex to accelerate their research and to study science itself (link); universities and governments are adopting OpenAlex for their research intelligence needs, disrupting the established proprietary model (example); University rankings are switching to OpenAlex (example); and companies big and small around the world are using OpenAlex data to drive innovation previously not possible.

While we are thrilled by the early success of OpenAlex, we have noticed two significant barriers to its more widespread adoption: (1) many people (especially decision-makers) struggle to understand the promise of OpenAlex without seeing its potential firsthand and (2) even when people can imagine how OpenAlex can benefit their work, most do not have the technical resources and capacity to create the desired insights from the openly available data.

With this grant, we have just hired a Senior Frontend Developer, starting December 2, 2024, to design and iteratively release new UI features. The expected UI enhancements will lead to better and more timely research-based outcomes in both enterprise and academia, including more productive collaborations, faster investigation of promising research fronts, and quicker time-to-market for new discoveries. Stay tuned for exciting updates to the OpenAlex web interface in 2025!

— — — —

OurResearch is a nonprofit that builds tools to help accelerate the transition to universal Open Science. Started at a hackathon in 2011, they remain committed to creating open, sustainable research infrastructure, such as Unpaywall, Unsub, and OpenAlex, that solves real-world problems.

The Navigation Fund is a 501(c)(3) nonprofit organization seeking to advance bold solutions to the world’s most urgent problems.