behind the scenes: cleaning dirty data

Dirty Data.  It’s everywhere!  And that’s expected and ok and even frankly good imho — it happens when people are doing complicated things, in the real world, with lots of edge cases, and moving fast.  Perfect is the enemy of good.

Thanks http://www.navigo.com.au/2015/05/cleaning-out-the-closet-how-to-deal-with-dirty-data/ for the image

Alas it’s definitely behind-the-scenes work to find and fix dirty data problems, which means none of us learn from each other in the process.  So — here’s a quick post about a dirty data issue we recently dealt with 🙂  Hopefully it’ll help you feel comradery, and maybe help some people using the BASE data.

We traced some oaDOI bugs to dirty records from PMC in the BASE open access aggregation database.

Most PMC records in BASE are really helpful — they include the title, author, and link to the full text resource in PMC.  For example, this record lists valid PMC and PubMed urls:

and this one lists the PMC and DOI urls:

The vast majority of PMC records in BASE look like this.  So until last week, to find PMC article links for oaDOI we looked up article titles in BASE and used the URL listed there to point to the free resource.

But!  We learned!  There is sometimes a bug!  This record has a broken PMC url: it lists http://www.ncbi.nlm.nih.gov/pmc/articles/PMC with no PMC id in it (see, look at the URL: there’s nothing about it that points to a specific article, right?).  To get the PMC link you’d have to follow the PubMed link and then click to PMC from there (the PMC page does exist; here’s the one we wish the BASE record had pointed to).

That’s some dirty data.  And it gets worse.  Sometimes there is no PubMed link at all, like this one (correct PMC link exists):

and sometimes there is no valid URL, so there’s really no way to get there from here:

(pretty cool that PMC lists this article from 1899, eh?  Edge cases for papers published more than 100 years ago seem fair, I’ve gotta admit 🙂 )

Anyway.  We found this dirty PMC data in BASE is infrequent but common enough to cause more bugs than we’re comfortable with.  To work around the dirty data we’ve added a step: oaDOI now uses the DOI->PMCID lookup file offered by PMC to find PMC articles we might otherwise miss.  It adds a bit more complexity, but it’s worth it in this case.
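For the curious, here’s a minimal sketch of what that kind of fallback could look like.  It is not oaDOI’s actual code: the function names, the sample rows, and the `DOI` and `PMCID` column names are assumptions made up for illustration, and the real lookup file may be laid out differently.

```python
import csv
import io
import re

# Hypothetical excerpt of a DOI->PMCID lookup file; the real file's
# column names and contents may differ.
SAMPLE_LOOKUP_CSV = """DOI,PMCID
10.1000/example.1,PMC1234567
10.1000/example.2,PMC7654321
"""

# A usable PMC article URL ends in "PMC" followed by digits.
PMC_URL_PATTERN = re.compile(r"/pmc/articles/PMC\d+/?$")


def load_doi_to_pmcid(csv_text):
    """Build a dict mapping lowercased DOIs to PMC ids."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["DOI"].strip().lower(): row["PMCID"].strip()
            for row in reader if row["DOI"] and row["PMCID"]}


def is_valid_pmc_url(url):
    """True only if the URL points at a specific PMC article."""
    return bool(url) and PMC_URL_PATTERN.search(url) is not None


def find_pmc_url(doi, aggregator_url, lookup):
    """Use the aggregator's URL when it is valid, else fall back to the lookup."""
    if is_valid_pmc_url(aggregator_url):
        return aggregator_url
    pmcid = lookup.get(doi.strip().lower())
    if pmcid:
        return "http://www.ncbi.nlm.nih.gov/pmc/articles/" + pmcid
    return None


lookup = load_doi_to_pmcid(SAMPLE_LOOKUP_CSV)
# The broken URL from the post has no PMC id, so the lookup is used instead.
print(find_pmc_url("10.1000/example.1",
                   "http://www.ncbi.nlm.nih.gov/pmc/articles/PMC", lookup))
```

The regular expression is the important bit: a URL that stops at "PMC" with no digits gets treated as missing, so the lookup table gets a chance to fill the gap.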

 

 

So, that’s This Week In Dirty Data from oaDOI!  🙂  Tune in next week for, um, something else 🙂

And don’t forget Open Data Day is Saturday March 4, 2017.   Perfect is the enemy of the good — make it open.

oaDOI integrated into Ex Libris link resolver SFX

There is a huge appetite for open data around open access publications!

We’ve been overwhelmed by the interest in the oaDOI API since its rollout a few months ago.  People have been doing large data collections for all sorts of reasons, including research studies, supplementing metadata in institutional repositories, and link resolving in library web pages.

We’ll showcase a few of these API uses in the next few weeks.  First up: the integration of oaDOI into the Ex Libris OpenURL link resolver SFX. This is a big rollout: SFX is used today by over 2,000 institutions worldwide. Thanks to Christine Stohn from Ex Libris for this summary:

The rise of publicly-accessible institutional repositories and the increase of individual open access articles in subscription journals opens up research to the wider population. However, it also presents a challenge for institutions to provide access to such publications. OpenURL link resolvers, used to connect users from a reference to accessible full text, provide the availability of articles on the journal issue level in the institution’s knowledgebase, but not on the article level itself. While this works well for articles in subscription and full open access journals, it is problematic for material where only individual articles are open access.

The oaDOI API can meet this need by directing users to a free copy of an article that they may otherwise not have access to. By connecting the oaDOI service to their OpenURL link resolver SFX, Ex Libris enables its users to automatically check if an open access copy is available and obtain the appropriate link to it.

SFX is a very widely implemented service, providing access to millions of articles, ebooks and other material. The oaDOI service fits well into SFX’s intermediary role to connect users with content. SFX is used today by over 2,000 institutions worldwide, opening up such research to a potentially huge user group.

More about SFX and Ex Libris can be found here: http://www.exlibrisgroup.com, http://www.exlibrisgroup.com/category/SFXOverview

 

How big does our text-mining training set need to be?

We got some great feedback from reviewers of our new Sloan grant, including a suggestion that we be more transparent about our process over the course of the grant. We love that idea, and you’re now reading part of our plan for how to do that: we’re going to be blogging a lot more about what we learn as we go.

A big part of the grant is using machine learning to automatically discover mentions of software use in the research literature. It’s going to be a really fun project because we’ll get to play around with some of the very latest in ML, which is currently The Hotness everywhere you look. And we’re learning a lot as we go. One of the first questions we’ve tackled (also in response to some good reviewer feedback) is: how big does our training set need to be? The machine learning system needs to be trained to recognize software mentions, and to do that we need to give it a set of annotated papers where we, as humans, have marked what a software mention looks like (and doesn’t look like). That training set is called the gold standard. It’s what the machine learning system learns from. Below is copied from one of our reviewer responses:

We came up with the number of articles to annotate through a combination of theory, experience, and intuition.  As usual in machine learning tasks, we considered the following aspects of the task at hand:

  • prevalence: the number of software mentions we expect in each article
  • task complexity: how much do software-mention words look like other words we don’t want to detect
  • number of features: how many different clues will we give our algorithm to help it decide whether each word is a software mention (eg is it a noun, is it in the Acknowledgements section, is it a mix of uppercase and lowercase, etc)

None of these aspects are clearly understood for this task at this point (one outcome of the proposed project is that we will understand them better once we are done, for future work), but we do have rough estimates.  Software mention prevalence will be different in each domain, but we expect roughly 3 mentions per paper, very roughly, based on previous work by Howison et al. and others.  Our estimate is that the task is moderately complex, based on the moderate f-measures achieved by Pan et al. and Duck et al. with hand-crafted rules.  Finally, we are planning to give our machine learning algorithm about 100 features (50 automatically discovered/generated by word2vec, plus 50 standard and rule-based features, as we discuss in the full proposal).

We then used these estimates.  As is common in machine learning sample size estimation, we started by applying a rule-of-thumb for the number of articles we’d have to annotate if we were to use the most simple algorithm, a multiple linear regression.  A standard rule of thumb (see https://en.wikiversity.org/wiki/Multiple_linear_regression#Sample_size) is 10-20 datapoints are needed for each feature used by the algorithm, which implies we’d need 100 features * 10 datapoints = 1000 datapoints.  At 3 datapoints (software mentions) per article, this rule of thumb suggests we’d need 333 articles per domain.  
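The rule-of-thumb arithmetic above can be written out as a tiny calculation. This is just a sketch of the numbers quoted in the response; the function and parameter names are ours, invented for illustration.

```python
def articles_needed(n_features, datapoints_per_feature, mentions_per_article):
    """Return (datapoints, articles) implied by the rule of thumb for one domain."""
    datapoints = n_features * datapoints_per_feature
    articles = datapoints / mentions_per_article
    return datapoints, articles


# 100 features and roughly 3 software mentions per article, as estimated above.
for per_feature in (10, 20):
    datapoints, articles = articles_needed(100, per_feature, 3)
    print(per_feature, "per feature:", datapoints, "datapoints, about",
          round(articles), "articles")
```

At the low end of the rule (10 datapoints per feature) this gives the 333 articles quoted above; the high end (20 per feature) would double the requirement to 2000 datapoints, or roughly 667 articles.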

From there we modified our estimate based on our specific machine learning circumstance.  Conditional Random Fields (our intended algorithm) is a more complex algorithm than multiple linear regression, which might suggest we’d need more than 333 articles.  On the other hand, our algorithm will also use “negative” datapoints inherent in the article (all the words in the article that are *not* software mentions, annotated implicitly as not software mentions) to help learn information about what is predictive of being vs not being a software mention — the inclusion of this kind of data for this task means our estimate of 333 articles is probably conservative and safe.

Based on this, as well as reviewing the literature for others who have done similar work (Pan et al. used a gold standard of 386 papers to learn their rules, Duck et al. used 1479 database and software mentions to train their rule weighting, etc), we determined that 300-500 articles per domain was appropriate. We also plan to experiment with combining the domains into one general model — in this approach, the domain would be added as an additional feature, which may prove more powerful overall. This would bring all 1000-1500 articles to the test set.

Finally, before proposing 300-500 articles per domain, we did a gut-check whether the proposed annotation burden was a reasonable amount of work and cost for the value of the task, and we felt it was.

References

Duck, G., Nenadic, G., Filannino, M., Brass, A., Robertson, D. L., & Stevens, R. (2016). A Survey of Bioinformatics Database and Software Usage through Mining the Literature. PLOS ONE, 11(6), e0157989. http://doi.org/10.1371/journal.pone.0157989

Howison, J., & Bullard, J. (2015). Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. Journal of the Association for Information Science and Technology (JASIST), Article first published online: 13 MAY 2015. http://doi.org/10.1002/asi.23538

Pan, X., Yan, E., Wang, Q., & Hua, W. (2015). Assessing the impact of software on science: A bootstrapped learning of software entities in full-text papers. Journal of Informetrics, 9(4), 860–871. http://doi.org/10.1016/j.joi.2015.07.012

Comparing Sci-Hub and oaDOI

Nature writer Richard Van Noorden recently asked us for our thoughts about Sci-Hub, since in many ways it’s quite similar to our newest project, oaDOI. We love the idea of comparing the two, and thought he had (as usual) good questions. His recent piece on Sci-Hub founder Alexandra Elbakyan quotes some of our responses to him; we’re sharing the rest below:

Like many OA advocates, we see lots to admire in Sci-Hub.

First, of course, Sci-Hub is making actual science available to actual people who otherwise couldn’t read it. Whatever else you can say about it, that is a Good Thing.

Second, Sci-Hub helps illustrate the power of universal OA. Imagine a world where when you wanted to read science, you just…did? Sci-Hub gives us a glimpse of what that will look like, when universal, legal OA becomes a reality. And that glimpse is powerful, a picture that’s worth a thousand words.

Finally, we suspect and hope that Sci-Hub is currently filling toll-access publishers with roaring, existential panic. Because in many cases that’s the only thing that’s going to make them actually do the right thing and move to OA models.

All this said, Sci-Hub is not the future of scholarly communication, and I think you’d be hard pressed to find anyone who thinks it is. The future is universal open access.

And it’s not going to happen tomorrow. But it is going to happen. And we built oaDOI to be a step along that path. While we don’t have the same coverage as Sci-Hub, we are sustainable and built to grow, along with the growing percentage of articles that have open access versions. And as you point out, we offer a simple, straightforward way to get fulltext.

That interface was not exactly inspired by Sci-Hub, but rather I think an example of convergent evolution. The current workflow for getting scholarly articles is, in many cases, absolutely insane. Of course this is the legacy of a publishing system that is built on preventing people from reading scholarship, rather than helping them read it. It doesn’t have to be this hard. Our goal at oaDOI is to make it less miserable to find and read science, and in that we’re quite similar to Sci-Hub. We just think we’re doing it in a way that’s more powerful and sustainable over the long term.

Collaborating on a $635k grant to improve credit for research software

We’re thrilled to announce Impactstory will be collaborating with James Howison at the University of Texas at Austin on a project to improve research software by helping its creators get proper credit for their work. The project will be funded by a three-year, $635k grant from the Alfred P. Sloan Foundation.

Research software is an essential component of modern science. But the tradition-bound scholarly credit system does not appropriately reward the academic unsung heroes who create research software, putting further development of software-intensive science in jeopardy. Even when software is mentioned, the mentions are often informal, such as URLs in footnotes or just names in text. Howison, working with doctoral student Julia Bullard, found that 63% of mentions in a random sample of 90 biology articles were informal (Howison and Bullard, 2014).

We’re going to help fix that.

We’ll be working with James and his lab to make a huge database of every research software project used in every paper in the biomedicine, astronomy, and economics literatures. This database will be filled in using a deep learning system that’ll automatically extract both formal and informal mentions of software, after being trained on a large, manually-coded gold standard dataset.

We’ll use this database to build and study three cool prototype tools:

  • CiteSuggest will analyze submitted text or code and make recommendations for normalized citations using the software author’s preferred citation,
  • CiteMeAs will help software producers make clear requests for their preferred citations, and
  • Software Impactstory will help software authors demonstrate the scholarly impact of their software in the literature.

We believe these tools will help transform the scholarly reward system into one where software is a first-class research product, and its authors get full academic credit for their work. This in turn will support the software-intensive open science system we need for the future.

The project will build on our experience creating Depsy, a platform to track the scholarly impact of Python and R packages with an emphasis on dependencies, and on James’ extensive experience researching development in open source software and software in science. For lots more detail on the whole thing, check out the submitted proposal (edit Nov 9, 2016:  note this document is not a complete representation of the proposal, since the application and approval process also involved confidential back and forth with reviewers.  The reviewers added great comments and insight that we’re incorporating into the work as we go forward.)

Thank you, Sloan.  Thanks to Program Director Josh Greenberg for his continued advice and encouragement, and to the grant reviewers for well-informed and helpful feedback. And thanks especially to James, who had this idea in the first place, brought us on board, and has been a patient, good-natured, and ingenious collaborator in a lot of hard work already. We can’t wait to get started!

Introducing oaDOI: resolve a DOI straight to OA

Most papers that are free-to-read are available thanks to “green OA” copies posted in institutional or subject repositories.  The fact these copies are available for free is fantastic because anyone can read the research, but it does present a major challenge: given the DOI of a paper, how can we find the open version, given there are so many different repositories?

The obvious answer is “Google Scholar” 🙂  And yup, that works great, and given the resources of Google will probably always be the most comprehensive solution.  But Google’s interface requires an extra search step, and its data isn’t open for others to build tools on top of.

We made a thing to fix that.  Introducing oaDOI:

We look for open copies of articles using the following data sources:

  • The Directory of Open Access Journals to see if it’s in their index of OA journals.
  • CrossRef’s license metadata field, to see if the publisher has reported an open license.
  • Our own custom list of DOI prefixes, to see if it’s in a known preprint repository.
  • DataCite, to see if it’s an open dataset.
  • The wonderful BASE OA search engine to see if there’s a Green OA copy of the article. BASE indexes 90mil+ open documents in 4000+ repositories by harvesting OAI-PMH metadata.
  • Repository pages directly, in cases where BASE was unable to determine openness.
  • Journal article pages directly, to see if there’s a free PDF link (this is great for detecting hybrid OA)
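To give a feel for how a list of sources like this can be combined, here’s a rough sketch of a "try each source in turn" lookup. It is not the oaDOI implementation: the checker functions, the made-up DOI prefix, the fake repository index, and the first-hit-wins ordering are all assumptions for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Each checker takes a DOI and returns a URL for an open copy, or None.
Checker = Callable[[str], Optional[str]]


def find_open_copy(doi: str,
                   checkers: List[Tuple[str, Checker]]) -> Optional[Tuple[str, str]]:
    """Run the checkers in order and return (source name, url) for the first hit."""
    for name, check in checkers:
        url = check(doi)
        if url:
            return name, url
    return None


# Hypothetical stand-ins for real data sources; real checks would query
# DOAJ, CrossRef, DataCite, BASE, and so on.
KNOWN_PREPRINT_PREFIXES = ("10.9999/",)  # made-up prefix for illustration


def check_preprint_prefix(doi):
    """Treat any DOI under a known preprint prefix as open."""
    if doi.startswith(KNOWN_PREPRINT_PREFIXES):
        return "https://doi.org/" + doi
    return None


def check_repository_index(doi):
    """Look the DOI up in a tiny fake repository index."""
    fake_index = {"10.1000/example.1": "http://repository.example.org/123"}
    return fake_index.get(doi)


CHECKERS = [
    ("preprint prefix list", check_preprint_prefix),
    ("repository index", check_repository_index),
]

print(find_open_copy("10.1000/example.1", CHECKERS))
```

Putting the cheap, local checks first and the slow ones (like scraping pages) last keeps most lookups fast, though the ordering here is only one reasonable choice.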

oaDOI was inspired by the really cool DOAI.  oaDOI is a wrapper around the OA detection used by Impactstory. It’s open source of course, can be used as a lookup engine in Zotero, and has an easy and powerful API that returns license data and other good stuff.

Check it out at oadoi.org, let us know what you think (@oadoi_org), and help us spread the word!

What’s your #OAscore?

We’re all obsessed with self-measurement.

We measure how much we’re Liked online. We measure how many steps we take in a day. And as academics, we measure our success using publication counts, h-indices, and even Impact Factors.

But we’re missing something.

As academics, our fundamental job is not to amass citations, but to increase the collective wisdom of our species. It’s an important job. Maybe even a sacred one. It matters. And it’s one we profoundly fail at when we lock our work behind paywalls.

Given this, there’s a measurement that must outweigh all the others we use (and misuse) as researchers: how much of our work can be read?

This Open Access Week, we’re rolling out this measurement on Impactstory. It’s a simple number: what percentage of your work is free to read online? We’d argue that it’s perhaps the most important number associated with your professional life (unless maybe it’s the percentage of your work published with a robust license that allows reuse beyond reading…we’re calculating that too). We’re calling it your Open Access Score.

We’d like to issue a challenge to every researcher: find out your open access score, do one thing to raise it, and tell someone you did. It takes ten minutes, and it’s a concrete thing you can do to be proud of yourself as a scholar.

Here’s how to do it:

  1. Make an Impactstory profile. You’ll need a Twitter account and nothing more…it’s free, nonprofit, and takes less than five minutes. Plus along the way you’ll learn cool stuff about how often your research has been tweeted, blogged, and discussed online.
  2. Deposit just one of your papers into an Open Access repository. Again: it’s easy. Here’s instructions.
  3. Once you’re done, update your Impactstory, and see your improved score.
  4. Tweet it. Let your community know you’ve made the world a richer, more beautiful place because you’ve increased the knowledge available to humanity. Just like that. Let’s spread that idea.

Measurement is controversial. It has pros and cons. But when you’re measuring the right things, it can be incredibly powerful. This OA Week, join us in measuring the right things. Find your #OAscore, make it better, tweet it out. If we’re going to measure steps, let’s make them steps that matter.

 

Crossposted on the Open Access Week blog.

Data-driven decisions with Net Promoter Score


Today we’re releasing some changes in the way users sign up for Impactstory profiles, based on research we’ve done to learn more about our users. It’s a great opportunity to share a little about what we learned, and to describe the process we used to do this research–both to add some transparency around our decision making, and to maybe help folks looking to do the same sorts of things. There’s lots to share, so let’s get to it:

Meet the Net Promoter Score

As part of our journey to find product-market fit for the Impactstory webapp, we’ve become big fans of the Net Promoter Score (NPS), an increasingly popular way to assess how much value users are getting from one’s product. It’s appealingly simple: we ask users to rank how likely they’d be to recommend Impactstory to a colleague, on a scale of 1-10, and why. Answers of 9-10 are Promoters, from 1-6 are Detractors. You subtract %detractors from %promoters and there’s your score.
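In code, the calculation looks something like this. It’s a minimal sketch with made-up ratings; the function name is ours, and answers of 7-8 count toward the total without landing in either group.

```python
def net_promoter_score(ratings):
    """Return %promoters minus %detractors for a list of integer ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # answers of 9-10
    detractors = sum(1 for r in ratings if r <= 6)  # answers of 6 or lower
    return 100.0 * (promoters - detractors) / len(ratings)


ratings = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3]  # made-up survey responses
print(net_promoter_score(ratings))  # 4 promoters, 3 detractors out of 10 -> 10.0
```

The score can range from -100 (everyone is a detractor) to 100 (everyone is a promoter).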

It’s a useful score. It doesn’t measure how much users like you. It doesn’t measure how much they generally support the idea of what you’re doing. It measures how much you are solving real problems for real users, right now. Solving those problems so well that users will put their own reputation on the line and sing your praises to their friends.

Until we’re doing that, we don’t have product-market fit, we aren’t truly making something people want, and we don’t have a sustainable business. Same as any startup.

As a nonprofit, we’ve got lots of people who support what we’re doing and (correctly!) see that we’re solving a huge problem for academia as a whole. So they’ve got lots of good things to say to us. Which: yay. That’s fuel and we love it. But it can disguise the fact that we may not be solving their personal problems. We need to get at that signal, to help us find that all-important product-market fit.

Getting the data

We used Promoter.io to manage creating, sending, and collecting email surveys. It just works and it saved us a ton of time. We recommend it.  Our response rate was 28%, which we figure is pretty good for asking for help via email from people who don’t know you or owe you anything, and without pestering them with any reminders. We sliced and diced users along many dimensions but they all had about the same response rate, which improves robustness of the findings. Since we assume users who have no altmetrics will hate the app, we only sent surveys to users with relatively complete profiles (at least three Impactstory badges).

Once we had responses, we followed up using Intercom, an app that nicely integrates most of our customer communication (feedback, support, etc). We got lots more qualitative feedback this way.

Once we had all our data, we exported the results into a spreadsheet and had us some Pivot Table Fun Time. Here’s the raw data in Google Docs (with identifying attributes removed to protect privacy) in case you’d like to dive into the data yourself.

Finally, we exported loads of user data from our Postgres app database hosted on Heroku. All that got added into the spreadsheet and pivot tables as well.

Here’s what we found

The overall NPS is 26, which is not amazing. But it is good. And encouragingly, it’s much better than we got when we surveyed users about our old, non-free version in March. Getting better is a good sign. We’ll take it.

Users who have made profiles in both versions (new and old) seem to agree. The overall NPS for these users was 58, which is quite strong. In fact, users of the old version were the group with the highest NPS overall in this survey. Since we made a lot of changes in the new app from the old, this wouldn’t have to have been true. It made us happy.

But we wanted more actionable results. So we sliced and diced everyone into subgroups along several dimensions, looking for features that can predict extra-high NPS in future sign-ups.

We found four of these predictive features. As it happens, each predictor changes the NPS of its group by the same amount: your NPS (on average) goes from 15 (ok) to 35 (good) if you

  1. have a Twitter account,
  2. have more than 20 online mentions of some kind (Tweets, Wikipedia, Pinterest, whatever) pointing to your publications,
  3. have made more than 55% of your publications green or gold open access, or
  4. have been awarded more than 6 Impactstory badges.

Of these, (4) is not super useful since it covaries a lot with numbers of mentions (2) and OA percentage (3); after all, we give out badges for both those things. A bit more surprisingly, users who have Twitter are likely to have more mentions per product, and less likely to have blank profiles, meaning Feature 1 accounts for some of the variance in Feature 2. So simply having a Twitter account is one of our best signals that you’ll love Impactstory.
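The subgroup slicing described above can be sketched like this. We did ours in spreadsheet pivot tables, so this is only an illustration of the idea: the field names and survey rows are made up.

```python
from collections import defaultdict


def net_promoter_score(ratings):
    """Return %promoters (9-10) minus %detractors (6 or lower)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def nps_by_feature(responses, feature):
    """Group responses by a boolean feature and score each group separately."""
    groups = defaultdict(list)
    for response in responses:
        groups[bool(response[feature])].append(response["rating"])
    return {value: net_promoter_score(group) for value, group in groups.items()}


# Made-up survey rows; the field names are hypothetical.
responses = [
    {"rating": 10, "has_twitter": True},
    {"rating": 9, "has_twitter": True},
    {"rating": 6, "has_twitter": True},
    {"rating": 8, "has_twitter": True},
    {"rating": 9, "has_twitter": False},
    {"rating": 5, "has_twitter": False},
    {"rating": 7, "has_twitter": False},
    {"rating": 4, "has_twitter": False},
]
print(nps_by_feature(responses, "has_twitter"))
```

With these invented rows the Twitter group scores 25 and the non-Twitter group scores -25. Comparing groups this way shows which features go along with a higher score, but as noted above the features overlap, so the differences shouldn’t be added together.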

Surprisingly, having a well-stocked ORCID profile with lots of your works in it doesn’t seem to predict a higher NPS score at all. This was unexpected because we figured the kind of scholcomm enthusiasts who keep their ORCID records scrupulously up-to-date would be more likely to dig the kind of thing we’re doing with Impactstory. Plus they’d have an easier and faster time setting up a profile since their data is super easy for us to import. Good to have the data.

About 60% of responses included qualitative feedback. Analysing these, we found four themes:

  • It should include citations. Makes sense users would want this, given that citations are the currency of academia and all. Alas they ain’t gonna get it, not till someone comes out with an open and complete citation database. Our challenge is to help users be less bummed about this, hopefully by positioning Impactstory as a complement to indexes like Google Scholar rather than a competitor.
  • It’s pretty. That’s good to hear, especially since we want folks to share their profiles, make them part of their online identity. That’s way easier if you think it looks sharp.
  • It’s easy. Also great to hear, because the last version was not very easy, mostly as a result of feature bloat. It hurt to lose some features on this version, so it’s good to see the payoff was there.
  • It puts everything all in one place.  Presumably users were going to multiple places to gather all the altmetrics data that Impactstory puts in one spot. 

Here’s what we did

The most powerful takeaway from all this was that users who have Twitter get more out of Impactstory and like it more. And that makes sense…scholars with Twitter are more likely to be into this whole social media thing, and (in our experience talking with lots of researchers) more ready to believe altmetrics could be a useful tool.

So, we’ll redouble our focus on these users.

The way we’re doing that concretely right away is by changing the signup wizard to start with a “signup with Twitter” button. That’s a big deal because it means you’ll need a Twitter account to sign up, and therefore excludes some potential users. That’s a bummer.

But it’s excluding users who, statistically, are among the least likely to love the app. And it’s making it easier to sign up for the users that are loving Impactstory the most, and most keen to recommend us. That means better word of mouth, a better viral coefficient, and a chance to test a promising hypothesis for achieving product-market fit.

We’re also going to be looking at adding more Twitter-specific features like analysing users’ tweeted content and follower lists. More on that later.

To take advantage of our open-access predictor, we’ll be working hard to reach out to the open access community…we’re already having great informal talks with folks at SPARC and with the OA Button, and are reaching out in other ways as well. More on that later, too.

We’re excited about this approach to user-driven development. It’s something we’ve always valued, but often had a tough time implementing because it has seemed a bit daunting. And honestly, it is a bit daunting. It took a ton of time, and it takes a surprising amount of mental energy to be open-minded in a way that makes the feedback actionable. But overall we’re really pleased with the process, and we’re going to be doing it more, along with these kinds of blog posts to improve the transparency of our decision-making. Looking forward to hearing your thoughts!

Now, a better way to find and reward open access

There’s always been a wonderful connection between altmetrics and open science.

Altmetrics have helped to demonstrate the impact of open access publication. And since the beginning, altmetrics have excited and provoked ideas for new, open, and revolutionary science communication systems. In fact, the two communities have overlapped so much that altmetrics has been called a “school” of open science.

We’ve always seen it that way at Impactstory. We’re uninterested in bean-counting. We are interested in setting the stage for a second scientific revolution, one that will happen when two open networks intersect: a network of instantly-available diverse research products and a network of comprehensive, open, distributed significance indicators.

So along with promoting altmetrics, we’ve also been big on incentives for open access. And today we’re excited that we got a lot better at it.

We’re launching a new Open Access badge, backed by a really accurate new system for automatically detecting fulltext for online resources. It finds not just Gold OA, but also self-archived Green OA, hybrid OA, and born-open products like research datasets.

A lot of other projects have worked on this sticky problem before us, including the Open Article Gauge, OACensus, Dissemin, and the Open Access Button. Admirably, these have all been open-source projects, so we’ve been able to reuse lots of their great ideas.

Then we’ve added oodles of our own ideas and techniques, along with plenty of research and testing. The result? Impactstory is now the best, most accurate way to automatically assess openness of publications. We’re proud of that.

And we know this is just the beginning! Fork our code or send us a pull request if you want to make this even better. Here’s a list of where we check for OA to get you started:

  • The Directory of Open Access Journals to see if it’s in their index of OA journals.
  • CrossRef’s license metadata field, to see if the publisher has uploaded an open license.
  • Our own custom list of DOI prefixes, to see if it’s in a known preprint repo.
  • DataCite, to see if it’s an open dataset.
  • The wonderful BASE OA search engine to see if there’s a Green OA copy of the article.
  • Repository pages directly, in cases where BASE was unable to determine openness.
  • Journal article pages directly, to see if there’s a free PDF link (this is great for detecting hybrid OA)

What’s it mean for you? Well, Impactstory is now a powerful tool for spreading the word about open access. We’ve found that seeing that openness badge–or OH NOES lack of a badge!–on their new profile is powerful for a researcher who might otherwise not think much about OA.

So, if you care about OA: challenge your colleagues to go make a free profile and see how open they really are. Or you can use our API to learn about the openness of groups of scholars (great for librarians, or for a presentation to your department). Just hit the endpoint http://impactstory.org/u/someones_orcid_id to find out the openness stats for anyone.

Hit us up with any thoughts or comments, and enjoy!

Why researchers are loving the new Impactstory

We put our heart and soul into the new Impactstory and have been on pins and needles to hear what you think.  Well it’s been a week and the verdict is in — we’re hearing that the new version is awesome, fantastic, and truly excellent, a home run and must-have–an academic profile that’s exciting and relevant.

And so much more. So much more, in fact, that we wanted to take a little break from the frenzied responding, bugfixing, and feature-launching we’ve been doing this week and summarize a bit of what we’ve heard.

What do you like?

A lot of users have appreciated that it now takes seconds and is super easy to set up a profile that’s blazing fast and smooth to use: it’s instant insights about your research.

Unlike speed, beauty is in the eye of the beholder–but our beholders seem delightfully agreed that our new look is great, great, great.  Whether users are calling it fresh or beautifully crafted, or sleek or smooth or snazzy, everyone seems to agree that the new version looks awesome, it looks pretty damn awesome. And we are pretty thrilled to hear that.

They’re enjoying that it’s got some fun 🙂 And, we’re not surprised to hear that people like the new price point of Free, making it easier to recommend to others.  

What’s it good for?

Impactstory helps researchers find impacts of their work beyond just citations. People have found mentions they didn’t know about on Wikipedia, discussion in cool blog posts, and reviews on Faculty of 1000. And not just numbers, but impact across the globe. Not just numbers but connecting with people: for instance user Peter van Heusden tweeted, “Using @Impactstory I discovered someone who is consistently promoting work I’m involved in, but who I had no idea existed!”

All this amounts to more than just a lovely ego boost (although it’s that too!). People are telling us that it’s motivating them to adopt more Open Science practices like uploading research slides to a proper repository, getting an ORCID, adding works to their ORCID profile, and celebrating their non-paper publications.

How are you using it?

People are already sending their Impactstory profiles to their funders, and their funders are loving them.  Researchers have added their new profile to their CV, and are planning on using Impactstory data to define innovative ‘pathway to impact’ for UK grants and in tenure and promotion packets.

Folks are including it in workshops.  And even better, they’re building things with our open data! Check out the ferret.io plugin: it rolled out Impactstory support this week and it’s really cool 🙂

What have we been doing?

We’ve made a bunch of changes this week in response to your feedback:

  • imports all your publications, not just DOIs.  Everything on your ORCID profile now displays in your Impactstory profile, and we’re working on getting more openness and altmetrics data
  • twitter integration
    • connecting twitter updates your profile pic so you don’t have to fight with gravatar
    • you don’t have to enter email manually–even faster signup
    • we’ll be using your twitter feed for achievements in the future
  • there’s a new Open Sesame achievement
  • we changed the scores at the top of the profile beside your picture; they are now counts of your achievements
  • the achievements and the import process are better documented
  • we rolled out dozens of smaller features, usability enhancements, and bugfixes.

What’s next?

We’re on our way to the FORCE16 conference this week.  We’ll be rolling the feedback from the conference along with your continued feedback into continued improvement to the app.

And you?  Join in with everyone showing off their profile, spread the word (this is how we will grow), and if you don’t have a profile, get one, and tell us what you think!

Finally, thanks.

Finally, we’d like to thank the hundreds of passionate people who have helped us with money and with moral support along the way, from our early days till now. It’s safe to say the new Impactstory is a big hit.  It’s our hit, together.