4 reasons why Google Scholar isn’t as great as you think it is

These days, you’d be hard-pressed to find an academic who doesn’t think that Google Scholar Profiles are the greatest thing since sliced bread. Some days, I agree.

Why? Because my Google Scholar Profile captures more citations to my work than Web of Knowledge or Scopus, automatically adds (and tracks citations for) new papers I’ve published, is better at finding citations that appear in non-English language publications, and gives me a nice fat h-index. I’m sure you find it valuable for similar reasons.

And yet, Google Scholar is still deeply flawed. It has some key disadvantages that keep it from being as awesome as most imagine that it is.

In this post, I’m going to do some good ol’ fashioned consciousness-raising and describe Google Scholar Profiles’ limitations. And in our next post, I’ll share tips I’ve learned for getting the most out of your Google Scholar Profile, limitations be darned.

1. Google Scholar Profiles include dirty data

Let’s begin with the most basic element of your Profile: your name. If your name includes diacritics, ligatures, or even apostrophes, Google Scholar may be missing citations to your work. (Sorry, O’Connor!) And if you have a common name, it’s likely you’ll end up with others’ publications in your Profile, which you are unfortunately responsible for identifying and removing. (We’ll cover how to do that in our next post.)

Now, what about the quality of citations? Google Scholar claims to pull citations from anywhere on the scholarly web into your Profile, but their definition of “the scholarly web” is less rigorous than many people realize. For example, our co-founder, Heather, has citations on her Google Scholar Profile for a Friendfeed post. And others have found Google Scholar citations to their work in student handbooks and LibGuides–not the worst places you can get a cite from, but still: Nature they ain’t.

Google Scholar citations are also, like any metric, susceptible to gaming. But whereas PLOS and Thomson Reuters’ Journal Citation Reports will flag and ban those found to be gaming the system, Google Scholar does not respond quickly (if at all) to reports of gaming. And as researchers point out, Google’s lack of transparency with respect to how data is collected means that gaming is all the more difficult to discover.

The service also misses citations in a treasure-trove of scholarly material that’s stored in institutional repositories. Why? Because Google Scholar won’t harvest information from repositories in the format that repositories across the world tend to use (Dublin Core).

Google Scholar Profile data is far from perfect, but that’s a small problem compared to the next issue.

2. Google Scholar Profiles may not last

Remember Google Reader? Google has a history of killing beloved products when the bottom line is in question. It’s no exaggeration to say that Google Scholar Profiles could literally go away at any moment.

To me, it’s not unlike the problem of monoculture in agriculture. Monoculture can be a good thing. For those unfamiliar with the term, monoculture is when farmers identify the most powerful species of a crop–the one that is easiest to grow and yields the best harvest year after year–and then grow that crop exclusively. Google Scholar Profiles were, for a long time, the easiest-to-use and most powerful citation reports available to scholars, and so Google Scholar has become one of the most-used platforms in academia.

But monoculture is also risky. Growing only one species of a crop can be catastrophic to a nation’s food supply if, for example, that species were wiped out by blight one year. Similarly, academia’s near-singular dependence on Google Scholar Profile data could be harmful to many if Google Scholar were to be shelved.

3. Google Scholar Profiles won’t allow themselves to be improved upon

Other issues aside, it’s worth acknowledging that Google Scholar Profiles are very good at doing one thing: finding citations on the scholarly web. But that’s pretty much all they do, and Google is actively preventing anyone else from improving upon their service.

It’s been pointed out before that the lack of a Google Scholar API means that no one can add value to or improve the tool. That means that services like Impactstory cannot include citations from Google Scholar on Impactstory, nor can we build upon Google Scholar Profiles to find and display metrics beyond citations or automatically push new publications to Profiles. Based on the number of Google Scholar-related help tickets we receive, this lack of interoperability is a major pain point for researchers.

4. Google Scholar Profiles only measure a narrow kind of scholarly impact

Google Scholar Profiles aren’t designed to meet the needs of web-native scholarship. These days, researchers are putting their software, data, posters, and other scholarly products online alongside their papers. Yet Google Scholar Profiles don’t allow them to track citations–nor any other type of impact indicator, including altmetrics–to those outputs.

Google Scholar Profiles also promote a much-maligned One Metric to Rule Them All: the h-index. We’ve already talked about the many reasons why scholars should stop caring about the h-index; most of those reasons stem from the fact that h-indices, like Google Scholar Profiles, aren’t designed with web-native scholarship in mind.
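For readers who haven’t met the h-index up close, here’s a minimal sketch of the standard definition: the largest h such that h of your papers have at least h citations each. It’s only an illustration in Python, not a description of how Google Scholar computes the number internally; the function name `h_index` and the citation counts below are made up for the example.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts, invented for illustration only.
steady_author = [25, 8, 5, 3, 3, 1, 0]
one_big_hit = [1000, 8, 5, 3, 3, 1, 0]

print(h_index(steady_author))  # prints 3
print(h_index(one_big_hit))    # prints 3
```

Both lists produce the same h-index even though the second includes one blockbuster paper; that kind of flattening is part of why a single number says so little about impact.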

Now that we’re clear on the limitations of Google Scholar Profiles, we’ll help you overcome ‘em by sharing 7 essential workarounds for your Google Scholar Profile in tomorrow’s post. Stay tuned!

13 thoughts on “4 reasons why Google Scholar isn’t as great as you think it is”

  1. Marc says:

    One cannot export the data, except for one’s own list of papers, which is too basic for any citation network analysis or, more generally, statistics. One cannot export the list of papers that cite one’s paper, let alone someone else’s papers, although all of this is accessible via the browser. So Google computes influence and progress, and identifies small circles of mutual citation (whether for good scientific reasons or for artificial reciprocal boosting of the h-index).

    No wonder: Google is a private company, not a public service or a public good. It aims at generating traffic.

    • Great points, Marc! I hadn’t thought about the lack of open data from the angle of website traffic. I wouldn’t be surprised if that’s at least part of the reason why they don’t allow authors to download/export more of their information…

  2. regman says:

    I find this rather hilarious. Every criticism you level at Google applies to your site as well!

    Particularly egregious is your complaint about Google’s less than scholarly sourcing when, from what I can see, your site is based on the idea that the number of times a paper is tweeted is as important as traditional metrics like citations!


    • Hi there,

      The point of Impactstory–and altmetrics overall–isn’t that tweets are better than citations (or vice versa). It’s that each type of metric and data source can tell us things about where our research has made a difference. And more data can help us as researchers, whether for the purposes of outreach to the communities that could benefit the most from our studies, making a case for “broader impacts” when applying for grants to the NIH or NSF, going up for tenure or jobs, and so on. I’d encourage you to learn more about altmetrics overall here: http://altmetrics.org/manifesto/

      As for the weaknesses we’ve proposed–I’d disagree that we share many (if any) similarities with Google Scholar in those areas. Impactstory is designed to:

      1) provide impact metrics that are clean, auditable, and meaningful;
      2) meet the needs of researchers first and foremost (as evidenced by our non-profit status). We thus aren’t subject to the same “will we be here next week?” worries that I think Google Scholar has (Google being well-known for killing beloved products with little warning) (though, to be fair, non-profits don’t stick around forever, either–but if we were to ever go away, researchers’ profile data would be portable and reusable on other platforms–something GScholar unfortunately doesn’t offer);
      3) be open source and offer open data via API, thereby allowing others to improve upon our technology and researchers to reuse their profile data in any way they please; and
      4) allow scholars, through the variety of impact metrics we offer, to better understand their diverse impacts across the many types of scholarly outputs they create (software, data, articles, etc).

      That said, we’re not perfect, either! We’d welcome any feedback you have. Feel free to email us at any time at team@impactstory.org.


  3. Irene says:

    One thing I dislike about Google Scholar is that people actively claim papers by others to bloat their h-index. A colleague of mine actively did this. Her Scholar count went up by >100 overnight, and she had added a paper by someone else with the same initials (but not even the same name). She has an uncommon name, and I’ve never seen Scholar give someone credit for a “same initials” but “different name” paper. To double check, I created a Scholar account with her publishing name to see if Google Scholar did this inadvertently. Nope.

    Thus, she did this on purpose. Most people won’t look closely enough to notice a few extra papers, even though they doubled her citation count and her h-index.

    People are not honest, especially not when they are looking for jobs. There should be some way to force people to change their profiles when they’ve gotten credit for something they didn’t write!

    • Hmm, I haven’t seen much of this type of gaming. I’m actually surprised your colleague did this, given the fact that others can dig down into the data (read the papers and see the true authors list), which as you point out makes it a big risk to her professional reputation. Guess that’s all the more reason to expose the underlying, qualitative data, isn’t it?

  4. surendrancherukodan says:

    I think that, despite some limitations, Google Scholar is doing a wonderful service for the academic community. While Scopus and Web of Science are paid databases, Google Scholar does it for free. It is free and open. Its robots search for and find articles published anywhere in the world that have a web presence, and inform us of the related and citing articles. Google Metrics is also useful for identifying top publications in all fields of science.

  5. Eugene Michael says:

    I feel great that I learned this here before I do much with this scholarly business! The information looks good though there is an element of biz competition….Thank you!
