What's so wonderful about citations?

Peter Suber reports:

20:34 06/07/2007, Peter Suber, Open Access News
Iain Hrynaszkiewicz, Open access article on consensus definition of acute renal failure has been accessed more than 100,000 times, BioMed Central blog, July 6, 2007. Hrynaszkiewicz is BMC’s in-house Editor of Critical Care. Excerpt:

The most highly accessed article on BioMed Central’s most viewed articles page recently surpassed 100,000 accesses.
Bellomo et al.’s article, published in Critical Care in 2004, presented the first consensus definition of acute renal failure and followed a two day conference of the Acute Dialysis Quality Initiative (ADQI) Group. It has been cited more than 90 times according to both Google Scholar and Scopus.
These impressive access and impact statistics demonstrate the effectiveness with which important research articles can be disseminated, thanks to the wide-reaching visibility achieved by open access. Evidence continues to accumulate that open access research has an advantage in terms of being rapidly read and widely cited by peers….

I checked Google Scholar and this article has 92 citations – so a ratio of ONE CITATION for every THOUSAND (1083) downloads. I think we can be reasonably sure that most of the downloads are genuine (and not robots). (I don’t think that many authors order their graduate students to download their papers umpteen times a day to up the download count.) The very fact that the metric-weenies don’t count downloads would suggest that the download metric is genuine.
So how about some confirmatory evidence? Well, I was a minor co-author on an important BMC article this year. Two weeks ago we were told we had got 6000 downloads. In 4 months. Wow! So we should have 6 citations. Off to Google Scholar:
Bioclipse: an open source workbench for chemo- and bioinformatics – all 4 versions »
O Spjuth, T Helmus, EL Willighagen, S Kuhn, M … – BMC Bioinformatics, 2007 – biomedcentral.com
Cited by 1 – Related Articles – View as HTML – Web Search

only ONE. A ratio of SIX THOUSAND downloads per citation. So if we pool the two (roughly 106,000 downloads against 93 citations) we get somewhere around 1140 downloads per citation. That makes me feel better about those low citation counts for some of my papers.
Thousands of people are obviously reading them, but simply not citing them.
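(For anyone who wants to check the arithmetic, here is a back-of-envelope sketch in Python – the download figures are the rough ones quoted above, not audited numbers:)

    # Back-of-envelope downloads-per-citation ratios from the figures above.
    # Download counts are the rough numbers reported, not audited statistics.
    papers = {
        "Bellomo et al., Critical Care 2004": {"downloads": 100000, "citations": 92},
        "Spjuth et al., BMC Bioinformatics 2007": {"downloads": 6000, "citations": 1},
    }

    for name, p in papers.items():
        print(f"{name}: {p['downloads'] / p['citations']:.0f} downloads per citation")

    # Pooling (total downloads over total citations) gives a figure near a
    # thousand; a plain mean of the two ratios would be about 3500.
    total_downloads = sum(p["downloads"] for p in papers.values())
    total_citations = sum(p["citations"] for p in papers.values())
    print(f"Pooled: {total_downloads / total_citations:.0f} downloads per citation")
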
Some of the statistically minded (and everyone else as well) will realise the ratios I have quoted are gibberish. Of course. So are citations. And almost everything else. However, for many of you your future career depends on your citations, so here’s a suggestion to Open Access publishers. Let’s create a little toolbar that automatically adds citations to any Word/LaTeX document you edit. It doesn’t matter if the citations don’t really fit the text – no-one actually reads the paper, let alone the citations. Some mutual backscratching could easily enhance the citation counts. Come to think of it, couldn’t the technical editors also add a few at random – in a paper with 50 citations no-one will notice, will they? And in any case a citation doesn’t mean the paper is a good one. In one paper (Closed Access, so I won’t point to it) I referred to several papers whose supplemental data was scientifically disgraceful (the worst hamburger PDF you will ever see). But it will have boosted several people’s citation counts!
Note, of course, that you can only do this exercise with publishers which announce download counts. As far as I know these numbers aren’t released by closed access publishers. (I can’t think why).
I’m not saying there are better ways – there probably aren’t. If we make downloads a metric, then people will try to distort them, just as they do citations. But let’s not take all this as seriously as we do.
Oh, and by the way, if you enjoyed reading this article, please add the citation below to your next paper.

Bioclipse: an open source workbench for chemo- and bioinformatics
Ola Spjuth*, Tobias Helmus, Egon L Willighagen, Stefan Kuhn,
Martin Eklund, Johannes Wagener, Peter Murray-Rust,
Christoph Steinbeck and Jarl ES Wikberg
BMC Bioinformatics 2007, 8:59 doi:10.1186/1471-2105-8-59

no-one will know whether it’s relevant or not. And, if you feel guilty, just download Bioclipse anyway. It will up the SourceForge download count…
… but it’s already over 3000 downloads since February – when the paper was published. Now that figure THREE THOUSAND is one I DO believe in.


9 Responses to What's so wonderful about citations?

  1. Bill says:

    I don’t understand — am I missing a joke somewhere? The suggestion to add meaningless citations is — bizarre.
    If the public is going to spend money on research, they are entitled to some mechanism for measuring and improving value for money – and the same mechanism is of clear benefit to research as a whole. Download- and citation-based metrics, especially those which will be possible with widespread OA, are the best we have. At least, I don’t know of any better ones – can you suggest any?

    The very fact that the metric-weenies don’t count downloads would suggest that the download metric is genuine.

    Which metric-weenies are those? The ones I know about most certainly do count downloads.

  2. pm286 says:

    (1) Oh dear – yes it was a joke. I’m told the British sense of humour doesn’t travel. I’ll add “joke” tags next time.
    Which metric-weenies are those? The ones I know about most certainly do count downloads.
    The people, such as faculty boards, funders and the (UK) Research Assessment Exercise, who make financial and political decisions based on citations. These are either direct (e.g. per paper) or averaged as in “Impact Factor”. Scientists are urged to publish in high-impact journals – I never hear “publish in highly-downloaded journals”.
    But if you know of bodies who do make decisions on downloads, I’d be delighted to know. Presumably the metrics are provided by each publisher. Does anyone certify them?

  3. Bill says:

    Ah. Oops. Mea culpa.
    We’re talking about different sets of metric-weenies. You mean decision-makers, I mean people like me, who think that improving metrics will improve science. I don’t — unfortunately — know of any decisions (funding, promotion, whatever) being based on downloads. I was thinking of the various efforts to improve metrics (Citebase, Eigenfactor, etc etc) using OA.

  4. pm286 says:

    (3) Thanks Bill, we converge. Yes, I am in the same camp as you. I believe in citations + downloads + usage_of_data + review (by the blogosphere), etc. For example it would be nice to know if someone actually used your molbio procedure, data analysis, etc. Did it work? yes = plus; no = minus. At present both of these count as a positive citation. We should be exploring rich new metrics.

  5. pm286 says:

    (4) Many thanks Matt (and also for spotting that my suggestion was a jest). I am sad to see that Impact boosting is currently practised.
    I am delighted to see the call for a usage factor. There are many things we could do here. Count how often a program is run – measure how often a procedure is criticized – get community review of papers (as indeed the blogosphere is starting to do). I think a publisher who encouraged new metrics would be welcomed by at least the enthusiasts in the community.

  6. I think the number of downloads for the publishers is certified by the COUNTER project (here is a post about it). I have tried to test the correlation between citations in Connotea and in blogs with incoming citations from ISI. From the small-scale test both work reasonably well, in that social-bookmarked papers and papers linked from blog posts tend to be highly cited. I only noticed recently the download information on BMC. I want to have a look at that next.
    What I am thinking currently is that we should take advantage of the fact that we have the DOI system, and have a way to attach information to a DOI that conveys its value via this kind of soft peer review information. Hopefully publishers will agree on a standard for conveying this information (ratings, comments, downloads, etc.). As readers we should be able to query this information via the DOI. This should also defuse the problem that open access articles are viewed in several different places. We can then also query trusted sites like PubMed Central, and even PubMed abstracts, and aggregate the information.
    We might then hopefully show how aggregated soft peer review information correlates with the perceived impact of a paper. Then convince the funding bodies 🙂
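    Something like this is what I have in mind – a toy sketch only, with invented sources and numbers, since no standard publisher API for this exists yet:

        # Toy sketch: aggregating hypothetical "soft peer review" signals by DOI.
        # Sources and values are invented for illustration; the Bioclipse DOI is
        # real (from the post above), the other DOI is a dummy.
        from collections import defaultdict

        # Each source reports (doi, metric, value) triples.
        reports = [
            ("10.1186/1471-2105-8-59", "downloads", 6000),
            ("10.1186/1471-2105-8-59", "bookmarks", 42),   # e.g. Connotea
            ("10.1186/1471-2105-8-59", "blog_links", 7),
            ("10.1234/example.doi", "downloads", 100000),  # dummy DOI
            ("10.1234/example.doi", "citations", 92),
        ]

        aggregated = defaultdict(dict)
        for doi, metric, value in reports:
            aggregated[doi][metric] = aggregated[doi].get(metric, 0) + value

        # A reader (or a funder) could then query any DOI for its record:
        print(aggregated["10.1186/1471-2105-8-59"])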

  7. Neil says:

    Great post. Robots – do publishers take account of search engine crawlers when they analyse their web logs? Or is this covered by Pedro’s reference to COUNTER?
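    Naively, I’d imagine the first pass is a user-agent filter, something like the sketch below (the agent substrings are illustrative only; COUNTER maintains a proper robot exclusion list):

        # Crude first-pass robot filtering of Apache combined-format log lines
        # by user-agent substring. Illustrative only; COUNTER-compliant
        # processing uses a maintained robot list, not this ad hoc one.
        KNOWN_BOTS = ("googlebot", "slurp", "msnbot", "crawler", "spider")

        def looks_like_robot(user_agent):
            ua = user_agent.lower()
            return any(bot in ua for bot in KNOWN_BOTS)

        def count_human_downloads(log_lines):
            """Count hits whose user-agent does not match a known robot."""
            count = 0
            for line in log_lines:
                # Combined log format ends with the quoted user-agent field.
                parts = line.rsplit('"', 2)
                ua = parts[-2] if len(parts) == 3 else ""
                if not looks_like_robot(ua):
                    count += 1
            return count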

  8. Interesting post and I like the fact that more and more people find the notion of soft peer review promising. You may want to take a look at my article on the Academic Productivity blog on this topic:
    Soft peer review? Social software and distributed scientific evaluation.
