What is a citation?

I’ve been trying to find out what a “citation” is. At least the sort of citation that governs my future, and the funding of my department and institution. Just to reintroduce this subject, here’s Bill Hooker replying to my post Impact Factors! Hirsch, Erdős and Pauling:

According to Google Scholar, this is me [Bill]: [18,16,16,12,10,9,8,5,5,2,1,1,0,0], which yields an h-index of 7 if I understand the definition. According to the Wikipedia article, a “modestly productive” biomed researcher should have an h-index greater than their “years of service”. Even if those years start when I first published (1995), I’m not doing very well. But I didn’t need a fancy index to tell me that.
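
Bill’s arithmetic can be checked with a few lines of Python. This is only a minimal sketch, assuming the usual definition of the h-index (the largest h such that h papers each have at least h citations); the function name h_index is my own choice for illustration and does not come from Google Scholar or any other citation service.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The paper at this rank still has enough citations.
            h = rank
        else:
            # Counts only fall from here on, so no later rank can qualify.
            break
    return h


# Bill's Google Scholar counts, as quoted above.
bill = [18, 16, 16, 12, 10, 9, 8, 5, 5, 2, 1, 1, 0, 0]
print(h_index(bill))  # prints 7
```

The seventh-ranked paper has 8 citations (at least 7), while the eighth has only 5 (fewer than 8), so the index stops at 7. The result is of course only as good as the counts fed into it.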

I think Bill’s maths is correct. But where do his figures come from? Google Scholar, which I also use because it’s Open and easy, and because I don’t like using products from closed monopolistic commercial information providers. But is a Google citation count the same as a Web of Knowledge citation? How do we know?
Recently I had to fill in my publications for the current UK Research Assessment Exercise. In this we were asked to give 4 + 2 research publications over the last 5 years. I selected the ones that I was proudest of – not necessarily the ones with the highest Google citations. I think that in this RAE there is still a lot of human assessment so I should give them something interesting to read. In the next one it will be done by robots, so we need to know what robots like.
So research is not now about chasing the puzzles that “nature” sets us, but about guessing what the next metric is going to be. I suspect it’s rather like pop music – for many years the New Musical Express produced hit charts – the lists of how many people bought which records each week. The numbers were collected by the industry – there were presumably no audit processes – and showed which were the most popular records. Not the best, just the most popular. Presumably this is a complex function of quality and marketing. However the numbers had positive feedback – if something sold well it was likely to be played more often and people felt they needed to buy it. But, retrospectively, I doubt many musicologists would claim the numbers were perfect or even good measures of quality. The same is true of films – box office and expert judgement from 20 years on probably have a poorish correlation.
So will the research metrics be different? The music industry had two indicators at least (sheet music and record sales). Perhaps this is analogous to citations and downloads of research articles. So let me take one of my papers that I feel represents a part of my informatics research and scholarship:
Murray-Rust, Peter; Mitchell, John; Rzepa, Henry (2005) “Chemistry in Bioinformatics”, BMC Bioinformatics 6:141
http://www.biomedcentral.com/1471-2105/6/141

It’s Open Access, so you can read it. BMC Bioinformatics publishes access figures (see, e.g., this month’s). A month after it was published, BMC sent me a mail saying this was one of the highly accessed articles:

From: BioMed Central Editorial
To: Peter Murray-Rust

Subject: Download statistics for your Open Access article
X-OriginalArrivalTime: 08 Sep 2005 06:30:15.0878 (UTC) FILETIME=[C65B6E60:01C5B43E]
Date: 8 Sep 2005 07:30:15 +0100

Title : Chemistry in Bioinformatics
Authors : Peter Murray-Rust, John B Mitchell and Henry S Rzepa
Journal : BMC Bioinformatics
Citation : 6:141
Dear Dr Murray-rust,
We thought you might be interested to know how many people have read your article since it was published:
Total accesses to this article: 1143
Access figures include full text, abstract and PDF downloads from the BMC Bioinformatics website.
These figures only reflect the accesses recorded on the journal’s website and the BioMed Central website and do not include those from PubMed Central or other sites that archive articles published by BioMed Central (see http://www.biomedcentral.com/info/libraries/archive). The overall access statistics for your article are therefore likely to be significantly higher.

(I can’t find the current access count as BMC only seems to keep the last year in its RSS).
But the paper only gets 4 citations in Google Scholar (probably at least two are self-citations), and presumably fewer in ISI (which I cannot access as it is Closed).
So there is clearly a wide variation between reading and citing. Citations have the advantage that they are in principle measurable (albeit, I suspect, with considerable imprecision, particularly in a changing world). Access cannot be easily audited.
So my questions are, please (and I genuinely don’t know the answers):

  • How are citations counted?
  • Are different methods in widespread use?
  • If so, are there agreed algorithms for converting between different metrics?
  • … or is a single authority accepted?
  • If there is a single authority, what auditing of the counting is available? Does the authority set the metrics themselves? Or is there a community process?

Great scientists will generally rise to the top (though I suspect metrics may make the path different from before). I am not a great scientist – in fact I am primarily a technologist at present. Egon reports that on ISI I get an h-score of 9 – fair enough (although it seems to have missed a lot of my papers – maybe there is a time cutoff).
If we are going to be judged on metrics then it is a waste of time writing papers for humans to read. The Bioinformatics article above counts for nothing.
Hilaire Belloc (1870-1953) wrote:

When I am dead, I hope it may be said:
“His sins were scarlet, but his books were read.”

I am not a poet, but feel something like:

“My paper’s published (and it was invited)
Don’t bother reading it, but please let it be cited”
