What’s wrong with Scholarly publishing? Measuring quality

I’m starting all these posts with “What’s wrong with Scholarly publishing?”. That’s because I am getting feedback which shows that young researchers are following them, and that libraries/academics wish to use them as resource material. I’ll note that I do not put enough effort into creating hyperlinks – it takes a surprising amount of effort and I’d like to see better tools (e.g. Google or Wikipedia searches for concepts).

Blogs have a mind of their own – I didn’t know a week ago I would be writing this post – and this topic has grown larger than I anticipated. That’s partly because I think it takes us slightly closer to the tipping point – when we see a radical new approach to scholarly publishing. I’m not expecting that anything is directly attributable to this blog. But it all adds up and acts as an exponential multiplier for change.

I’ll be writing later on the dysfunctionalities of the publishing system – “Monsters of the Scholarly Id” – that we academics have unwittingly (but not blamelessly) created. These MOTSI are set to destroy us and I’ll look at each in detail and also ask for y/our input. If you want to anticipate, try today’s homework:

“What is the single raison d’etre of the Journal Impact Factor in 2011?”

Feel free to comment on this blog – I’ll give my analysis later – perhaps in 2 days.

Meanwhile, since we shall return later, and in depth, to the measurement of quality in SchPub, let’s see how we measure quality and utility objectively.

For those of you who don’t know him, read Ben Goldacre’s column (http://www.badscience.net/ ). It’s more than an attack on bad science – it’s also a simple and compelling account of how to measure things accurately and well. How to show whether product X has an effect. Whether A is better than B (and what “better” means). Whether government policies work.

From a Kryptonite ad: “70% of readers of WonderWomanWeekly say that Kryptonite gave their hair more life and volume”. You’ll all recognize this as marketing crap. Almost everything in it is untestable. How was the survey carried out (if indeed it was)? Did Kryptonite sponsor the survey? What does “volume” mean? (It’s not determined by sticking your head in a measuring cylinder.) It’s a subjective “market performance indicator”. And what does “life” mean for a tissue that is dead?

This couldn’t happen in scholarship, because it is run by respectable academics who care about the accuracy of statements and how data is measured. To which we return later.

Is X a better scientist than Y? Is Dawn French more beautiful than Jennifer Aniston? Is she a better actress?

There are two main ways to answer these questions objectively:

  • Ask human beings in a controlled trial. This means double-blinding, i.e. giving the assessors material whose context has been removed (not easy for actresses) so that the assessors do not know what they are looking at, and making sure that those who manage the trial are ignorant of details which could sway their judgment. The choice of questions and the management of the trial are difficult and cost money and effort.
  • Create a metric which is open, agreed by all parties, and which can be reproduced by anyone. Thus we might measure well-being by GDP per head and average life-expectancy. These quantities are well defined and can be found in the CIA Factbook and elsewhere. (The association of well-being with these measures is, of course, subjective, and many would challenge it.) Dawn French and Jennifer Aniston can be differentiated by their moments of inertia. (A small sketch of such a reproducible metric follows this list.)
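
To make the second option concrete, here is a minimal Python sketch of an open, reproducible metric of the kind described above: rescale GDP per head and life expectancy and combine them into a single “well-being” score. The figures and the equal weighting are invented purely for illustration; the point is that anyone can rerun the calculation and get the same answer.

    # A minimal, reproducible "well-being" metric: rescale two published
    # quantities onto 0..1 and average them. The figures below are invented;
    # real values would come from the CIA World Factbook or similar sources.
    countries = {
        # country: (GDP per head in USD, life expectancy in years)
        "Ruritania": (12000, 71.0),
        "Freedonia": (45000, 80.5),
        "Sylvania": (30000, 77.2),
    }

    def normalise(value, lo, hi):
        """Rescale value onto the range 0..1 given the observed extremes."""
        return (value - lo) / (hi - lo)

    gdps = [g for g, _ in countries.values()]
    lifespans = [l for _, l in countries.values()]

    def wellbeing(gdp, life):
        # Equal weighting of the two components -- a subjective choice,
        # exactly as noted above: the calculation is objective, its meaning is not.
        return 0.5 * normalise(gdp, min(gdps), max(gdps)) + \
               0.5 * normalise(life, min(lifespans), max(lifespans))

    for name, (gdp, life) in sorted(countries.items(), key=lambda kv: -wellbeing(*kv[1])):
        print(f"{name:10s} {wellbeing(gdp, life):.2f}")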

Both metrics and trials cause many problems. That is because the whole task is extremely difficult and there is no simpler way of doing it.

Is scientist X better than scientist Y? Ultimately this is the sum of human judgments – and it should never be otherwise. What are the ten best films of all time? This type of analysis is gentle fun, and IMDB carries it out by collecting votes – http://www.imdb.com/chart/top – and The Shawshank Redemption tops the list (9.2/10). Everyone will argue the list endlessly – are modern films over-represented? Are the judgements made by film enthusiasts? Should they be? And so on.
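
IMDB has (at least historically) described its Top 250 as using a weighted, Bayesian-style average, so that a film with a handful of 10/10 votes cannot leapfrog films rated by hundreds of thousands of people. I don’t know their current formula, so treat the following as an illustrative sketch of the general vote-aggregation technique, not IMDB’s actual code; the 25,000-vote threshold is an assumption.

    # Bayesian-style weighted rating: pull a film's mean vote towards the
    # overall mean across all films, more strongly when it has few votes.
    def weighted_rating(mean_vote, num_votes, overall_mean, min_votes=25000):
        return (num_votes * mean_vote + min_votes * overall_mean) / (num_votes + min_votes)

    # A film rated 9.8 from only 300 votes ends up far below one rated
    # 9.2 from 500,000 votes (assuming an overall mean of 6.9):
    print(weighted_rating(9.8, 300, 6.9))      # ~6.93
    print(weighted_rating(9.2, 500000, 6.9))   # ~9.09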

Here’s a top ten list of scientists:

http://listverse.com/2009/02/24/top-10-most-influential-scientists/

and another

http://mooni.fccj.org/~ethall/top10/top10.htm

and another

http://www.biographyonline.net/scientists/top-10-scientists.html

None agree completely … and if you felt like it you could do a meta-analysis – analysing all the lists and looking for consistent choices. A meta-analysis might well discard some studies as not sufficiently controlled. I’d be surprised to see a meta-analysis that didn’t have Newton in it, for example. Note that the meta-analysis is not analysing scientists, it’s analysing the analysers of scientists – it makes no independent judgment.
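
A crude version of that meta-analysis is easy to sketch: tally how many of the lists each scientist appears in and rank by the count. The lists below are invented and truncated for illustration; note again that the procedure judges the lists, not the scientists.

    from collections import Counter

    # Three hypothetical (truncated) top-10 lists.
    lists = [
        ["Newton", "Einstein", "Darwin", "Maxwell", "Curie"],
        ["Newton", "Einstein", "Galileo", "Faraday", "Curie"],
        ["Newton", "Darwin", "Einstein", "Pasteur", "Tesla"],
    ]

    # Count how many lists each scientist appears in -- a consistency score
    # across the analysers, not an independent judgment of the scientists.
    appearances = Counter(name for lst in lists for name in set(lst))

    for name, count in appearances.most_common():
        print(f"{name:10s} appears in {count} of {len(lists)} lists")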

Let’s assume that a vice-chancellor or dean wishes to decide whether X or Y or Z should be appointed. Or whether A should be given tenure. These are NOT objective choices. They depend on what the goals and rules of the organization are. One criterion might be “how much money can we expect X or Y or Z to bring in through grants?”. We might try to answer this by asking “how much money has each of them brought in so far?” and use that as a predictor.

If grant income is the sole indicator of a human’s value to an institution then the institution is likely to be seriously flawed, as it will let money override judgment. Ben Goldacre gives examples of universities which have made highly dubious appointments on the basis of fame and money. But someone who brings in no grant income may be a liability. They would (IMO) have to show that their other scholarly qualities were exceptional. That’s possible, but it’s hard to judge.

And that’s the rub. Assessing people is hard and subjective. I think most scholars try hard to be objective. I’ve sat on review panels for appointments and on reviews of institutions and departments. A competent review will provide the reviewer with a large volume of objective material – papers, grants, collaborations, teaching, engagement, etc. And the reviewer may well say things like:

“This is a good publication record but the last five years appear to have been handle-turning rather than breaking new ground. They will continue to bring in grants for a few years”

“If the department wishes to go into [Quantum Animatronics] then this candidate shows they have the potential to create a world-class centre. But only if you agree to support a new laboratory.”

“This candidate has a valuable new technique which can be applied to a number of fields. If the department wishes to become multidisciplinary then you should appoint them”

And so forth. None of these are context-free.

I understand that there are some US institutions that appoint chemists solely on the number of publications they have in J.Am.Chem.Soc. (even Nature and Science don’t count). This has the advantage that it is delightfully simple and easy to administer. Given a set of candidates, even a 5-year-old could do the appointing. And it saves so much money. My comment is unspoken.

 


5 Responses to What’s wrong with Scholarly publishing? Measuring quality

  1. Zen Faulkes says:

    “What is the single raison d’etre of the Journal Impact Factor in 2011?”
    For me, it’s to ensure that the journal I submit to is a real scholarly publication. There are a lot of new online journals opening up. Some of them are not credible. For me, that a journal has an Impact Factor lets me know that sending a manuscript there is not just the equivalent of burying the paper in my backyard.

    • pm286 says:

      Thanks,
      Point taken (to the extent that the organisation that assigns the JIF has looked at all journals and made some – possibly arbitrary – decision). This might expand the answer beyond a single point.

  2. Laura Smart says:

    “What is the single raison d’etre of the Journal Impact Factor in 2011?”
    Ultimately it boils down to evaluating academics. As Zen Faulkes says, academics do use it as a measure of quality for journals where they may choose to publish, however flawed it may be. It’s an easy shorthand. Everybody within the current academic publishing system uses it in this fashion, whether it be grant reviewers, hiring committees, tenure committees, peer reviewers, or faculty considering where to volunteer their effort as editors/editorial board members. Grant-providing bodies use it when evaluating the publications produced from awards. Publishers may use it slightly differently: as a marketing tool for selling value. But who are they marketing to? The academics who are using the journal impact factor to evaluate one another’s worthiness.
    It’s been said for 15 years (or more) that the responsibility for changing the scholarly publishing system rests with changing the organizational behavior of the institutions producing the scholarship. People have to stop using journal impact factor as a judgment tool. This won’t happen until there is incentive to change. The serials pricing crisis and usage rights issues haven’t yet proved to be incentive enough, despite lots of outreach by librarians and the adoption of Open Access mandates by many institutions.
    Scholars won’t change their behavior until the current system affects their ability to get funding, get tenure, and advance their careers.

    • pm286 says:

      Thanks,
      I agree with your sentiment though I have a different take on the Impact Factor

    • Zen Faulkes says:

      “Everybody within the current academic publishing system uses it in this fashion whether it be grant reviewers, hiring committees, tenure committees, peer reviewers, or faculty considering where to volunteer their effort as editors/editorial board members.”
      That’s an example of why so many people get upset about it; the Impact Factor is too widely used. An author using it as a quick guide to whether a journal is credible is one thing. A tenure committee using it to decide the overall quality of a long-term research program is quite another.
