What’s the Real Value of a Scholarly Publication? Part I

I’ve been invited to a very timely meeting in Oxford next week to discuss the future of scholarship: “Open Science and the Future of Publishing”. The question I want to ask is (roughly):

“We the public pay 10 billion USD annually in journal subscription fees [*] and 200 billion USD for research; what value do WE get? And what value do WE lose by closed access?”

[*] Throughout this post I use guesstimates which are probably off by half an order of magnitude either way (i.e. a factor of 3). This is partly because much of the information is secret (and some so secret that you will be sued if you divulge it) and partly because academia and we the public don’t yet care enough to find out. I am also removing CC-BY publications from the argument to avoid having to say “except for CC-BY” all the time. It’s about 5% of the market, if that. So I’d like your help.

I am also working this up for a (unfortunately virtual) presentation I am giving in Poland next month. I am taking my text from Wikipedia: (This is 6 years old and not disputed so I take it as more-or-less correct. If anyone can fault this, we shall all benefit)

Let me tackle COST and PRICE first.

The COST to the public purse of scholarly publishing is of the order of 10 billion USD. There are also contributions from industrial subscriptions, and from student fees, and 1% from pay-per-view, but the bulk is from taxpayers. In return for this the public get virtually no value or rights. If you the public, you the government, you the NHS want to read a paper you either have to pay again or walk to St Pancras and read it in the British Library premises (you cannot get this online because of publisher restrictions: mad and sad but true. The BL even charges me to read my own CC-BY papers if I’m not at St P.).

This is set by the PRICE of electronic journals. This bears no relation to the COST of production. The cost of production can be very low. It’s USD 7 for arXiv (not peer-reviewed) and about 100 USD for Acta Cryst E (a very high-quality peer-reviewed data journal). In an efficient organisation it’s inconceivable that the COST of production of a journal article is more than 200 USD. Any higher PRICE comes from the following:

  • The ADDED_VALUE that the publishers assert they add
  • Inefficiencies (often gross) in the publishing system. (For example almost all author manuscripts are retyped from scratch).
  • Profits

Publishers like Nature estimate costs-per-paper at 20,000 USD. That is not related to the cost of production but something else. Perhaps the high rejection rate? The basis of these “costs” is kept highly secret.

The PRICE of pay-per-view articles (about 35 USD for one day’s rent) is the only part with real elasticity. The only evidence I have is from my FOI requests to the Oxford and Cambridge University presses (they are public organizations, parts of the Universities, so have to reply; if you want publishing facts, consider University presses).

CUP:

In 2010, 13,646 articles were purchased as PPV. In 2010, the total number of articles for potential purchase via CJO was 680,000. Revenues from PPV approximated to 1.3% of Journal subscription revenues in 2010.

OUP:

In 2010, 37,157 PPV articles were purchased [OUP do not know how many purchasable articles they publish]. PPV represents around 1.5% of total journal subscription income.

I take heart from the consistency of the figures (TWO coincident points!) and surmise that other publishers get 1.5% of their income from pay-per-view. It’s possible, but unlikely, that the large profits of other publishers come from pay-per-view, but I and you will doubt that. It’s clear that the price is far too high and it amazes me that publishers still use these levels which were, I assume, set by the cost of paper in interlibrary loans. I’m no economist, but it’s actually stupid to run these prices. If they cut their prices to a fifth (7 USD) and gained 5 times more custom they’d still make the same income, incur no more costs (really!) and gain a great deal of goodwill. And even if they gained no more readers they’d only have lost about 1% of their income. But they probably know something about a small subset of customers who have to use this service and they don’t care about everyone else. Which is also inelastic.
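
The arithmetic behind that paragraph can be sketched in a few lines of Python. This is a beer-mat check, not a pricing model: the 1.5% share is the OUP figure quoted above, the prices are the ones in the text, and the function name is purely illustrative.

```python
# Beer-mat sketch of the pay-per-view (PPV) argument. Illustrative only.
# Assumption: PPV brings in about 1.5% of subscription income (OUP's figure).
PPV_SHARE = 0.015
OLD_PRICE_USD = 35.0
NEW_PRICE_USD = 7.0  # one fifth of the old price


def ppv_share_after_cut(demand_multiplier: float) -> float:
    """PPV income after the price cut, as a fraction of subscription income.

    demand_multiplier is how many times more articles are bought at the
    new price (1.0 means no extra readers at all).
    """
    price_ratio = NEW_PRICE_USD / OLD_PRICE_USD
    return PPV_SHARE * price_ratio * demand_multiplier


break_even = ppv_share_after_cut(5.0)      # five times the custom
no_new_readers = ppv_share_after_cut(1.0)  # nobody extra turns up
income_lost = PPV_SHARE - no_new_readers

print(f"5x custom: PPV is {break_even:.2%} of subscription income")
print(f"no new readers: PPV is {no_new_readers:.2%} of subscription income")
print(f"income lost in the worst case: {income_lost:.2%}")
```

On these assumptions the worst case (no new readers at all) costs roughly 1.2% of subscription income, which is the “about 1%” in the text.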

If any closed access publisher can give figures here we’d be delighted.

It’s also a serious condemnation of the effort to promote scholarship. Only 2% of all articles are ever purchased each year. I imagine the 680,000 includes historical articles, and if we take this as 50 years, then each modern article is purchased about once each year. Which shows that its value to the public is almost zero.
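
For anyone who wants to check the sums, here is a minimal sketch using the CUP figures. It assumes the 680,000 articles are spread evenly over 50 years and, generously, that every purchase was of an article from the current year, so the last number is an upper bound.

```python
# Figures from the CUP reply; the 50-year span is the post's assumption.
PPV_PURCHASES_2010 = 13_646
ARTICLES_AVAILABLE = 680_000
YEARS_OF_BACKFILE = 50  # assumed, not a CUP figure

# Share of the whole catalogue bought in a year.
fraction_purchased = PPV_PURCHASES_2010 / ARTICLES_AVAILABLE

# Assumption: the same number of articles was published every year.
articles_per_year = ARTICLES_AVAILABLE / YEARS_OF_BACKFILE

# Upper bound: pretend every purchase was of an article from the current year.
purchases_per_new_article = PPV_PURCHASES_2010 / articles_per_year

print(f"fraction of catalogue purchased: {fraction_purchased:.1%}")
print(f"articles published per year: {articles_per_year:,.0f}")
print(f"purchases per new article (upper bound): {purchases_per_new_article:.2f}")
```

If some of the purchases are of older articles, the figure per modern article is lower still.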

We now need to establish the cost of publicly funded (including charity-funded) research. I have asked many times without finding authoritative results. So here’s a beer-mat calculation, and allow ± half an order of magnitude. I approach it from these directions:

  • Wellcome Trust allow about 2% of a grant to cover publishing. So if scholarly publishing is USD 10 billion, then public research is 500 billion USD
  • The income for Cambridge, Stanford, etc is ca 500 million. Assume 1000 research universities in the world (can anyone do better?) and a power law and we get ca USD 200 billion
  • The NIH is funded at USD 35 billion. It’s probably the largest, but add in national funders and you are well over USD 100 billion

Let’s use a figure of USD 200 billion (though I am sure it’s higher).
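
As a sanity check, the three estimates can be compared against the 200 billion working figure, allowing half an order of magnitude either way. The sketch below takes the figures from the list above; the NIH-based number is treated as a lower bound of 100 billion, and the helper name is hypothetical.

```python
# Beer-mat check that the three estimates agree to within half an order
# of magnitude of the working figure. All inputs come from the text.
PUBLISHING_SPEND_USD = 10e9
WELLCOME_PUBLISHING_FRACTION = 0.02  # about 2% of a grant
WORKING_FIGURE_USD = 200e9
HALF_ORDER = 10 ** 0.5  # about 3.16, i.e. "a factor of 3"

# If publishing is 2% of research spending, research = publishing / 0.02.
wellcome_estimate = PUBLISHING_SPEND_USD / WELLCOME_PUBLISHING_FRACTION


def within_half_order(estimate: float, reference: float = WORKING_FIGURE_USD) -> bool:
    """True if estimate lies within a factor of sqrt(10) of reference."""
    return reference / HALF_ORDER <= estimate <= reference * HALF_ORDER


estimates = {
    "Wellcome 2% rule": wellcome_estimate,
    "university incomes with a power law": 200e9,
    "NIH plus national funders (lower bound)": 100e9,
}

for name, value in estimates.items():
    print(f"{name}: {value / 1e9:,.0f} billion USD, "
          f"within range: {within_half_order(value)}")

# Publishing spend as a share of the working research figure.
publishing_share = PUBLISHING_SPEND_USD / WORKING_FIGURE_USD
print(f"publishing as a share of research: {publishing_share:.0%}")
```

All three land inside the band, which is about as much as a beer-mat calculation can claim.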

I’m now using VALUE in the sense (from Wikipedia):

Value in the most basic sense can be referred to as “Real Value” or “Actual Value.” This is the measure of worth that is based purely on the utility derived from the consumption of a product or service. Utility derived value allows products or services to be measured on outcome instead of demand or supply theories that have the inherent ability to be manipulated. Illustration: The real value of a book sold to a student who pays $50.00 at the cash register for the text and who earns no additional income from reading the book is essentially zero. However; the real value of the same text purchased in a thrift shop at a price of $0.25 and provides the reader with an insight that allows him or her to earn $100,000.00 in additional income is $100,000.00 or the extended lifetime value earned by the consumer. This is value calculated by actual measurements of ROI instead of production input and or demand vs. supply. No single unit has a fixed value. Value is intrinsically related to the worth derived by the consumer. [Burke(2005)].

And asking “What VALUE do the public get for their 200 billion dollars?”

and

“What extra VALUE would they get if the research was published openly?”

And again, if you have insights let me know.

16 Responses to “What’s the Real Value of a Scholarly Publication? Part I”

  1. Anna Sharman says:

    I’d like to comment on one point you make: “In an efficient organisation it’s inconceivable that the COST of production of a journal article is more than 200 USD.” As someone who has been employed by and worked freelance for many journal publishers, I don’t think it is at all inconceivable that it is much higher than this. Firstly, you seem not to be separating the costs of peer review and of distribution. The main reason Nature’s costs are so high is because they organise the peer review of many papers that they end up rejecting. But concentrating just on production, copyediting a full length research paper takes an editor a minimum of four hours, I’d say, and freelance editors are generally not paid below about 25 USD per hour, so that is at least 100 USD before you have factored in hosting the servers, getting or writing and supporting software and databases for manuscript management and publication, a proportion of office overheads, etc etc. So the cost of production is certainly higher than you estimate. And please don’t say that copyediting isn’t necessary – in my experience there are extremely few papers that don’t have at least one substantive error that is spotted in copyediting, and most have ambiguities or unclear writing or figures.

    • pm286 says:

      Thank you Anna,
      This is a helpful response.
      I used the word “production” which is exactly the same as you did. I made no comment on the peer-review.

      I am going to say exactly that: publisher-provided copy-editing is not essential. When scientists create other documents such as grant applications, reports, etc. they don’t get third parties to edit them. What’s so special about papers? And note that Springer have abandoned copy-editing and asked graduate students to do it for free.

      Cameron Neylon has publicly stated that the contribution of publishers to his last 10 papers was ZILCH. Maybe he is exceptional.

      So I stand by my figures. IUCr produce papers, get them reviewed by humans and machines, distribute them for 100 USD. That’s because they are more imaginative and more efficient. They may have fewer words but they have a lot more complex data. We have to create publishing systems without copy-editors.

      But, of course, that’s not the main financial problem. 100 USD for copy-editors is a small hole in 20,000 USD

  2. The conventional wisdom is that copy-editing is valuable, though “whether the magnitude of improvement is worth the effort, is a separate question.”[ref]

    I say that it is, particularly for authors for whom English is a second language. Does poor expression in the writeup reflect poor work throughout the study? Probably not, but a manuscript with linguistic errors does raise that question in the mind of the reader.

    A fresh eye on any piece of writing by someone who is NOT familiar with the work is valuable in ensuring that there are no infelicities of style or structure, that footnotes are consistent, that tables and figures correctly referenced, and that internal references to other parts of the text are unambiguous. This saves time and frustration for all subsequent readers.

    The more difficult question as to whether the author’s assertions are supported by the evidence presented is in fact better addressed using an open reviewing process, such as that well-established at BioMedCentral. The evolving debate around an article over the course of its review process is fascinating to follow and arguably more illuminating than if the article had merely appeared in its final form.

    A clear, flawlessly presented manuscript will help everybody along the way.

    [Ref: see: Goodman SN et al. Manuscript quality before and after peer review and editing at Annals of Internal Medicine. Ann Intern Med 1994;121:11-21.
    Pierie JPEN et al. Readers' evaluation of the effect of peer review and editing on quality of papers in Nederlands Tijdschrift voor Geneeskunde. Lancet 1996;348:1480-3.
    Quoted in Godlee F, Jefferson T. Peer review in the health sciences. London: BMJ Books, 1999.]

    Conflict of interest: I make a humble living editing scientific manuscripts.

    • pm286 says:

      Thanks very much, Douglas
      Useful contribution

      I can accept that copy-editing is a useful contribution. It’s a clear service to the author and reader. Does it have to be provided by the publisher? Or could we have a separate copy-editing service, run on a reasonable market basis? I can see c/e being outsourced to Mechanical Turk quite effectively. Better than Springer asking students to do it for free.

  3. >Does it have to be provided by the publisher?

    Not any more. It made sense to combine copyediting with markup for the typesetter in dead tree dissemination systems, but in modern systems it is probably best conceived as a staged process, in which the authors first make appropriate checks to ensure that they have indeed communicated as they intended, and a second stage in which that communication is placed in its global public context.

    The first stage involves applying a set of style heuristics to a text—(“Avoid acronyms. If an acronym can’t be avoided, ensure it is spelled out on its first occurrence.” etc etc). The second ensures conformance with the many exigencies of the placement and perpetuation of an article in the electronic firmament (DoI references, XML markup schemas, server admin, article-level metrics, commenting systems etc etc).

    I’m not sure that the randomness of Mechanical Turk would be the best place for authors to turn to for support in the first stage: rather, see, for example

  4. This piece, by a Harvard scientist who’s also worked in publishing, is a sane discussion of the value added by the “glamour mags,” and the implications for the open access movement.

  5. Jim A says:

    Of course if the publishers charged only $7 per article, they’d probably sell fewer subscriptions. Possibly far fewer. Academic articles are kind of like movie stars. A very tiny number are wildly popular, but most of them are quickly forgotten, even those in journals with high impact factors.

  6. Mike Taylor says:

    “Inefficiencies (often gross) in the publishing system. (For example almost all author manuscripts are retyped from scratch).”

    That can’t be true.

    It just can’t.

    For one thing, if it was, I’d see a lot more typos in my papers that I didn’t perpetrate, and a lot of my own typos inadvertently fixed.

    • pm286 says:

      I don’t have time to research the “almost all” but I have enough anecdotal experience to know that it’s common. The typists create XML.

      Readers, Can we have a definitive answer, please?

  7. Mike Taylor says:

    “If any closed access publisher can give figures here we’d be delighted.”

    Most people would agree that the term “closed-access” is a bit clumsy, being a rather contrived opposite of “open-access”. I have recently started using the term “barrier-based” for this kind of publishing business, as the model is entirely dependent on imposing artificial barriers to access and then charging to lower them. If you find that term useful, let’s start spreading it.

    • pm286 says:

      The common term is “toll-access” or TA.
      The terminology in this whole area is awful, often deliberately confused.
      I suggest we expose the whole terminology problem on the open-access list.

      • Mike Taylor says:

        Yes, I see “toll-access”, and I know that the publishers themselves prefer terms like “subscription model”. But at bottom, it’s all about barriers, so I think that “barrier-based” is the most honest and descriptive term. Plus it helps keep the discussion focussed where it belongs.

        When I think about paywall systems, Athens, Shibboleth, LOCKSS, CLOCKSS and all the rest, I am reminded of this observation from Henry Spencer’s fine old document Ten Commandments for C Programmers: “thy creativity is better used in solving problems than in creating beautiful new impediments to understanding”.

