petermr's blog

A Scientist and the Web


What’s wrong with scholarly publishing? The size of the problem

In previous posts (and immediate backtracks) I have started to address the question of what is wrong with scholarly publishing. I haven’t actually established yet that there *is* anything wrong; I’ll do that in a day or two (symptoms and causes).

What is the size of the global scholarly research industry? What is the world GDP of academia? I have put this question to many people without getting an answer. I’ll explain what I mean…

Money is given publicly to institutions to carry out research. These are mainly universities, but also local, national and international research institutions (STFC, CSIRO, national labs…), including charities (e.g. Cancer Research UK). I am restricting this to research work: I exclude private contract work (e.g. work for hire that is unlikely to be published), teaching and other non-research activities. I also exclude work within for-profit companies (e.g. Glaxo Group Research, now GSK, where I used to do research). There is an expectation that this work will be “published” or “made public” – here I don’t address what this means – I shall later. The money is usually publicly accountable and may even be published. It includes funding to academia from for-profits where the contract is for “research” – this often means that the results are expected to be “published” and there is often a reduced overhead (fee) from the institution. (For example, we have had funding from Microsoft and Unilever, some pharma and some for-profit publishers.) The ethics of this is not in question here – I am simply establishing the scale. The point is that this “academic industry” – and such it is – is coupled to scholarly publishing in a bizarre manner, and one which I shall argue is deeply unhealthy.

So I am going to conflate terms and use “academia” to mean the institutions above. Companies (such as Glaxo and Microsoft) ultimately rely on sales and stock price for their measure of worth. Scholarly research increasingly relies on publication metrics.

So how large is academia? I find it very hard to get figures (and that is the value of a blog – I hope that some readers can help). I am happy if the figures are within half an order of magnitude – a factor of 3 either way.

I come from these directions:

  • When the Wellcome Trust fund research they allow about 1-2% for publishing. Scholarly publishing is about 10 billion /year (GBP, USD, Eur … the units are lost in the noise). So the associated research is 50-100 times higher => 500-1000 billion
  • The top universities (Cambridge, Stanford, Harvard) get about 500 million/year. There are probably about 10,000 academic institutions (with a long tail). Truncate the tail at 1000 and we might get 500 * 1000 => 500 billion

(There are limits – research is much greater than scholarly publishing and is less than the GDP of the planet). So let’s assume 500 billion.
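The two estimates above can be written out as a quick sanity check. This is only a sketch of the back-of-envelope arithmetic, using the rough figures quoted in this post; the function and constant names are made up for illustration, and "half an order of magnitude" is taken to mean a factor of the square root of 10 (about 3).

```python
import math

BILLION = 10**9
MILLION = 10**6

# Figures as quoted in the post; currency units are ignored ("lost in the noise").
PUBLISHING_MARKET = 10 * BILLION       # scholarly publishing, per year
RESEARCH_MULTIPLIERS = (50, 100)       # publishing is 1-2% of research, so 50-100x
TOP_UNIVERSITY_INCOME = 500 * MILLION  # a top university, per year
EFFECTIVE_INSTITUTIONS = 1000          # long tail of ~10,000 truncated at 1000


def estimate_from_publishing(market, multipliers):
    """Return the (low, high) research spend implied by the publishing share."""
    low, high = multipliers
    return market * low, market * high


def estimate_from_institutions(income, count):
    """Return total spend if `count` institutions each received `income`."""
    return income * count


def within_half_order(a, b):
    """True if a and b differ by at most half an order of magnitude (10**0.5)."""
    return max(a, b) / min(a, b) <= math.sqrt(10)


low, high = estimate_from_publishing(PUBLISHING_MARKET, RESEARCH_MULTIPLIERS)
by_institution = estimate_from_institutions(TOP_UNIVERSITY_INCOME, EFFECTIVE_INSTITUTIONS)

print(f"From publishing share: {low // BILLION}-{high // BILLION} billion")
print(f"From institutions:     {by_institution // BILLION} billion")
print("Agree within tolerance:", within_half_order(high, by_institution))
```

Even the high end of the first estimate is only a factor of 2 from the second, which is inside the tolerance I said I would be happy with.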

That’s a large industry. Most industries of that size have developed an information infrastructure (e.g. for suppliers, for metrics, for government). Academia has not. Academia has let others produce information products which they then buy. Unlike some industries which regulate their information infrastructure (think supermarkets) academia lets others do this.

This has a cost – a serious cost. There is a direct cost in the information products. If we (i.e. academia) wish to get information on scholarly output (mainly scholarly publishing) we have to pay others for their information products. We have not designed these information products, nor, as far as I know, have we challenged their design and content – we take them as givens. But this (perhaps 10 billion) is not the major problem.

It gives rise to the much more serious cost – we make decisions based on information over which we have no control. The irony is that much of this basic information – the scholarly publications – is initially produced by us – in electronic form. Any competent industry would immediately use this information itself – in the overall picture the cost is a tiny fraction of 500 billion (a concerted world-wide effort in academia would create at-source metrics for a few billion at most).

A feature of academia is that it is a Holy Roman Empire of thousands of players. Each tries to solve these problems by itself. In the UK every university has to create its own system for the upcoming REF (assessment exercise). Whether you think the REF is a good thing or not it seems certain that it does not compare to the competence that would be found in an industry. Yes, industry can foul up on IT and frequently does, but academia usually doesn’t even get started. Taking the axiom that the UK wishes to measure 100 institutions in the REF it seems extraordinarily inefficient to expect each to create its own information system.

The vacuum of a proper information infrastructure for the world-wide academic industry is exacerbated by the apparent need for every institution to compete aggressively against every other. In most industries this is tackled by mergers and acquisitions. When I worked in Glaxo, Richard Sykes (CEO, and then Rector of Imperial) argued that in most businesses the market leader had about 30%, and that in pharma the largest (Glaxo) had 5%. (So he went out and bought Wellcome.) In universities I suspect the leader is about 0.1%. I am not saying universities should merge – I am arguing that because there is a plethora of competing institutions, the information infrastructure is archaic and exploited.
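As a rough check on that 0.1%, here is the share implied by the ballpark figures used earlier in this post (about 500 million/year for a top university, about 500 billion/year for academia as a whole). Treating income share as "market share" is an assumption made for illustration, and the names below are hypothetical.

```python
# Rough figures from the post; income share stands in for "market share" (an assumption).
TOP_UNIVERSITY_INCOME = 500_000_000   # about 500 million/year
ACADEMIA_TOTAL = 500_000_000_000      # about 500 billion/year


def market_share_percent(leader_income, industry_total):
    """Return the leader's income as a percentage of the industry total."""
    return 100 * leader_income / industry_total


share = market_share_percent(TOP_UNIVERSITY_INCOME, ACADEMIA_TOTAL)
print(f"Largest university: about {share:.1f}% of the sector")
```

That is a factor of 50 below the pharma figure and a factor of 300 below the 30% quoted for most businesses.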

The malaise in scholarly publishing is directly of academia’s own making. We have failed to notice, let alone adjust, our own business processes, with the result that others are doing it for us. And not in response to our needs but to what benefits their markets.

And in the holy market economy this is regarded as a good thing. The fault, dear Brutus, is that we have been sleeping for about 30 years and have not wakened to the fact that we are Gulliver-like tied and restricted. But if we work together on this we are vastly the largest player in the marketplace. In principle we can collectively shape our information infrastructure, especially scholarly publishing, to whatever we want.

It is not too late, but it is getting that way. I am always grateful for feedback. My next sortie, unless feedback takes me elsewhere, will be to examine the symptoms of the dystopia.

4 Responses to “What’s wrong with scholarly publishing? The size of the problem”

  1. Chris Rusbridge says:

    Peter, some estimates. HEFCE distributes £6.5B in grants to English Universities (including teaching as well as research). SFC in Scotland distributes £800M approximately. Wales and Northern Ireland are much smaller, and thus well within your margin of error. Call it around £7.5B all told.

    The Guardian datablog article on research funding says the RCs distributed £3.3B in 2008/9 (different time base but not important given the margin of error). They also mention £2.2B of higher ed funding councils, but that’s already covered. The same article says there is an additional £2.2B from other government funding including NHS. That leaves charities; Wellcome awarded around £450M in grants in 2009-10 (some of which are multi-year, but again OK for the margin of error). It looks like CRUK distributed around £300M in a similar period. I’m assuming those are around half the charitable total, so maybe around £1.5B.

    So that gives us university/research funding at a bit over £12B! Quite impressive for a small country…

    • pm286 says:

      Many thanks Chris,
      This is an excellent third axis.
      This is a minimum as we can add on the for-profit funding (Microsoft, Rolls-Royce, etc.), the overseas stuff (NIH, HHMI), Foundations (Gates). Perhaps 1-2 B? Say 14B and again take the long tail of countries – say 100 countries and so multiply by 20 => 280B GBP == 400 B USD. Certainly close to the ballpark of 500 B USD

  2. Barbara Fister says:

    Go, go, go!! another brilliant contribution to a great series of posts.

    • pm286 says:

      There are at least the following to come:
      * how it used to be
      * the symptoms
      * the causes of the disease (though that may be implicit)
      * the remedies
      * the outcome (i.e. if the remedies are not applied)
