petermr's blog

A Scientist and the Web


Archive for July, 2011

What’s wrong with Scholarly Publishing? New Journal Spam and “Open Access”

Saturday, July 16th, 2011

I got the following SPAM (unsolicited bulk mail) today. (There seems to be an assumption that SPAM for conferences, journals, etc. is OK. It’s not. It wastes my time and leads to errors. If I get (say) 5 invitations a day to “speak” at conferences whose acronyms I don’t know, I miss those few which genuinely want me to attend. It’s irresponsible and unacademic.) From today:

Dear Prof. Murray-Rust:
Greetings! I hope you are well. On behalf of IGI Global, I would like to invite you to share your current research interests in the form of an editorship capacity. As you may know, IGI Global is an internationally-recognized publisher of high quality scholarly reference books, journals and teaching cases.

Introducing “International Research Journal of Library, Information and Archival Studies”

The International Research Journal of Library, Information and Archival Studies is a multidisciplinary peer-reviewed journal that will be published monthly by International Research Journals. IRJLIAS is dedicated to increasing the depth of the subject across disciplines with the ultimate aim of expanding knowledge of the subject.

Call for Research Articles

We invite you to submit a paper /abstract /poster /workshop to the 4th Qualitative and Quantitative Methods in Libraries International Conference (QQML2012), 22 – 25 May 2012, Limerick, Ireland.


2011 the 2nd International Conference on system science, engineering design and manufacturing informatization (ICSEM 2011 )

On behalf of the Scientific and Organizing Committees it is our great pleasure to invite you, together with accompanying persons, to attend the 2011 the 2nd International Conference on system science, engineering design and manufacturing informatization (ICSEM 2011 ),

This is all simple SPAM and I have to filter it out by hand (all conferences look so similar no machine learning will work). But because I am blogging on scholarly publication I stopped to look at the following – and it’s an excellent illustration of New Journal SPAM. Firstly it is, of course, simple SPAM because I didn’t ask for it in my mail box. But it’s more instructive than that.


Dear Researcher,

Greetings from the Modern Economy(ME) ,which is published by Scientific Research Publishing ( SRP ), USA.The aim of the International Journal of Modern Economy is to provide a forum for scientists and social workers to present and discuss issues in international economics.

Normally this goes in the SPAM bin immediately but I thought I’d follow this up. Needless to say I haven’t heard of SRP. So I went to their home page. 150 Open Access journals. Wow! This must be a GOOD THING…

Hang On… “Open Access” does not equal “good”. Open Access can be good or bad or in between. Open Access means only one thing – anyone can read it without payment. “Open” is now frequently being used in the same way as “Healthy” or “Green”. More a marketing term than a precise description. “Open” does not always mean Open Definition compliant (I’ll leave this as a surprise…). And even if it does, that is all it means: “Free to use, re-use and redistribute for any purpose and without restriction save for acknowledgement”. That does not mean good or bad, useful or useless. Be very clear on that because there are a large number of new Open Access journals and IMO some of them use Open as a marketing term.

So, anyway, SCIRP publishes chemistry. There are very few Open chemistry journals (the only non-specialist one is Beilstein Journal of Organic Chemistry – PLoS doesn’t do chemistry). So a new one is welcome – in principle. Let’s have a look at the “International Journal of Organic Chemistry”.

It’s got an ISSN – that simply requires payment. SCIRP is a member of CrossRef. I do take some assurance from that – I know the Crossref people and I assume they have some minimal barrier to entry. They have rules for membership. These mainly relate to the management of metadata and DOIs (which is Crossref’s business). To continue, let’s look at the mission of the journal…

International Journal of Organic Chemistry (IJOC) is an international, specialized, English-language journal devoted to publication of original contributions concerning all field of organic chemistry.

It is an open-access, peer-reviewed journal describing the synthetic approached and the design of new molecules for applications in medicinal chemistry, but also for the synthesis of precursors for new materials (solid or liquid) that have an interest in materials sciences and nanotechnology, homogeneous and heterogeneous catalysis.

Contributions that concerns with analytical characterization, advanced techniques and the studies of properties for new synthesized molecules will also be highlighted. All manuscripts must be prepared in English, and are subject to a rigorous and fair peer-review process. Accepted papers will immediately appear online followed by printed hard copy. The journal publishes original papers including but not limited to the following fields:

    Fluorescent Molecules and Dyes,     Organo-metallics,     Polymers,     Surfactants Among Others,     Synthesis of Reagents

The journal publishes the highest quality original full articles, communications, notes, reviews, special issues and books, covering both the experimental and theoretical aspects of organic chemistry.

Papers are acceptable provided they report important findings, novel insights or useful techniques within the scope of the journal. All manuscript must be prepared in English, and are subjected to a rigorous and fair peer-review process. Accepted papers will immediately appear online followed by prints in hard copy. It will be available through

(There are some apparent illiteracies, but I’ll pass …) So far so good. Let’s look at the editorial board. Wow! 50 names (I’ve heard of one, but I don’t move much in synthetic organic circles, so don’t count that). Now the papers:

“One-Pot Three-Component Synthesis of Imidazo[1,5-a]pyridines”

I understand what that means. I know what an imidazo[1,5-a]pyridine is. Assuming this is factually correct, this is solid chemical science – potentially useful to other chemists who want to know how to make this type of compound. The bedrock of factual laboratory science.

I can text mine this! It’s Open, isn’t it? Let’s find their definition of Open Access… Can’t find it … What’s the copyright? The paper carries:

Copyright © 2011 SciRes

This is NOT OK

This is NOT OKD compliant. It might be regarded as Green Open Access but it’s not Gold. And I can’t text-mine it.

Lesson: “Open” means almost nothing unless defined.

But, at least it’s readable by anyone. So I am intrigued. I haven’t heard of SCIRP so I’ll look in Wikipedia.

What, Wikipedia? All academics know that is unregulated junk. Well, the stuff that I (PM-R) wrote in Wikipedia is correct to the best of my ability. And I am coming to believe in the correctness of Wikipedia in sciences to a greater level than many other conventional sources. Anyway maybe Wikipedia can tell us how old SCIRP is. From Wikipedia:

Scientific Research Publishing is an academic publisher of open access electronic journals. The company created a controversy when it was found that its journals duplicated papers which had already been published elsewhere, without notification of or permission from the original author. In addition, some of these journals had listed academics on their editorial boards without their permission or even knowledge, sometimes in fields very different from their own. A spokesperson for the company commented that these issues had been “information-technology mistakes”, which would be corrected.[1]

Well it’s a stub entry. From a single source. But from the various edits it seems likely that the company started in 2008 and added several new journals each month. The original stub seems to have been catalysed by a Nature article:

Sanderson, Katharine. “Two new journals copy the old”. Nature 463, 148 (2010)

Let’s have a read of that. After all Nature brands itself as “The world’s best science and medicine on your desktop”. I assume that is agreed by the whole publishing community. (But then what does “best” mean? I can brand my science as the “best” approach to semantic chemistry.) Let’s have a look:

Log in

Sign up for free access to this article, or log in with your account

Why do they want me to sign up? Probably because they want to add me to their “direct mailing list”. Well, sorry Nature, I’m not going to. I’ll go to Pubmed – not much there. Pubmed are scrupulously careful not to violate the “rights” of publishers. Which means we don’t get to read things. UKPMC (on whose advisory board I am) cannot give help either. So finally to the blogosphere (through Google) and I find (quite by chance) a recent post. I’m copying it in full as it has useful links:

Adventures in fake academic publishing: SCIRP

Here’s a new ‘journal’ in philosophy published by the disreputable SCIRP. [PMR possibleexperience's phrase, of course, not mine as I have an open mind] The business model of this outfit is to charge authors for ‘publication’ in their online ‘journals’ (rather than charging readers for access to the articles), charging $300 for the first ten pages, $50 for each page thereafter, as stated in ‘author’s guidelines’:

Your paper should not have been previously published or be currently under consideration for publication anywhere else. Papers should be submitted electronically through the Open Journal of Philosophy (OJPP) Submission System. All papers are subject to peer review. After a paper is accepted, its author must sign a copyright transfer agreement with OJPP. Papers accepted for publication will be made available free online. The modest open access publication costs are usually covered by the author’s institution or research funds ($300 for each paper within ten printed pages, and $50 for each additional page). Scientific Research Publishing may grant discounts on paper-processing fees for papers from lower income countries, or by students, or authors in financial difficulty. The amount of discount will depend on a variety of factors such as country of origin, quality of the work, originality of the article, and whether this particular article was submitted at the invitation of the editor-in-chief. Since only about 20% of papers published in each issue will receive the discounts, there is no guarantee that a discount will be granted to every author who meets the requirements.

SCIRP created a stir last year when at least two of its journals were caught republishing papers without permission.

Two new journals copy the old. At least two journals recently launched by the same publisher have duplicated papers online that had been published elsewhere. Late last year, an organization called Scientific Research Publishing reproduced the papers in what its website billed as the first issues of the new journals Journal of Modern Physics and Psychology. Huai-Bei Zhou, a physicist from Wuhan University in China who says he helps to run Scientific Research’s journals in a volunteer capacity, says that the reproductions were a mistake…

  What is the quality of their publications? Here’s one from Advances in Bioscience and Biotechnology, another of their journals:

“Molecular genetic program (genome) contrasted against non-molecular invisible biosoftware in the light of the Quran and the Bible,” Pallacken Abdul Wahid, Advances in Bioscience and Biotechnology, vol. 1, no. 4, 2010, pp. 338-47.

“[The] most striking one is that a living cell and its dead counterpart are materially identical, i.e., in both of them all the structures including genome are intact. But yet the dead cell does not show any sign of bioactivity. This clearly shows that the genome does not constitute the biological program of an organism (a biocomputer or a biorobot) and is hence not the cause of “life”. The molecular gene and genome concepts are therefore wrong and scientifically untenable. On the other hand, the Scriptural revelation of the non-molecular biosoftware (the soul) explains the phenomenon of life in its entirety.”

PMR. Well at least a reputable journal from a reputable publisher would never publish an article that mixed science with religion in this way, would they? You would never get an article about proteomics and creationism from a reputable journal, would you?

Unless you know different…












What’s wrong with Scholarly Publishing? Your feedback

Saturday, July 16th, 2011

I asked a simple question:

“What is the single raison d’etre of the Journal Impact Factor in 2011?”

And have had two useful answers:

Zen Faulkes says:

July 15, 2011 at 12:19 pm 

For me, it’s to ensure that the journal I submit to is a real scholarly publication. There are a lot of new online journals opening up. Some of them are not credible. For me, that a journal has an Impact Factor lets me know that sending a manuscript there is not just the equivalent of burying the paper in my backyard.


Laura Smart says:

July 15, 2011 at 4:29 pm 

Ultimately it boils down to evaluating academics. As Zen Faulkes says, academics do use it as a measure of quality for journals where they may choose to publish, however flawed it may be. It’s an easy shorthand. Everybody within the current academic publishing system uses it in this fashion whether it be grant reviewers, hiring committees, tenure committees, peer reviewers, or faculty considering where to volunteer their effort as editors/editorial board members. Grant providing bodies use it when evaluating the publications produced from awards. Publishers may use it slightly differently: as a marketing tool for selling value. But who are they marketing to? The academics who are using the journal impact factor to evaluate one another’s worthiness.

It’s been said for 15 years (or more) that the responsibility for changing the scholarly publishing system rests with changing the organizational behavior of the institutions producing the scholarship. People have to stop using journal impact factor as a judgment tool. This won’t happen until there is incentive to change. The serials pricing crisis and usage rights issues haven’t yet proved to be incentive enough, despite lots of outreach by librarians and the adoption of Open Access mandates by many institutions.

Scholars won’t change their behavior until the current system affects their ability to get funding, get tenure, and advance their careers.

These are valuable comments and I’ll use them to introduce why I think we have created Monsters of the Scholarly Id. The JIF is probably the worst as it is not only flawed but its use shows that academia does not really care about measuring quality. The JIF was not created by academia, it was created by publishers as a branding instrument. And that is precisely what it is – a branding tool, created by the manufacturer. It was neither designed nor requested by academia but, as Laura says, it has been adopted by them. They do not control it and so they are in its grip (more later).

Brands can be valuable. The American Marketing Association defines a brand as a “name, term, design, symbol, or any other feature that identifies one seller’s good or service as distinct from those of other sellers”. Household products were branded in the 19th century by pioneers such as William Hesketh Lever: “The resulting soap was a good, free-lathering soap, at first named Honey Soap then later named ‘Sunlight Soap’.” Until that time soap had been of highly variable quality, and the branding by Lever allowed customers to associate a brand with consistent and high quality. Many other businesses followed suit.

It is fairly easy to determine whether soap is of good quality or substandard. Whether a car is reliable or breaks down. For many other products (beer, clothes, fragrances, …) the association depends on subjective judgments including a large amount of personal preference. Which brings us to the branding of journals.

I am going to argue later that we do not need journals and that they are increasingly counterproductive. However, assuming that we do need them, is branding useful? Branding is now common – the journal carries the publisher’s name and may have a consistent visual look-and-feel. But visual consistency does not mean valuable or even consistent science.

Journals are – unless you tell me otherwise – unregulated. And that’s how it should be. Anyone can set up a journal. Justus von Liebig founded a journal: “The volumes from his lifetime are often referenced just as Liebigs Annalen; and following his death the title was officially changed to Justus Liebigs Annalen der Chemie.” Many blogs have the effective status of journals and many contain high-quality scientific content. (Certainly I would encourage anyone who had something to communicate about scholarly publishing to blog it rather than using a scholarly journal such as Serials Review.) So Zen quite rightly asks (implicitly) about journal regulation.

I think he is right to ask for it, though the JIF is not a regulation – it’s a branding sought by the publisher for the benefit of the publisher. Nothing specifically wrong with that, but to assume it acts in the interests of academia is to misunderstand branding. Its primary purpose is to give a single, apparently objective and regulated, number giving an apparent indication of quality. I use the word “apparent” as that is what academia consumes, but since the process of JIF creation is not transparent it is not objective. (That is separate from whether it measures anything useful – which I believe it does not.)

Because the number of journals has risen so rapidly it is impossible, even within a field, to determine the standing of any particular one. (Why it has risen I’ll try to deal with later, but it’s not because readers are asking for more journals). So presumably we can rely on the reputation of a publisher justifying the quality of a journal.

Unfortunately not. (This news is 2 years old…) In 2009 The Scientist revealed that

Elsevier published 6 fake journals

Scientific publishing giant Elsevier put out a total of six publications between 2000 and 2005 that were sponsored by unnamed pharmaceutical companies and looked like peer reviewed medical journals, but did not disclose sponsorship, the company has admitted.



This is not denied by Elsevier who stated:

“We are currently conducting an internal review but believe this was an isolated practice from a past period in time,” Hansen continued in the Elsevier statement. “It does not reflect the way we operate today. The individuals involved in the project have long since left the company. I have affirmed our business practices as they relate to what defines a journal and the proper use of disclosure language with our employees to ensure this does not happen again.”

It is gratifying that Elsevier have indicated that there was – in 2011 language – a “single rotten apple” and that the problem has been cleaned up and we can relax for the future. And I am sure they are grateful to The Scientist for discovering the problem which lay undetected for several years. Nonetheless it shows the commercial pressure to publish journals. Unlike the journals I grew up with, which were the outputs of learned societies and were to promote science, the primary purpose of most (not all) of today’s journals is to make money. (In the MGS we subsidised the journal from the membership – tempora mutantur.) I talked about 4 years ago to someone whose business was creating new journals. His recipe:

  • Find an area (his was medical) where he could create a niche demand. The demand didn’t have to exist, it just had to be creatable.
  • Create a journal, with luminary editorial board. Find the senior editor. Academics like to be on boards. It makes them look good on their CV. Sometimes they even get jollies. (Disclaimer: I have had one free jolly – a (working) breakfast from the J. Cheminformatics: Coffee, fruit, donuts – probably 10USD).
  • Get a reasonable number to submit papers for the first issue. They won’t be critically reviewed, will they? After all it’s the editorial board. And we need it to look good. Doesn’t really matter if you take an old paper, rework it a bit as a review with some new work. And get the grad student to do the hard work of the references and some pretty pictures.
  • Get academic libraries to subscribe (this was closed access, reader pays). Most very large universities would do this.
  • Wait two years and sell the journal to a major publisher for ca 100K GBP


Everyone benefits.

Except academia, which has subscribed to yet another albatross. But there’s lots of money in the system. And anyway the researchers don’t pay, the library does. And we need the freedom to publish, don’t we?


Checklist of monsters (MOTSI) so far:

  • The branded journal
  • The new journal
  • The journal impact factor


(there’s more to come). But since this is already a long post, let’s have a separate post on the worst of the new journal…


So, Zen, we do need an independent reviewer of scholarly publishing. A consumer magazine “Which Journal?” But the impact factor, which is negotiated by publishers with non-answerable commercial companies in a closed process, does not provide it. It should be academia, but it doesn’t seem to be. After all there are more urgent things to do than monitor our own quality. We’ll do the research and let the commercial sector tell us how. (And in “commercial” I include the major non-profit societies which have become unbalanced and use publishing to fund their activities rather than the other way around.)


So your next assignment (after all we rely on citations so much)


“What is a citation?”


Answers within 24 hours welcome








What’s wrong with Scholarly publishing? Measuring quality

Friday, July 15th, 2011

I’m starting all these posts with “What’s wrong with Scholarly publishing?”. That’s because I am getting feedback, which includes young researchers who are following them, and libraries/academics who wish to use them as resource material. I’ll note that I do not put enough effort into creating hyperlinks – it takes a surprising amount of effort and I’d like to see better tools (e.g. Google or Wikipedia searches for concepts).

Blogs have a mind of their own – I didn’t know a week ago I would be writing this post – and this topic has grown larger than I anticipated. That’s partly because I think it takes us slightly closer to the tipping point – when we see a radical new approach to scholarly publishing. I’m not expecting that anything is directly attributable to this blog. But it all adds up and acts as an exponential multiplier for change.

I’ll be writing later on the dysfunctionalities of the publishing system – “Monsters of the Scholarly Id” – that we academics have unwittingly (but not blamelessly) created. These MOTSI are set to destroy us and I’ll look at each in detail and also ask for y/our input. If you want to anticipate, try today’s homework:

“What is the single raison d’etre of the Journal Impact Factor in 2011?”

Feel free to comment on this blog – I’ll give my analysis later – perhaps in 2 days.

Meanwhile, since we shall come later and in depth to measurement of quality in SchPub, let’s see how we measure quality and utility objectively.

For those of you who don’t know him, read Ben Goldacre’s column. It’s more than an attack on bad science – it’s also a simple and compelling account of how to measure things accurately and well. How to show whether product X has an effect. Whether A is better than B (and what better means). Whether government policies work.

From a Kryptonite ad: “70% of readers of WonderWomanWeekly say that Kryptonite gave their hair more life and volume”. You’ll all recognize this as marketing crap. Almost everything is untestable. How was the survey carried out (if indeed it was)? Did Kryptonite sponsor the survey? What does “volume” mean? (It’s not determined by sticking your head in a measuring cylinder?) It’s a subjective “market performance indicator”. What does “life” mean (for a tissue that is dead)?

This couldn’t happen in scholarship, because it is run by respectable academics who care about the accuracy of statements and how data is measured. To which we return later.

Is X a better scientist than Y? Is Dawn French more beautiful than Jennifer Aniston? Is she a better actress?

There are two main ways to answer these questions objectively

  • Ask human beings in a controlled trial. This means double-blinding, i.e. giving the assessors material whose context has been removed (not easy for actresses) so that the assessors do not know what they are looking at, and making sure those who manage the trial are ignorant of the details which could sway their judgment. The choice of questions and the management of the trial are difficult and cost money and effort.
  • Create a metric which is open, agreed by all parties, and which can be reproduced by anyone. Thus we might measure well-being by the GDP per head and the average life-expectancy. These quantities are well defined and can be found in the CIA factbook and elsewhere. (The association of well-being with these measures is, of course, subjective, and many would challenge it.) Dawn French and Jennifer Aniston can be differentiated by their moments of inertia.

Metrics cause many problems and trials cause many problems. This is because the whole task is extremely difficult and there is no simpler way of doing it.

Is scientist X better than scientist Y? Ultimately this is the sum of human judgments – and it should never be otherwise. What are the ten best films of all time? This type of analysis is gentle fun, and IMDB carries it out by collecting votes – and The Shawshank Redemption tops the list (9.2/10). Everyone will argue the list endlessly – are modern films more represented? Are the judgements made by film enthusiasts? Should they be? And so on.

Here’s a top ten of scientists:

and another

and another

None agree completely … and if you felt like it you could do a meta-analysis – analysing all the lists and looking for consistent choices. A meta-analysis might well discard some studies as not sufficiently controlled. I’d be surprised to see a meta-analysis that didn’t have Newton in it, for example. Note that the meta-analysis is not analysing scientists, it’s analysing analysers of scientists – it makes no independent judgment.
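For anyone who does feel like it, the crudest form of "looking for consistent choices" is to count how many lists each name appears in. The sketch below is only that: the function name `consensus`, the `min_lists` cut-off and the three sample lists are all invented for illustration and are not the lists linked above. It also ignores rank within a list and makes no attempt to discard badly controlled lists.

```python
from collections import Counter


def consensus(lists, min_lists=2):
    """Return (name, count) pairs for names found in at least `min_lists` lists."""
    counts = Counter()
    for names in lists:
        # Count each name once per list, even if a list repeats it.
        counts.update(set(names))
    kept = [(name, n) for name, n in counts.items() if n >= min_lists]
    # Most widely listed first; ties broken alphabetically for a stable order.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Placeholder data, made up for this sketch (hypothetical, not real survey results).
example_lists = [
    ["Newton", "Einstein", "Darwin"],
    ["Newton", "Curie", "Einstein"],
    ["Newton", "Darwin", "Galileo"],
]

print(consensus(example_lists, min_lists=2))
# -> [('Newton', 3), ('Darwin', 2), ('Einstein', 2)]
```

Note that the code knows nothing about science. It only tallies what the list-makers said, which is exactly the point about analysing the analysers.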

Let’s assume that a vice-chancellor or dean wishes to decide whether X or Y or Z should be appointed. Or whether A should be given tenure. These are NOT objective choices. They depend on what the goals and rules of the organization are. One criterion might be “how much money can we expect X or Y or Z to bring in grants”. We might try to answer this by asking “how much money has XYZ brought in so far?”. And use this as a predictor.

If grant income is the sole indicator of value of a human to an institution then the institution is likely to be seriously flawed as it will let money override judgments. Ben Goldacre gives examples of universities which have made highly dubious appointments on the basis of fame and money. But someone who brings in no grant income may be a liability. They would (IMO) have to show that their other scholarly qualities were exceptional. That’s possible, but it’s hard to judge.

And that’s the rub. Assessing people is hard and subjective. I think most scholars try hard to be objective. I’ve sat on review panels for appointments, reviews of institutions and departments. A competent review will provide the reviewer with a large volume of objective material – papers, grants, collaborations, teaching, engagement, etc. And the reviewer may well say things like:

“This is a good publication record but the last five years appear to have been handle-turning rather than breaking new ground. They will continue to bring in grants for a few years”

“If the department wishes to go into [Quantum Animatronics] then this candidate shows they have the potential to create a world-class centre. But only if you agree to support a new laboratory.”

“This candidate has a valuable new technique which can be applied to a number of fields. If the department wishes to become multidisciplinary then you should appoint them”

And so forth. None of these are context-free.

I understand that there are some US institutions that appoint chemists solely on the number of publications they have in J.Am.Chem.Soc. (Even Nature and Science don’t count). This has the advantage that it is delightfully simple and easy to administer. Given a set of candidates even a 5-year old could do the appointing. And it saves so much money. My comment is unspoken.


What’s wrong with scholarly publishing? Those who are disadvantaged speak

Thursday, July 14th, 2011


I publish in full an unsolicited comment, which expresses exactly why closed access publishing has become unacceptable.


Bill Roberts says:

July 14, 2011 at 8:00 am 

As a non-academic but occasional reader of published academic papers, the current system of publishing actively deters me from reading the best work of scientists. If I was a researcher working in an institution and needing to read papers every week, then I suppose the journal subscription systems are workable. But for me, every time I hit a $30/article paywall, I simply go back to google and look for blog posts or preprints (or the one or two open journals) instead.

As a researcher, then clearly there is advantage in professional status and an advantage for the institution in getting papers into prestigious journals, but this is at the cost of *actively preventing* a proportion of the potential audience for the paper from ever seeing it.

I’m sure I could get some kind of ‘affiliate membership’ of a university library and so get access that way, but the marginal benefit each time isn’t big enough to make me do that.

The web ought to be the ideal medium for coping with ‘long tail’ people like me, but as you have so clearly pointed out on several occasions, the current system of academic publishing has conspicuously failed to take advantage of the possibilities offered by the web.

Bill (who I don’t know) expresses precisely the inequity of the current system. His taxes (wherever) go to support research, go to support university libraries, but he cannot have access to the results. I am not arguing that the system should be cost-free, but that all parties should be rapidly working towards a sustainable business model. One that allows Bill to have access to the literature.

If you are an academic reading the literature, next time you celebrate another paper in NatScICell think of Bill. Think of the people suffering from the disease that you might, in years, have some comfort for. Think of the patient groups who have collected on the streets, given legacies, to fund your research. Who cannot even read what they have worked to support. It is the arrogance of academics which is fuelling this system.

And publishers (and I have been sparing of criticism so far), think whether charging $40 to read a 1 page article for 1 day (Serials Review) is advancing the cause of science. Think what the effect actually is. You are alienating Bill. The service of communication is replaced by the tyranny of gatekeeping. Bill doesn’t pay your prices and I suspect very few do. You are simply advertising that you don’t care about Bill. You can buy popular science magazines weekly for $5 – I’m not an economist but that doesn’t upset me. But $40 for 1 page for 1 day is inexcusable. And, as I shall comment on later, charging by the article ignores the fact that many readers now never read articles all the way through.

If both parties (academics and publishers) keep on in their narrow world where only the privileged exist, the scholarly world will fracture. That may be sooner than we think. Murdoch had zero public support here, and has crashed. If you need public support, you won’t find it from Bill.

What’s wrong with scholarly publishing? How it used to be

Wednesday, July 13th, 2011

While waiting for feedback (and there’s a good discussion on Friendfeed) here’s a (probably rosy-tinted) ramble through history…

I started my research almost 50 years ago and did my doctorate in 2 years (required for chemistry as we had a fourth year of research in my first degree). During that I did several crystal structures (it was becoming slightly mechanised but I had to measure up to 50,000 diffraction intensities by eyeballing photographic films *and typing them up*). I created a thesis (which did not impress my (very famous) examiners and I am told I passed on the strength of my viva). That thesis is in the Bodleian and – thanks to Sally Rumsey – should be being digitized RSN. You will then be able to judge its quality (it is really not too bad – and would now simply require corrections).

The process of publication was technically much harder. It took 2 days to create a picture of a crystal structure. It needed the coordinates creating with sine tables and worse. Then drawing with a Rapidograph. When I got it wrong I had to scrape the errors off with a razor. All data had to be punched up and included. I went straight into an assistant lectureship at the then new University of Stirling. Straightforward benevolent nepotism – I was invited to apply by my college tutor, Ronnie Bell, who took up the chair of chemistry. I was in the right place at the right time.

Anyway during my DPhil I published my first paper – a rather fun structure – in Chemical Communications. This was a new (and I felt exciting) journal where you wrote a brief account of the work and (at that stage, I think) were expected to write it up fully later (e.g. with the data). There was a feeling of competition – only interesting chemistry was accepted. Not sure I got much feedback. After getting to Stirling and recovering from the DPhil viva I wrote up the other structures and sent them off to J.Chem.Soc (the Chem Soc – now the RSC – had only one main journal (I think there was also Faraday Discussions) – but it was really a single national journal).

There was a clear feeling that you published in your national journal – UK=> JCS, Scandinavia=>Acta Chem Scand, US=>JACS, CH=>Helvetica Chimica, etc. If you had particularly specialist crystallographic material it was Acta Crystallographica (or possibly Zeitschrift fuer Kristallographie). In 1970 the world was very simple.

And then it changed. I remember in about 1972 getting a transfer of copyright form (I think from Acta Cryst, but it might have been JCS). I had no idea what it was. It was explained that this was to protect the authors from having their papers ripped off – that unless we gave copyright to the publisher they couldn’t act on our behalf.

At that stage we trusted publishers implicitly. Because they weren’t publishers. They were learned societies that we belonged to – paid membership to. That represented our interests because they were composed of us. Why would you not trust them?

Turning over our copyright was the biggest mistake that academia has made in the last 50 years. Because we handed over our soul. We didn’t even sell our soul – we gave it away. Was there an ulterior motive then? I’d like someone to tell me – I honestly don’t know whether it was a genuine idea or whether it was a con.

If we had the internet we would never have been ignorant of the issues. OKF or ORG etc. would have made it immediately clear that this was not necessary and could lead to disaster (as it has). But there was little communication – where do you look? The THES? No email, no blogs, no twitter…

And then – in about 1974 IIRC seeing Tetrahedron (or maybe Tetrahedron Letters) – a Pergamon Press journal. Pergamon was run by Robert Maxwell. It had an appealing visual quality – higher than the society journals. And it concentrated on one subject only – organic chemistry (whereas JCS and the other society journals had all subdisciplines of chemistry). It was irrelevant to me that it was commercial – I didn’t pay the bills and anyway universities had lots of money and could buy almost any journals they wanted. I’ve published in Tetrahedron and TetLetts. Why not?

And I remember going to Switzerland and, when I got interesting and important results, finding that the convention was to publish them in J. Am. Chem. Soc., not Helvetica. The first time that the choice of journal mattered. But that was because more people would read JACS than Helvetica. I didn’t feel any sense of choosing JACS because it was “better”, just that it would be better for my work.

The 1970s and 1980s had a strange step forwards and backwards – camera-ready copy. Not in most journals but in many monograph chapters. It was a quick, and I think honest, approach. We could say and draw roughly what we wanted as long as it kept within a square blue rectangle. You were responsible for your own diagrams and spelling. It wasn’t pretty but it was rapid (relatively).

And I was involved in setting up a new society (Molecular Graphics Society, 1981) which had its own journal. It was free to members. The society subsidized the journal. Through membership. And yes, we made money out of meetings by charging fees for exhibitors. I was treasurer. We were financially viable.

And then the web came – 1993. I thought it would transform publishing. It was an opportunity for the universities to show what their publishing houses could do. It was an unparalleled opportunity for a new type of scholarship. I ran the first multimedia course on the Internet (Principles of Protein Structure). They were heady days. A few people believed – for me Birkbeck and Nottingham. But generally academia was totally uninterested in the new opportunities. Why? Please tell me.

They could and should have taken charge of scholarly publishing. Instead they let (and encouraged) commercial publishers to dictate to them what publishing was and was to become.

  • Who asked for PDF? Not me and no-one I have talked to.
  • Who asked for double-column PDF? Not me and no-one I have talked to.
  • Who asked for the “paper” to remain fossilized as a paper image?
  • Who asked for the printing bill to be transferred from the publisher to the department laserjet?
  • Who asked for manuscripts to be submitted through inhuman forms and grotesque procedures?

No-one. Academia has supinely accepted anything that the publishers have offered them. And paid whatever they have asked. (Yes, you may occasionally think you have saved money, but look at the publishers’ revenue – a monotonically increasing function (maths speak for something that increases year by year inexorably).) In most industries innovation and scale have cut prices. …

…stop. I was meant to reminisce, not rant. I’ll fondly remember up to about 1990. Then it all goes wrong.


What’s wrong with scholarly publishing? The size of the problem

Wednesday, July 13th, 2011

In previous posts (and immediate backtracks) I have started to address the question of what is wrong with scholarly publishing. I haven’t actually established yet that there *is* anything wrong and I’ll do that in a day or two hence (symptoms and causes).

What is the size of the global scholarly research industry? What is the world GDP of academia? I have asked this question to many people without an answer. I’ll explain what I mean…

Money is given publicly to institutions (mainly universities but also local, national and international research institutions (STFC, CSIRO, national labs…) , including charities (e.g. Cancer Research UK)) to carry out research. I am restricting this to research work, not private contract work (e.g. work for hire that is unlikely to be published) and excluding teaching or other non-research activities. I also exclude work within for-profit companies (e.g. Glaxo Group Research (now GSK) where I used to do research). There is an expectation that this work will be “published” or “made public” – here I don’t address what this means – I shall later. The money is usually publicly accountable and may even be published. It includes funding to academia from for-profits where the contract is for “research” – this often means that the results are expected to be “published” and there is often a reduced overhead (fee) from the institution. (For example we have had funding from Microsoft and Unilever, some pharma and some for-profit publishers). The ethics of this is not in question here – I am simply establishing the scale. The point is that this “academic industry” – and such it is – is coupled to scholarly publishing in a bizarre manner, and one which I shall argue is deeply unhealthy.

So I am going to conflate terms and use “academia” to mean the institutions above. Companies (such as Glaxo and Microsoft) ultimately rely on sales and stock price for their measure of worth. Scholarly research increasingly relies on publication metrics.

So how large is academia? I find it very hard to get figures (and that is the value of a blog – I hope that some readers can help). I am happy if the figures are within half an order of magnitude – a factor of 3 either way.

I come from these directions:

  • When the Wellcome Trust fund research they allow about 1-2% for publishing. Scholarly publishing is about 10 billion /year (GBP, USD, Eur … the units are lost in the noise). So the associated research is 50-100 times higher => 500-1000 billion
  • The top universities (Cambridge, Stanford, Harvard) get about 500 million/year. There are probably about 10,000 academic institutions (with a long tail). Truncate the tail at 1000 and we might get 500 million * 1000 => 500 billion

(There are limits – research is much greater than scholarly publishing and is less than the GDP of the planet). So let’s assume 500 billion.
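The two routes above can be written out as a back-of-envelope calculation. This is only a sketch in Python: the inputs are the rough figures quoted in this post, the function and variable names are illustrative, and currency units are ignored because the estimate is only meant to be good to a factor of about 3.

```python
# Back-of-envelope estimate of the size of "academia", using the rough
# figures quoted in the post. All names are illustrative; units
# (GBP/USD/EUR) are ignored because they are lost in the noise.

BILLION = 1e9
MILLION = 1e6


def estimate_from_publishing(publishing_total, publishing_fraction):
    """Scale the publishing market up by the share of research funding spent on it."""
    return publishing_total / publishing_fraction


def estimate_from_institutions(top_income, effective_institutions):
    """Top-institution income times a truncated count of institutions."""
    return top_income * effective_institutions


# Route 1: publishing is ~10 billion/year and ~1-2% of research funding.
route1_low = estimate_from_publishing(10 * BILLION, 0.02)   # 2% share
route1_high = estimate_from_publishing(10 * BILLION, 0.01)  # 1% share

# Route 2: ~500 million/year for a top university, long tail truncated at 1000.
route2 = estimate_from_institutions(500 * MILLION, 1000)

print(f"Route 1: {route1_low / BILLION:.0f}-{route1_high / BILLION:.0f} billion")
print(f"Route 2: {route2 / BILLION:.0f} billion")
```

Both routes agree at the low end of the range, which is why 500 billion is taken as the working figure.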

That’s a large industry. Most industries of that size have developed an information infrastructure (e.g. for suppliers, for metrics, for government). Academia has not. Academia has let others produce information products which they then buy. Unlike some industries which regulate their information infrastructure (think supermarkets) academia lets others do this.

This has a cost – a serious cost. There is a direct cost in the information products. If we (i.e. academia) wish to get information on scholarly output (mainly scholarly publishing) we have to pay others for their information products. We have not designed these information products, nor – as far as I know, have we challenged their design and content – we take them as givens. But this (perhaps 10 billion) is not the major problem.

It gives rise to the much more serious cost – we make decisions based on information over which we have no control. The irony is that much of this basic information – the scholarly publications – is initially produced by us – in electronic form. Any competent industry would immediately use this information itself – in the overall picture it’s a tiny fraction of 500 billion (a concerted world-wide effort in academia would create at-source metrics for a few billion at most).

A feature of academia is that it is a Holy Roman Empire of thousands of players. Each tries to solve these problems by itself. In the UK every university has to create its own system for the upcoming REF (assessment exercise). Whether you think the REF is a good thing or not it seems certain that it does not compare to the competence that would be found in an industry. Yes, industry can foul up on IT and frequently does, but academia usually doesn’t even get started. Taking the axiom that the UK wishes to measure 100 institutions in the REF it seems extraordinarily inefficient to expect each to create its own information system.

The vacuum of a proper information infrastructure for the world-wide academic industry is exacerbated by the apparent need for every institution to compete aggressively against every other. In most industries this is tackled by mergers and acquisitions. When I worked in Glaxo, Richard Sykes (CEO, and then Rector of Imperial) argued that in most businesses the market leader was about 30%. And that in pharma the largest was 5% (Glaxo). (So he went out and bought Wellcome). In universities I suspect the leader is about 0.1%. I am not saying universities should merge – I am arguing that because there is a plethora of competing institutions then the information infrastructure is archaic and exploited.
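As a rough consistency check on the 0.1% guess, the earlier figures in this post (about 500 million/year for a top university, about 500 billion/year for academia as a whole) can be put side by side with the shares attributed to Richard Sykes. This is a sketch in Python with illustrative names; the inputs are the post's own estimates, not measured data.

```python
# Rough "market share" of a top university, using the post's own estimates.
top_university_income = 500e6  # ~500 million/year (estimate from this post)
academia_total = 500e9         # ~500 billion/year (assumed total from this post)


def market_share_percent(leader, total):
    """Leader's income as a percentage of the whole sector."""
    return 100.0 * leader / total


share = market_share_percent(top_university_income, academia_total)
print(f"Top university share: {share:.1f}%")

# Leader shares quoted in the text, for comparison.
for sector, leader_share in [("most businesses", 30.0), ("pharma (Glaxo)", 5.0)]:
    ratio = leader_share / share
    print(f"{sector}: leader ~{leader_share:.0f}%, about {ratio:.0f}x the university figure")
```

On these numbers the suspected 0.1% follows directly from the two estimates, so it is consistent with the rest of the argument rather than independent evidence for it.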

The malaise in scholarly publishing is directly of academia’s own making. We have failed to notice, let alone adjust, our own business processes, with the result that others are doing it for us. And not in response to our needs but to what benefits their markets.

And in the holy market economy this is regarded as a good thing. The fault, dear Brutus, is that we have been sleeping for about 30 years and have not wakened to the fact that we are Gulliver-like tied and restricted. But if we work together on this we are vastly the largest player in the marketplace. In principle we can collectively shape our information infrastructure, especially scholarly publishing, to whatever we want.

It is not too late, but it is getting that way. I am always grateful for feedback. My next sortie, unless feedback takes me elsewhere, will be to examine the symptoms of the dystopia.

What’s wrong with scholarly publishing? Your feedback – Why should journals exist?

Monday, July 11th, 2011

One of the features of blogging is that you get immediate feedback – some positive, some not. ALL feedback is welcomed and will be treated professionally. In conventional scholarly publication we are expected to assemble other relevant work, prior art, conflicts, etc. The blog makes this easy. If I have omitted a significant opinion or piece of work then I am likely to be informed of this. Here’s an example – I’ll reproduce the first part as it is essentially a scholarly publication, but in the form of a blog post… (Daniel Mietchen is active in OKF and ran the Open Theses earlier this year.) Thanks to Adrian Pohl (Open Bibliographic Principles fame)…

Mietchen, Pampel & Heller: Criteria for the Journal of the Future

The internet changes the communication spaces in which scientific discourse takes place. In this context, the format and role of the scientific journal are changing. This commentary introduces eight criteria that we deem relevant for the future of the scientific journal in the digital age.


The debate on the future of scholarly communication takes place between researchers, librarians, publishers and other interested parties worldwide. Perhaps appropriate to the topic, the debate has seen relatively few contributions via traditional scholarly communication channels, whereas blog posts like “Is scientific publishing about to be disrupted?” by Michael Nielsen (2009) received a lot of attention.

In light of this debate, a discussion emerged during the Open Access Days 2009 between Lambert Heller and Heinz Pampel about the changing landscape of scholarly communication in the field of library and information science (LIS). In the following months, both discussed their views with different stakeholders, including the LIBREAS editors.

In autumn 2010, Heller and Pampel started a blog, in which they document their thoughts on the current system and on the future of scientific discourse in LIS. [1] As a result, they summarised their analysis in a paper (Heller & Pampel 2010) presented at the annual conference of the German Society for Information Science and Information Practice (DGI). The core of the work is a collection of eight criteria for the future of the scientific journal in LIS.

In connection with a conference talk by Daniel Mietchen on large-scale collaboration via web-based platforms at the conference “Digitale Wissenschaft 2010” in Cologne (Mietchen 2010a), Mietchen and Pampel discussed the possibility of a transition of the criteria in a general and interdisciplinary form.

In the following, Mietchen translated the criteria into English and started an editable copy thereof at Wikiversity, a wiki for the creation and use of free learning materials and activities (Mietchen 2010b).

After a further joint discussion, the following version [2] of the criteria was formulated, with contributions from other discussants at Friendfeed [3] and Wikiversity.


Dynamics: Research is a process. The scientific journal of the future provides a platform for continuous and rapid publishing of workflows and other information pertaining to a research project, and for updating any such content by its original authors or collaboratively by relevant communities.

Scope: Data come in many different formats. The scientific journal of the future interoperates with databases and ontologies by way of open standards and concentrates itself on the contextualization of knowledge newly acquired through research, without limiting its scope in terms of topic or methodology.

Access: Free access to scientific knowledge, and permissions to re-use and re-purpose it, are an invaluable source for research, innovation and education. The scientific journal of the future provides legally and technically barrier-free access to its contents, along with clearly stated options for re-use and re-purposing.

Replicability: The open access to all relevant core elements of a publication facilitates the verification and subsequent re-use of published content. The scientific journal of the future requires the publication of detailed methodologies, including all data and code, that form the basis of any research project.

Review: The critical, transparent and impartial examination of information submitted by the professional community enhances the quality of publications. The scientific journal of the future supports post-publication peer review, and qualified reviews of submitted content shall always be made public.

Presentation: Digitization opens up new opportunities to provide content, such as through semantic and multimedia enrichment. The scientific journal of the future adheres to open Web standards and creates a framework in which the technological possibilities of the digital media can be exploited by authors, readers and machines alike, and content remains continuously linkable.

Transparency: Disclosure of conflicts of interest creates transparency. The scientific journal of the future promotes transparency by requiring its editorial board, the editors and the authors to disclose both existing and potential conflicts of interest with respect to a publication and to make explicit their contributions to any publication.

If you read the criteria they are fairly similar to mine two days ago (I think the first are concerned with the how rather than the why). Bjoern Brembs (whose talk at OKF has informed me on Impact Factors) has just posted:

July 11, 2011 at 12:28 pm

I’m with you on journals needing to go extinct. The only reason they’re still around is history. So back to history they ought to go.

So I take MPH’s points on board but think they should be part of the “publication of the future”, not the journal.

So, champions of journals (I assume there are some) please let us have your arguments PRO journals. If the reasons are branding and competition, please say so. They will be given equal space on this blog.


What is wrong with Scientific Publishing: an illustrative “true” story

Sunday, July 10th, 2011

Yesterday I abandoned my coding to write about scientific publishing, and I now have to continue in a hopefully logical, somewhat exploratory vein. I don’t have all the answers – I don’t even have all the questions – and writing these posts is taking me to new areas where I shall put forward half-formed ideas and await feedback (“peer-review”) from the community. The act of trying to express my ideas formally, for a critical audience, is helping to refine them. And I am hoping that where I am struggling for facts or prior scholarship that you will help. That’s not an excuse for laziness, it’s a realization that one person cannot address this problem by themselves.

This blog post *is* a scholarly publication. It addresses all the points that I feel are important – priority (not that this is critical), peer review, communication, re-use (if you want to), and archival (not perhaps formal, but this blog is sufficiently prominent that it gets cached. This may horrify librarians, but it’s good enough for me).

The only thing it doesn’t have is an ISI impact factor, and I’ll return to that. It does have measures of impact (Technorati, Feedburner, etc.) which measure readership and crawlership. (These are inaccurate – they recently dropped by a factor of 5 when the blog was renamed – I’d be interested to hear from anyone who cannot receive this blog for technical reasons (timeout, etc.)). Feedburner suggests that a few hundred people “read” this blog. There’s also Friendfeed where people (mainly well-aligned) comment and “like” posts; and Twitter where I have 650 followers (Glyn Moody has 10 times that) – a tweet linking to yesterday’s post has just appeared.

So the blog post fulfils the role of communication – two way communication – and has mechanisms for detecting and measuring this. As I write this I imagine the community for whom I am preparing these ideas and from whom I am hoping for feedback. Ambitiously I am hoping that this could become a communal activity – where there are several authors. (We do this all the time in the OKF – Etherpads, Wikis, etc.) And who knows, this document might end up as part of a Panton Paper. As you can tell I am somewhat enjoying this, though writing is often painful in itself.

I am going to describe new ideas (at least for me) about scholarly publishing. I am going to use “scholarly” as inclusive of “STM” and extending to other fields – because in many cases the translation is direct; where there are differences I will explicitly use STM. I like the word “scholarly” because it highlights the importance of the author (which is one of the current symptoms of the malaise – the commoditization of authorship). It also maps onto our ideas of ScholarlyHTML as one of the examples of how publication should be done.

Before my analysis I’ll give an example of the symptoms of the dystopia. This has reinforced me in my determination never to publish my ideas in a traditional “paper” for a conventional journal. Details are slightly hazy. I was invited – I think in 2007 – to write an article as part of an Issue on the progress of Open Access. Here it is

Serials Review
Volume 34, Issue 1, March 2008, Pages 52-64

Open Data in Science

Peter Murray-Rust

Murray-Rust is Reader in Molecular Informatics, Unilever Centre for Molecular Sciences Informatics, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK


It will cost you 40 USD to rent it for ONE DAY. You are allowed to print it for personal use during this period.

*I* cannot read my own article and I do not have a copy.

The whole submission process was Gormenghastian and I have ended up being embittered by it. I asked for the article to be Open Access (Green) and believed that it would be available indefinitely so that I would not have to take a “PDF copy” (which is why I don’t have one). When I discovered that I could not read my own article I contacted the publishers and was told that I had agreed to it being Open for a year after which it would be closed. Maybe – I don’t remember this but there were 100+ emails and it may have slipped my unconscious mind. If I had been conscious of it, I would never have acquiesced. It’s a bizarre condition – let people read something and then cut them off for ever. It has no place in “scholarly communication” – more in the burning of the libraries.

I took the invitation as an exciting opportunity to develop new ideas and to get feedback, so I wrote to the editor (whom I only know through email) and explained my ideas. (If I appear critical of her anywhere it is because I am critical of the whole system). I explained that “Open data” was an exciting topic where text was inadequate and to show this I would create an interactive paper (a datument) with intelligent objects. It would give readers an opportunity to see the potential and challenges of data. This was agreed and I would deliver my manuscript as HTML. I also started the conversation on Openness of the resulting article. The only positive thing was that I established that I could post my pre-submission manuscript independently of Elsevier. (I cannot do this with publishers such as the American Chemical Society – they would immediately refuse the article). I decided to deposit it in “Nature Precedings” – an imaginative gratis service from NPG. This manuscript still exists and you can copy it under CC-BY and do whatever you want with it. (No, there is no interactive HTML for reasons we’ll come on to).

I put a LOT of work into the manuscript. The images that you see are mainly interactive (applets, SVG, etc.). Making sure they all work is hard. And, I’ll admit, I was late on the deadline. But I finally got it all together and mailed it off.

Disallowed. It wasn’t *.doc. Of course it wasn’t *.DOC, it was interactive HTML. The Elsevier publication process refused to allow anything except DOC. In a rush, therefore,

I destroyed my work so it could be “published”

I deleted all the applets, SVG, etc. and put an emasculated version into the system and turned to my “day” job – chemical informatics – where I am at least partially in control of my own output.

I have never heard anything more. I got no reviews (I think the editor accepted it as is). I have no idea whether I got proofs. The paper was published along with 7 others some months later. I have never read the other papers, and it would now cost me 320 USD to read them (including mine). There is an editorial (1-2 pages which also costs 40 USD). I have never read it, so I have no idea whether the editor had any comments.

Why have I never read any of these papers? Because this is a non-communication process. If I have to wait months for something to happen I forget. *I* am not going to click on Serials Review masthead every day watching to see whether my paper has got “printed”. So the process guarantees a large degree of obscurity.

Have I had any informal feedback? Someone reading the article and mailing me?


Has anyone read the article? (I include the editor). I have no idea. There are no figures for readership.

Has anyone cited the article?

YES – four people have cited the article! And I don’t have to pay to see the citation metadata :

The dereferenced metadata (I am probably breaking copyright) is

1 Moving beyond sharing vs. withholding to understand how scientists share data through large-scale, open access databases 
Akmon, D. 2011 ACM International Conference Proceeding Series , pp. 634-635 0

2 Advances in structure elucidation of small molecules using mass spectrometry 
Kind, T., Fiehn, O. 2010 Bioanalytical Reviews 2 (1), pp. 23-60 2

3 An introduction to data mining 
Apostolakis, J. 2010 Structure and Bonding 134 0

4 Data mining in organic crystallography 
Hofmann, D.W.M. 2010 Structure and Bonding 134 0

I cannot read 3 of these (well it would cost ca 70 USD just to see what the authors said), but #2 is Open. Thank you Thomas (I imagine you had to pay to allow me to read it) [Thomas and I know each other well in cyberspace]. It is clear that you have read my article – or enough for your purposes. Thomas writes

that once data and exchange standards are established, no human interaction is needed anymore to collect spectral data [525]. The CrystalEye project ( shows that the aggregation of crystal structures can be totally robotized using modern web technologies. The only requirement is that the spectral data must be available under open-data licenses (http://www. [544].

The other three may have read it (two are crystallography publications) or they may simply have copied the reference. It’s interesting (not unusual) to see that the citations are 2 years post publication.

So in summary, the conventional publication system consists of:

  • Author expends a great deal of effort to create manuscript
  • Publisher “publishes” it through an inhuman mechanistic process; no useful feedback is given
  • Publisher ensures that no-one can read the work unless…
  • University libraries pay a large sum (probably thousands of dollars/annum each) to allow “free” access to an extremely small number of people (those in rich universities perhaps 0.0001% of the literate world – how many of you can read these articles sitting where you are?)
  • No one actually reads it


In any terms this is dysfunctional – a hacked off author, who has probably upset an academic editor, and who have jointly ensured that the work is read by almost no-one. Can anyone give me a reason why “Serials Review” should not be closed down and something better put in its place? And this goes for zillions of other journals.

Hang on, I’ve forgotten the holy impact factor…

Impact Factor: 0.707

Yup, roughly the square root of a half.

What will my colleagues say?

My academic colleagues will (unfortunately) say that I should not publish in journals with an IF of less than (??) 5.0 (J Cheminfo is about 3). That in itself is an appalling indictment – they should be saying “Open data is an important scholarly topic – you make some good points about A,B,C and I have built on them; You get X, Y wrong and you have completely failed to pay attention to Z.”

My Open Knowledge and Blue Obelisk colleagues will say – “this is a great start to understanding and defining Open Data”.

And I can point to feedback from the gratis Nature Precedings.

This has:

  • 11 votes (whatever that means, but it probably means at least 11 people have glanced at the paper)
  • A useful and insightful comment
  • And cited by 13 (Google) (Scopus was only 4). These are not self-citations.

So from N=1 I conclude:

  • Closed access kills scholarly communication
  • Conventional publication is dysfunctional

If I had relied on journals like Serials Review to develop the ideas of Open Data we would have got nowhere.

In fact the discussion, the creativity, the formalism has come through creating a Wikipedia page on “Open data” and inviting comment. Google “Open Data” and you’ll find it at the top. Google “Open data in science” and the gratis manuscript comes top (The Elsevier article is nowhere to be seen).

As a result of all this open activity I and others have helped to create the Panton Principles. As you will have guessed by now I get no academic credit for this – and my colleagues will regard this as a waste of time for a chemist to be involved in. For me it’s true scholarship, for them it has zero “impact”.

In closing I should make it clear that Open Access in its formal sense is only a small advance. More people can read “it”, but “it” is an outdated, twentieth century object. It’s outlived its time. The value of Wikipedia and Nature Precedings for me is that this has enabled a communal journey. It’s an n<->n communication process rooted in the current century.

Unless “journals” change their nature (I shall explore this and I think the most valuable thing is for them to disappear completely) then the tectonic plates in scholarly publishing will create an earthquake.

So this *is* a scholarly publication – it hasn’t ended up where I intended to go, but it’s a useful first draft. Not quite sure what – perhaps a Panton paper? And if my academic colleagues think it’s a waste, that is their problem and, unfortunately, our problem.

[And yes – I know publishers read this blog. The Marketing director of the RSC rang me up on Friday as a result of my earlier post. So please comment.]




What is wrong with Scientific Publishing and can we put it right before it is too late?

Saturday, July 9th, 2011

I sat down today to write code and found that I couldn’t – I had to write about science publishing, so here goes. I intend this to be the first of several posts. I often blog in forceful style (rant?) but here will try to be as objective as possible. I’d like to start a discussion and engage responsible STM publishers. I’d like to see if we can define what the basis of publishing is. Why? And how?

But I am going to start with a strong assertion. STM publishing is seriously broken and getting worse. It is being driven by forces largely outwith the directing influence of the scientific community (although not necessarily outwith their ultimate control). This is manifested by activities which have nothing (in my view) to do with science, and I will explain that.

A brief topical aside. Non-UK readers may not realize the enormity of what has happened in the UK and what the lesson is for scientific publishers. The News of the World – a popular UK newspaper – broke the law repeatedly by phone-hacking of victims of crime. Public outrage exploded and within 24 hours a 150-year-old newspaper had ceased to be. That is the power of the masses – it is too rarely exercised – but when it happens it can be unstoppable. The “public” had existed in a cosy, if unpleasant, symbiosis with the publisher, eagerly demanding new salacious material and paying for it. But when the newspaper overstepped … a bang, not a whimper! There were no discussions, no slow decline. A week ago there were the usual rumblings, but no one predicted this – at least in public. The power of the crowd in a media-literate society is frighteningly rapid. The same fate can await complacency in STM.

That is the potential power that the scientific and academic community has over scholarly publishers. (In this post I am going to restrict discussion to serials publishers in STM). I’ll state the simple premise:

  • Unless the process of scientific publication is rapidly and effectively revised there will be a catastrophic crash. It will be unpredictable in its timing, speed and nature. It will destroy some of the current participants. It will change parts of the scientific process and will change academia.

I have no special knowledge so that’s a Cassandra-like statement (although I have no wish to play that role). I am surprised how few of my general colleagues (i.e. those outside the OKF) share my concerns about the state of STM publishing. They do not realise the dystopia we are already in and its apparently inexorable progress.

Before you switch off from this analysis, I intend to offer constructive dialogue to all parties. I know publishers read this blog (I was rung up yesterday by the Marketing Director of the RSC in response to yesterday’s blog). I wish, honestly and constructively, to analyse the benefits that STM publishers can provide. Some of them do provide good services to science, but I find it difficult to see value from many others. They have the chance, if they wish, to answer some (I hope) objective questions.

Similarly I have been critical of academic libraries, but do not see them as the cause. They should have alerted us earlier to problems instead of acquiescing to so much of the dystopia. They are part, but only part, of the solution.

I have therefore come, perhaps belatedly, to the conclusion that the crisis is of our (academia’s) making. I used to blame the publishers and I still can and will when appropriate. (The manufacture and sale of fake journals is inexcusable – as bad as Murdoch’s phone hacking). But the publishers are a symptom of our disease, not the cause. Cassius says:

Men at some time are masters of their fates:
The fault, dear Brutus, is not in our stars,
But in ourselves, that we are underlings.

The academic system (in which I include public funders) has, by default, given away a significant part of its decision-making to the publishing industry. (I use “industry” to include non-profits such as learned societies, and like all industries there are extremes of good and bad practices.) This gifting has been done gradually, over about 2 decades, without any conscious decisions by academia, and without – in the beginning – any conscious strategy from the publishers. The gifts have all been one-way – from academia to industry, which has grown in both wealth and power at the expense of academia. In effect academia has unconsciously stood by, dreaming, during the creation of a 10 billion USD industry, almost all of whose revenues come from academia, frequently to its detriment. Like Morbius in Forbidden Planet we have created our own monsters.

So I will start with some axioms, on which future posts may build. If we can all agree then this serves as a basis for future decision making.

  • Science and scientists have a need and a duty to publish their work.
  • Funders rightly and increasingly require this in a formal manner.
  • This work should be available to everyone on the planet. Ideally the costs incurred in doing so should be invisible to the reader.
  • The purpose of publication in whatever degree of formality is:
  1. To establish priority of the work
  2. To communicate the work to any who wishes to consume it
  3. To offer the work for formal and informal peer-review and to respond to discourse
  4. To allow the work to be repeated, especially for falsifiability
  5. To allow the work to be built on by others
  6. To preserve the work

I’d like to formalize this list – it’s a first draft and I want to make sure we haven’t omitted anything. I’d also like to know from any party, especially a publisher, if they disagree. There are publishers, for example, who believe that part of the process of publication is to restrict access.

I will say again; let us be careful because this rather enticing statement that everybody should be able to see everything could lead to chaos. Speak to people in the medical profession, and they will say the last thing they want are people who may have illnesses reading this information, marching into surgeries and asking things. We need to be careful with this very, very high-level information. (Dr John Jarvis, Senior Vice President, Europe, Managing Director, Wiley Europe Limited, examined by Ian Gibson’s Select Committee in the House of Commons, Westminster, UK, 2004-03-01)

I hope that 7 years have removed this attitude.

The historical purposes of publication did not include bibliometric evaluation of the publication as a means of assessing scientists or institutions. This is the monster we have allowed to be born and which we must now control. I do not believe it should be part of the formal reasons for publication. And if it retreats to informality we should take formal steps to control it.

So I’d be grateful for reactions, in the comments section. I will not edit and will attempt to keep comments objective.

PLoS One, Text-mining, Metrics and Bats

Friday, July 8th, 2011

Just heard that PLoS One was awarded Innovator of the Year by SPARC:

I applaud them personally as the 4 Pantonistas were given the same award last year for the Panton Principles.

So Lezan, collaborators at NaCTEM and I have published our first article in PLoS:

Using Workflows to Explore and Optimise Named Entity Recognition for Chemistry

BalaKrishna Kolluru, Lezan Hawizy, Peter Murray-Rust, Junichi Tsujii, Sophia Ananiadou

For those who don’t know, PLoS One publishes competent science. Not trendy science, not stuff that the marketeers think will sell the journal. Just plain competent science. Simply:

  • We have said we have done X,Y,Z
  • It is in scope for the journal (a lot of science is out of scope for PLoS One)
  • The referees agreed that we had done X,Y,Z competently. No “this isn’t interesting”, “not sufficient impact”.

To be honest it took an AWFUL long time to get it reviewed. SIX MONTHS (look at the dates) to get two referees’ opinions. I doubt this is specific to PLoS; it’s the fundamental problem of refereeing, to which there is no good answer.

Anyway, it has been out for a few weeks. What does the world think of it? Well, it has been out about 6 weeks and had 316 downloads. That’s exciting to young scientists. 300 people have clicked on their article. (Maybe they haven’t READ it, but at least it’s an indication.) And another of Lezan’s papers has got a “Highly accessed” in J. Cheminformatics:

475 Research article    
ChemicalTagger: A tool for semantic text-mining in chemistry
Lezan Hawizy, David M Jessop, Nico Adams, Peter Murray-Rust
Journal of Cheminformatics 2011, 3:17 (16 May 2011)

Well I am not a great fan of metrics of any sort. We have had 300,000 downloads of our software (official Microsoft figures) and we get zero credit. But at least we have a few hundred downloaders. So is 300 good? Impossible to say, but I’ll have a little fun with metrics:

Let’s go to PLoS on May 27 and see the other article downloads. They’re 512, 322, 511, 295, 458, 493, 398 … So Lezan and Bala are within the range. Good, competent, science. (Text-mining science cannot be trendy because if we actually try to do it we’ll be sued by the publishers for mining “their” content – it is deeply depressing to be prevented from doing science by lawyers).
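For anyone who wants to play the same game, here is a minimal sketch of that comparison in Python. It assumes only the figures quoted in this post (316 downloads for our paper, and the seven counts listed above, which are a truncated sample and not the full day's list). The function names are my own, invented for illustration.

```python
# Download counts quoted above for other PLoS One articles from May 27.
# The list in the post ends with "…", so this is only a partial sample.
other_downloads = [512, 322, 511, 295, 458, 493, 398]

# Figure quoted earlier in the post for our own paper.
our_downloads = 316


def within_range(value, sample):
    """True if value lies between the smallest and largest sample values."""
    return min(sample) <= value <= max(sample)


def rank_from_top(value, sample):
    """1-based position of value among the sample, highest count first."""
    return 1 + sum(1 for s in sample if s > value)


print("within range:", within_range(our_downloads, other_downloads))
print("rank:", rank_from_top(our_downloads, other_downloads),
      "of", len(other_downloads) + 1)
```

With so small a sample this says nothing more than "within the range", which is all I am claiming.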

So what’s the sort of access for a highly accessed article? Go to the site and look at “most viewed” and there are articles 1 week old with several thousand views. What’s the record? This one about bats:

It’s just under 2 years old and has had nearly 200,000 accesses. If it were a physical journal it would have fallen apart.

It’s had about 2 citations, which shows how stupid these metrics are.

“Download” it and see why it’s popular. You might even read it (I did, briefly).

But of course that will distort the metrics. Open access encourages people to READ articles. Whereas articles are actually only meant to be cited, not read.