petermr's blog

A Scientist and the Web


My Response to the UK parliament and BIS on Open Access; keep the CC-BY policy

I have responded to the UK’s BIS in its call for Open Access. Note that this takes a hell of a lot of energy – middle of the night in New Zealand after 10 days of commitment to materials Science Informatics. It’s incredibly draining of energy. Yet we have to keep going.

It’s appalling that academia shows so little interest in defending its digital rights and future. Much of the action comes from people outside mainstream academia while almost all Vice Chancellors and their senior staff are silent. If Universities *wanted* to take control of publishing they could. It’s their inaction which has got us in this mess.

It’s also draining that we have a continuous barrage of criticism of the RCUK policy from non-scientists. I’m all for multidomain debate but I’m getting tired of being told by non-scientists what scientists think and what’s best for them.

In my submission (below) I argue that the RCUK should be supported. They have been “attacked by the Green lobby because their policy is an unnecessary waste of money”. The problem is that we are in such a mess that no effective action can be taken unless we spend money or introduce a totalitarian economy. All current forms of publication with commercial publishers are broken.

Green is broken because it depends on libraries paying whatever subscriptions the publishers demand. Mandates and boycotts over the last 10 years have been unsuccessful. There is no end goal. There will be publishers like ACS who will self-destruct before they allow unpaid Green. There is no evidence that Green will lower total subscriptions – only a shortage of funding will do that.

Gold is broken because there is no proper market. Publishers can charge what they like. It also cannot become universal as so many sectors cannot afford APCs.

Hybrid has the worst of all worlds. It is certainly a waste of money.

The solution is for academics to change publishing and its values. Tim Gowers and other mathematicians are doing this – they will publish at cost in arXiv (< 10 USD) and overlay the reviewing and journals. That is technically possible in all subjects, except that most academics are not prepared to do it and want someone to pay someone else to do the hard work.

Anyway, here is my submission. I give thanks that the UK is taking a lead. I am more fearful of the restrictive practices being lobbied for in Brussels, where we rely on brave warriors such as Ross Mounce (PhD student) and Max Haussler (postdoc) to argue the case while academics do nothing. [I can't be there – I am in NZ].




From (Prof) Peter Murray-Rust

Department of Chemistry

University of Cambridge, CB2 1EW, UK

I address specifically your request 2:

Rights of use and re-use in relation to open access research publications, including the implications of Creative Commons ‘CC-BY’ licences;

I write as a recently retired but still highly active academic who for many years has been researching in chemical information. I have pioneered re-use of information by machines to discover and disseminate new science (simplistically a “Google for Chemistry”). For example, one of my students developed a system which was able to interpret 70% of 400,000 chemical reactions in published patents in 4 days. This leads to a vast amount of new machine-understandable resources – indeed much of the current chemical literature could be transformed within a few weeks on a single machine.

There is an obvious benefit to mining the formal scientific literature in this way. It is higher quality and technically more feasible. Over 3 years I have asked the major publishers repeatedly for permission to mine published content and have been refused or fobbed off in different ways. I have documented some of these vain efforts. In essence five years of my research have been stalled and I spend perhaps 30% of my time fighting publishers rather than doing science.

I have argued to the Hargreaves process that content-mining in chemistry is worth “low billions” worldwide (it is very difficult to quantify a non-activity). I am delighted that the IPO has agreed to the Hargreaves recommendations.

I am on the Science Advisory Board of Creative Commons. Their licences are a key tool – and CC-BY is precise and precisely what is required for content-mining. Please accept that no other current licence for documents achieves the purpose of asserting cleanly that a document can be legally re-used (CC-NC is completely unsuitable). Moreover CC-BY is machine-readable. My robots can determine uniquely that a document can be mined without sending me to court. This is not possible with non-standard licences.

Note that publishers frequently assert that they are “extremely helpful and agree to almost all content mining requests”. This is not my experience nor the experience of others I speak to. My experience is supported by Elsevier’s own assertion that they have only granted 4 requests a year over the last 5 years. They assert that “there is little demand”; my experience is that they are so uncooperative that most people don’t bother. Moreover each researcher has to do this for every publisher – scaling to tens of thousands of requests. For this reason we need a clear automatic legal instrument.

Assuming therefore that we agree that CC-BY is essential for automatic content mining, the subsidiary question is “is it worth paying for?”. There is a school of thought, almost all coming from scholars who do not practise science, that Gold CC-BY is a waste of taxpayers’ money. I concede that the current situation is deplorable – the result of complacency by universities and academics and irresponsible commercialism by publishers. We have a broken market where the only long-term solution is to transform publishers from masters of scientists into their servants. RCUK has an almost impossible problem and I think they have made a clear statement and should be strongly supported. I expect that their stance will change the balance between funders and publishers and that the costs of added Gold will drop dramatically over the coming years. By contrast Green wins nothing – we cannot mine the content and it sends signals to publishers that they can continue as usual.

Unlike some I do not feel that paying for dissemination is a waste of money. I have twice had RCUK grants specifically for dissemination (by other means) and these have been a very useful exercise for me, the University of Cambridge, and the UK. Assuming that RCUK generates a higher number of CC-BY papers these will become highly indexed by machines and thus much more highly seen and quoted. In a recent World meeting on Materials Science I highlighted open CC-BY papers in my plenary lecture to the exclusion of closed ones.

In conclusion I stress that this is a direct conflict, not a negotiation with the closed publishers. They have a 15 Billion industry and huge amounts of time and money to spend on lobbying. In contrast scientists have to divert themselves from useful activities to this constant fight against corporatism. Please, therefore, value our submissions to yourselves in this light. Note that even as I write the publishers are lobbying the EC for restrictive licences on content-mining and when I have finished this letter I have to contend in that arena as well.

JISC has shown that the benefits of open knowledge (I am on the advisory board of the Open Knowledge Foundation) will be very large. The UK has made a wonderful investment in the Open Data Institute – I am asking for permission to get chemical content to put in it.


Peter Murray-Rust


