Content-mining; Rights versus Licences

[I intend to follow with several more detailed posts.]

Last week was a critical point for those who regard the scholarly literature as a public good, rather than a business. Those who care must now speak out, for if they do not, we shall see a cloud descend over the digital century where we are disenfranchised and living in enclosures and walled gardens run by commercial mega-corporations.

Chris Hartgerink, a statistician at the University of Tilburg NL, was using machines to read scholarly literature to do research (“content-mining”). Elsevier, the mega-publisher, contacted the University and required them to stop Chris. The University complied with the publisher and Chris is now forbidden to do research using mining without Elsevier's permission.

Some reports include:

The issues are simple:

  • Chris has rightful access to the literature and can read it with his eyes.

  • His research is serious, valuable and competent.

  • Machines can save Chris weeks of time and prevent many types of error.

What Chris has been doing has been massively resisted by mainstream “TAPublishers” [1]. This includes:

  • lobbying against proposed legislation (often by trying to make it more restrictive).

  • producing FUD (“Fear, Uncertainty and Doubt”) aimed at politicians, libraries and researchers such as Chris. Note that “stealing” is now commonly used in TAPublisher-FUD.

  • physically preventing mining (e.g. through CAPTCHAs).

  • preventing mining through contractual or legal means (as with Chris).

Many of us met in The Hague last year to promote this type of new and valuable research, and wrote The Hague Declaration. A wide range of organisations and individuals, including universities, libraries, and liberal publishers, have signed. This is often represented by my phrase “The Right to Read is the Right to Mine”.

Many reformers, led initially by Neelie Kroes (European Commissioner till 2014) and now by Julia Reda (MEP), have pushed for reforms of copyright to allow and promote mining. The European Parliament and the Commission have produced in-depth proposals for liberalising European law.

The reality is that reformers and the TAPublishers have little common ground on mining. Reformers are campaigning for their Rights; TAPublishers are trying to prevent this, most often by proposing additional mining “Licences”. This is epitomised by the STMPublisher-lobbied “Licences for Europe”, proposed in 2013 in Commission discussions, which broke down completely because the reformers were not prepared to accept Licences.

The TAPublishers are trying to coerce the scholarly and wider community into accepting Licences; we are challenging this by asserting our Rights.

Unfettered Access to Knowledge is as important an issue in the Digital Century as food, land, water, and slavery have been over the millennia.

The issue for Chris and others is:

  • Can I read the literature I have access to

    1. in the way I want,

    2. for any legal purpose

    3. using machines when appropriate

    4. without asking for further permission

    5. or telling corporations who I am and what I am doing

    6. and publishing the results in the open literature without constraints?

Chris has the moral right to do 1-6, but not the legal right, because the TAPublishers have added restrictions to the subscription contracts, and his University has signed them. He is therefore (probably) bound by NL contract law.

In the UK the situation is somewhat better. Last year a copyright Exception was enacted which allows me to do much of this. (2) has to be for “non-commercial research” and (6) would only be permissible if I don't break copyright in publishing the results. So I can do something useful (although not nearly as much as I want to do, and as responsible science requires). I know also that I will have constant opposition from publishers, probably including lobbying of my institution.

European reformers are pushing for a similar legal right in Europe and many propose removing the “non-commercial” clause. There is MASSIVE opposition from publishers, primarily through lobbying, where key politicians and staff are constantly fed the publishers' story. There is no public forum (such as a UK Select Committee) where we can show the fallaciousness of TAPublisher arguments. (This is a major failing of European democracy – much of it happens in personal encounters with unelected representatives who have no formal responsibility to the people of Europe.) The fight – and it is a fight – is therefore hugely asymmetric. If we want to represent our views we have to travel to Brussels at our own expense – TAPublishers have literally billions.

The issue is RIGHTS (not APIs, not bandwidth, not cost, not FUD about server-load, not convenience)


I hope you feel that this is the time to take a stand.

What can we do?

Some immediate and cost-free tasks:

  • Sign the Hague Declaration. Very few European universities or libraries have so far done so

  • Write to your MEP. Feel free to take this mail as a basis, but personalise it

  • Write to Commissioner Oettinger (“Digital Single Market”)

  • Write to your University and your University Library. Use Freedom of Information to require that they reply. Challenge the current practice

  • Alert your learned society to the muzzling of science and scholarship.

  • Alert organizations who are campaigning for Rights in the Digital age.

  • Tweet this post, and push for retweets

And think about what ContentMining could do for you. And explore with us.

And what are PMR and colleagues going to do?

Because I have the legal right to mine the Cambridge subscription literature for non-commercial purposes, I and colleagues are going to do that. Ross Mounce and I have already shown that totally new insights are possible. We've developed a wide range of tools and we'll be working on our own research and also with the wider research community in areas that we can contribute to.

[1]. There is a spectrum of publishing, ranging from large conventional, often highly profitable, publishers through learned societies, to new liberal startups in the last 10 years. I shall use “TAPublisher” (TollAccess publisher) to refer to publishers such as (but not limited to) Elsevier, Wiley, Springer, Macmillan, Nature. They are represented by an association (STMPublishers Association) which effectively represents their interests and has been active in developing and promoting licences.
