UPDATE. I got feedback suggesting that part of principle 2 was inappropriate at this stage and I agree. So I have struck through parts in this post. There is merit in changing emphasis at such an early stage in the process. This document is subject to revision – that's part of the point of open discussion.
We – in the OKFN – have been spending some time on Etherpads and Skype putting together the principles of Open Content Mining. Yesterday we met on Skype and decided that we'd done enough to take this to the world and get feedback and enhancement. Naomi Lillie (OKFN) will post the full version later and Peter Suber will link to it. This blogpost is an introduction and I'll quote the central points.
[Images: Kinder Scout (from Wikipedia); Fuaigh Mòr (from Wikipedia)]
Let's start with another historic area of rights – the right to roam. This is a 20th C movement in many countries to assert that everyone has access to land, whether or not it is privately owned. It's a good analogy. The fundamental ownership of land is critical, political and often poorly defined. In the 18th and 19th centuries Scotland suffered the Highland Clearances (http://en.wikipedia.org/wiki/Highland_Clearances), in which the residents of the land were thrown out (killed or forced to emigrate) and the lands "improved" with sheep; those lands now "belong" to landlords. But there is a traditional right of access to these lands regardless of actual "ownership". Wikipedia (http://en.wikipedia.org/wiki/Freedom_to_roam) says:
The freedom to roam, or everyman's right is the general public's right to access certain public or privately owned land for recreation and exercise. The right is sometimes called the right of public access to the wilderness or the right to roam.
Not everyone shares the same view as to what these rights are or even whether they exist. I have been thrown off Scottish land by a gamekeeper with a shotgun, even where there was a legal right. But just because not everyone agrees on the rights doesn't mean they don't exist.
So we believe that there is a right to mine the scientific literature and we have expressed this as:
The right to read is the right to mine.
That's our assertion of the fundamental rights. In the 20th Century the people asserted their right to roam. We are asserting the people's right to mine. This is a simple political statement – like "everyone has a right to a fair trial". Because the publishers[*] – like the 19th C landowners – dispute this right, we have to fight for it. The UK has had a series of fights for rights including freedom of speech, trial by jury, freedom from slavery, etc. Sometimes people went to jail, sometimes they died for these.
But we must fight. An extremely relevant example is the mass trespass at Kinder Scout (http://en.wikipedia.org/wiki/Mass_trespass_of_Kinder_Scout). Wikipedia says:
The mass trespass of Kinder Scout was a notable act of willful trespass by ramblers. It was undertaken at Kinder Scout, in the Peak District of Derbyshire, England, on 24 April 1932, to highlight that walkers in England and Wales were denied access to areas of open country. Political and conservation activist Benny Rothman was one of the principal leaders.
The trespass proceeded via William Clough to the plateau of Kinder Scout, where there were violent scuffles with gamekeepers. The ramblers were able to reach their destination and meet with another group. On the return, five ramblers were arrested, with another detained earlier. Trespass was not, and still is not, a criminal offence in any part of Britain, but some would receive jail sentences of two to six months for offences relating to violence against the keepers.
The mass trespass marked the beginning of a media campaign by The Ramblers Association, culminating in the Countryside and Rights of Way Act 2000, which legislates rights to walk on mapped access land. The introduction of this Act was a key promise in the manifesto which brought New Labour to power in 1997.
So it's a long struggle. Am I suggesting a Mass Trespass of publishers? That may depend on readers. But the same tensions are there as 80 years ago – an unjust control of access and the need to change the system by breaking the law. And we have a long tradition of noble lawbreaking – often it is the only way that we change minds and therefore laws. There is usually a debate as to whether change should come by legal means or – in today's language – "occupying" and "pirate" action.
So I repeat:
The right to read is the right to mine.
This isn't a negotiated position. It's not a summary of current practice. It's a statement of a fundamental right that we must fight for.
Yesterday we agreed that we could not at this stage list the "how" of Open Content Mining (OCM). That comes later. It will probably be filled with subjunctive clauses – this is a difficult and complex area. The right to roam has to yield to national security and rare species. It may or may not have to yield to personal privacy – a difficult area. So the right to mine will have to take account of the current law and decide what can be done within it or what needs changing (e.g. Hargreaves). It may require a definition of "fact". It may require cases. It could take some time. But that does not mean we cannot NOW assert the right.
So here's the core of the principles. We'd welcome others being involved. But I repeat, this is not a negotiation – it's drafting something we expect to stand for decades or longer. Much of it needs commentary and redrafting – particularly IMO section 2. We don't want to rush these principles, but we do wish to kickstart the process.
Principle 1: Right of Legitimate Accessors to Mine
We assert that there is no legal, ethical or moral reason to refuse to allow legitimate accessors of research content (OA or otherwise) to use machines to analyse the published output of the research community. Researchers expect to access and process the full content of the research literature with their computer programs and should be able to use their machines as they use their eyes.
- The right to read is the right to mine.
Principle 2: Lightweight Processing Terms and Conditions
Mining by legitimate subscribers should not be prohibited by contractual or other legal barriers. Publishers should add clarifying language in subscription agreements that content is available for information mining by download or by remote access. Where access is through researcher-provided tools, no further cost should be required. The right to crawl is not the right to use a publisher's API for free; however, when access is through publisher-supplied programmatic interfaces, the fees should be transparent and charged per API call. Processing by subscribers should be conducted within community norms of responsible behaviour in the electronic age.
- Users and providers should encourage machine processing.
Immediate feedback suggested deleting part of this section and I agree.
Principle 3: Use
Researchers can and will publish facts and excerpts which they discover by reading and processing documents. They expect to disseminate aggregate statistical results as facts and context text as fair use excerpts, openly and with no restrictions other than attribution. Publisher efforts to claim rights in the results of mining further retard the advancement of science by making those results less available to the research community. Such claims should be prohibited.
- Facts don't belong to anyone.