I have sent the following letter to Philip Campbell, editor of Nature. In it I request freedom to mine factual information without legal or technical barriers. We have worked closely with Timo Hannay (then of NPG and now of Digital Science, another Macmillan company in the same building). Digital Science has great interest in published information and (maybe) uses some of our toolkit such as OPSIN.
We are making a submission in response to the Hargreaves report and specifically about the freedom to extract and publish factual information from scientific publications. I have appreciated your cooperation in the past over the requirement to publish data that supports scientific research. I have copied Timo who, as you know, has supported our research here in developing semantic informatics, including tools for extraction. This involved a summer student and in-kind support for our Sciborg (EPSRC) project. You’ll know that two of our staff have since joined Timo’s Digital Science; and we are very proud to produce valuable human resources.
We are now in the position where we can extract factual chemical information from the full text of articles with high precision and recall (OPSIN accuracy is > 99.5% and recall > 95%) and with great speed and cost-effectiveness. The University of Cambridge is a subscriber to NPG journals and we would like to begin to extract information on a systematic basis for Open scientific research. We don’t need technical help or permission from NPG. We have copied Cambridge University Library staff.
This mail is to ask for your assurance that we can do this (a) without legal/contractual barriers from NPG and (b) without being cut off by NPG robots (unfortunately this happened some years ago). We wish to start immediately to show Hargreaves the benefit of information mining – they have a deadline of 2012-03-21, so we would like your agreement by 2012-03-15. All we require is:
YES: you may mine and publish factual information from the full text of NPG journals without additional payment and without restriction from legal and technical barriers.
I hope you can trust me to act responsibly in not violating copyright and in being considerate to your robots. I have set out more details and a non-exhaustive illustration of facts in /pmr/2012/03/04/information-mining-and-hargreaves-i-set-out-the-absolute-rights-for-readers-non-negotiable .
Unfortunately any other reply than YES by 2012-03-15 will be regarded as unacceptable for the purposes of Hargreaves.
You will note that we are also approaching other major publishers of chemistry. Elsevier has already publicly said we can mine their content for research and we’ll be publishing the facts under an Open licence.