Hargreaves and Information mining: I ask the American Chemical Society for freedom to mine factual data

Here is a letter I have sent to Madeleine Jacobs, CEO of the American Chemical Society (ACS) and former director of publications. In it I ask for freedom to extract factual information from the full text of ACS journals.

Henry Rzepa and I are joint recipients of a prestigious ACS award and are organizing a symposium in the Fall. We hope to be able to show what we have managed to do with extraction of factual data from full text. Here I ask Madeleine for assurance that we can do this without barriers from ACS.


Dear Madeleine,

Unfortunately we’ve not yet been able to meet, though our paths have crossed for several years. (I have copied in Dave Martinsen in ACS Publications, whom I have known for 20 years.)

You’ll know that I am this year’s recipient (joint with Henry Rzepa) of the Society’s CINF Division Herman Skolnik award. Part of the award is for our work in machine extraction of semantic chemical information (in Chemical Markup Language, CML) and its re-use for new scientific opportunities. As Skolnik medallists, Henry and I are organizing part of this year’s Fall CINF meeting and shall be demonstrating some of our achievements. In particular, we wish to show the great opportunities that semantic chemistry offers, especially the ability to use the factual information in the primary literature.

We are now in a position where we can extract factual chemical information from the full text of articles with high precision and recall. For example, our OPSIN name-to-structure tool (published last year in the Society’s J. Chem. Inf. Model. [1] and highly accessed) has an accuracy of > 99.5% and a recall of > 95%. The University of Cambridge is a subscriber to ACS journals and we would like to begin to extract information on a systematic basis for Open scientific research. We don’t need technical help or permission from the ACS. We have copied Cambridge University Library staff.

This mail is to ask for your assurance that (a) we can do this without legal/contractual barriers from ACS and (b) we shall not be cut off by ACS robots (unfortunately this happened some years ago even though we hadn’t violated anything). We wish to start immediately to show Hargreaves the benefit of information mining – they have a deadline of 2012-03-21, so we would like your agreement by 2012-03-15. All we require is:

YES: you may mine and publish factual information from ACS journals without additional payment and without restriction from legal and technical barriers.

I hope you can trust me to act responsibly in not violating copyright and in being considerate to your robots. I have set out more details and a non-exhaustive illustration of facts in /pmr/2012/03/04/information-mining-and-hargreaves-i-set-out-the-absolute-rights-for-readers-non-negotiable .

Unfortunately any other reply than YES by 2012-03-15 will be regarded as unacceptable for the purposes of Hargreaves.

You will note that we are also approaching other major publishers of chemistry. Alicia Wise, Director of Universal Access at Elsevier, has already publicly said we can mine their content for research and we’ll be publishing their factual data under an Open licence. As a result we should have a great opportunity to show the power of the semantic approach at the Fall Symposium.

And, of course, I would be delighted to meet you there!

Best wishes,


[1] http://pubs.acs.org/doi/abs/10.1021/ci100384d?journalCode=jcisd8
