Information-mining: Discussions with Wiley

I’ve now heard from Duncan (sic) Campbell in Wiley [I include his email because he is currently the contact point for information mining]. I include our recent correspondence (Duncan in italics).

On Fri, Mar 9, 2012 at 4:16 PM, Campbell, Duncan – Oxford <dcampbell@wiley.com> wrote:

Peter

 

I’ve copied in my colleagues in the Cambridge University Library

Thanks for getting in touch. We would be happy to discuss your specific requirements for text-mining Wiley content, and how we can work with you to enable mining in a mutually-acceptable manner.


Excellent. You’ll appreciate that this is a matter of great public interest at present and an opportunity to show how helpful publishers are, so I’ll be posting the correspondence on my blog.

I don’t have specific requirements. I have the technology to extract facts from Wiley publications and do scientific research on them and I’d like to do that. In the first instance I’ll analyze which journals contain chemistry and extract all the chemical facts and then do research on them. Since the data are factual there is no question of copyright being violated.
As our group is the leading creator of Open Source information-mining software for chemistry and we are regarded as among the world’s experts I have a large number of collaborators. There are a large number of projects already but we add at least one a week so there’s no point in burdening you with the details. Here are just 5 to show you the power.

  • scanning the literature for potential antimalarial compounds (Mat Todd). We have to search for every compound as there is no golden rule for finding drugs against this killer disease
  • finding second harmonic generators for solar panels, leading to increased energy efficiency and greenness for the planet
  • Computing the human metabolome. Again we have to find all instances where compounds have been mentioned that might be human metabolites
  • Improving the eco-friendliness of chemical reactions. What solvents have been used in what reactions? Can we use solvents that are more friendly to the planet? Again we need to look at every reaction.
  • Improving the accuracy of computational chemistry. There are billions of dollars spent on trying to predict the structure of matter. We want to find every paper and find the most cost-effective methods

There are also many added benefits in scientific information-mining research itself where I am an acknowledged world expert (sorry to sound boastful, it’s just to assure you I know what I’m doing).

I’m not asking you to get involved in any of the technical details and we don’t need any special technology from the publisher, any special versions of the articles or any APIs. There is no need to involve CUL in details. All we need is:

  1. To download and analyze, using machines, papers from Wiley journals to which we have subscriptions (we use web-friendly crawling protocols)
  2. An assurance from Wiley that you will not impose technical and legal/contractual barriers.
  3. To be able to publish the data on which the science is based (science without data is almost worthless as you know)

We give you an assurance that we shan’t deliberately publish any copyright material such as the complete verbatim Version of Record.

 
 

We are keen to enhance the usage of our journal content by encouraging text and data mining, and welcome the opportunity to work on a specific project with you that would enable us to gain further experience in this area.  As you’ll appreciate, at this stage there are still questions around access, processing and distribution of the outputs of text mining, which Wiley, in common with most other STM publishers, is working through.

I look forward to hearing from you further.

 There is an urgency. We are keen to start some of these projects within a day or two as we want to present to the Hargreaves enquiry how valuable text-mining can be. We therefore only need from you an assurance that we can employ factual mining and to get into the report we’ll need this by 2012-03-14. I am afraid promises of intent are worthless at this stage. There is only one acceptable answer:

YES – you can go ahead without further permission from Wiley

Anything else, I’m afraid, will be a NO for Hargreaves.
