In “One Day I’ll Have Lunch with Egon Willighagen Too…”, Chemspiderman wrote:
“So what can we do now to help making connections between papers and molecules? Peter Corbett, who works with Peter Murray Rust, is working on automated methods of getting computers to read chemistry papers and output semantic markup of them.”
AW> Over at ChemSpider we are working with Will Griffiths, who developed ChemRefer. We have already extracted tens of thousands of chemical names and will be linking them to ChemSpider structures so that Open Access papers become structure/substructure searchable. However, we’ve hit a bit of a hurdle. More details will follow shortly, but we have been asked to remove from the ChemRefer index thousands of articles that were indexed according to what we believe is standard search engine policy. During our conversation with the publisher today, converting chemical names to chemical structures to provide a structure-searchable index of the articles was deemed to be “re-purposing” of the Open Access articles, which is NOT allowed. Peter Corbett and Peter Murray Rust are engaged in similar activities, so they will likely run into the same challenges. If they manage to get around this issue with this and other publishers, they will be working in a “permissive” role in which they need to get permission from publishers to perform semantic markup. Their semantic markup is also “re-purposing”. The “permissive challenge” is a long way from Peter’s stance on Open Data for all.
By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.
Peter, I will not name the publisher at present because I made a commitment not to do so until we had a mutually agreeable blog posting for our users that accurately represents the conversation and agreements between us. I have an urge to co-exist in the world with publishers, since they put a lot of value into the world. With the changes going on in Open Access, figuring out how to co-exist is very necessary. I hope we can get the information out shortly. It is possible that we have misstepped, but it is more likely that there is an issue with the publisher’s spidering policy that the publisher needs to address.