petermr's blog

A Scientist and the Web

 

Conspiracy and chemistry and an invitation to lunch

Antony Williams (Chemspider) and Stuart Cantrill (Nature) have recently blogged about what the blogosphere is seeing as censorship on the Web by the American Chemical Society. This is a bold and serious claim and needs some background. The facts, from Stuart:

One story that caught our eye was from Outsell, which, in their words, ‘is the only research and advisory firm focused on the publishing, information, and education industries’. The article was entitled ‘Chemical Bonding InChI by InChI’ and it offered an analysis of how certain publishers were making use of InChIs – those of you unfamiliar with InChIs can go here for a primer.

Daniel Pollock at Outsell had published an article on March 30th 2009 entitled “Chemical Bonding InChI by InChI”. He discussed the InChI Resolver and the efforts to raise enthusiasm for the InChI. He also discussed the efforts of both Nature Publishing Group and the Royal Society of Chemistry to proliferate the use of InChIs.
[...]

The article then moves on to consider whether CAS (Chemical Abstracts Service), which is owned by the ACS, will also embrace InChIs. The conclusion was that we may have to wait a while for that to happen.

So why do you need to know this? Well, the story from Outsell has been withdrawn (on April 8th) – and more than that, in fact, it has been removed from their archives (although the original story is cached on Google and you can find it here).

Whether it is right to completely remove every trace of a story that you withdraw is a discussion for another day – but now all that remains is a brief notice indicating that the original story did not hold up to Outsell’s internal standards.

Outsell now say that the original article wasn’t balanced and that the ‘tone of the piece could be taken to single out CAS as being late in responding to the trends’. Surely readers could make that judgement for themselves?

The great shame is that the whole article has simply been removed and an analysis of how cross-publisher development on an important topic such as the InChI – which may have a significant role to play in chemistry publishing – has been lost.

Antony uses stronger language and speculation (Conspiracy Theories and InChIs – Why was the Article Removed? – there’s much more that is worth reading):

Conspiracy theories are already moving around the community. The majority of people I have discussed this with believe that the retraction was likely forced by CAS

PMR: In short, the best guess is that CAS see InChIs as a threat (I’ll discuss the foolishness of this below) and that they put pressure on Outsell to retract. I don’t know under what auspices Daniel writes for Outsell (employee, invited contributor, etc.), but Outsell have the right to moderate what is published on their site. They may feel that Daniel’s article detracted from their brand; I take the opposite view – that the article was well written and that the retraction has done Outsell damage. (Contrast a forward-looking company like Talis, whose Panlibus blog, written by employees, is a major enhancement.) The episode emphasizes the problem of employees publishing under their company name, and I have empathy for Daniel.

The retraction seems to be typical of the knee-jerk reaction that CAS applies to anything that could conceivably be seen as challenging their monopoly. For example, last year Wikipedia volunteers started checking CAS numbers for correctness, and CAS’s first reaction was to tell them that they were in breach of contract. Not “we are glad to see quality applied to chemistry” or “glad to see responsible use of unique identifiers”. After the natural blogosphere outrage (including this blog) CAS relented. I doubt they will relent this time.

It’s difficult to know what the reality is – but there are too many stories about clandestine and lobbying practices at ACS to ignore them. PRISM, the ACS mole, the constant lobbying, etc. ACS frequently resort to legal action (e.g. against Google) and I suspect there was a phone call from lawyers. We’ll probably never know.

Does this protect CAS’s monopoly? No, quite the reverse. It makes them look foolish, out of touch, and ultra-monopolistic. They have a huge turnover, and a monopoly of complete chemical information so they are immune from competition, right? Wrong.

Here is the UK Guardian newspaper recounting the demise of Encyclopedia Britannica (which I estimate has similar turnover to CAS):

By 1990 sales revenues had reached $650m.

Yet within five years, EB underwent a near-death experience. What almost killed it was a product that most of its executives regarded as a joke, an encyclopedia on CD-Rom launched by Microsoft and called Encarta. The original content was licensed from an outfit with the Dickensian name of Funk and Wagnalls, and some of it gave trash a bad name. So Microsoft spruced it up, added multimedia content and made it easy to use. To the astonishment of EB’s board, this meretricious object triggered a precipitous decline in sales of their gold-standard product.

Faced with catastrophe, the Benton foundation put EB up for sale. It took 18 months to find a buyer, a Swiss billionaire named Jacob Safra who bought the company for half its book value. The story of Britannica is now a business-school case study in how rapidly competitors can emerge – apparently from nowhere – in a digital world. The First Rule of Business nowadays is that somewhere out there someone (and not just Google) is incubating a business plan that is based on eating your lunch.

So where are the lunch-eaters coming from? Surely we cannot recreate a database of 30 million compounds? No, we can’t – we can create something much better – Linked Open Chemistry. It won’t come from a single source but from all the Open chemistry efforts that have grown over the last 1-3 years. They include Chemspider, Pubchem, ChEBI, Wikipedia, The Blue Obelisk, CrystalEye, Open Notebook Science and a number of others. None have found the ACS as a body receptive to the new wave of chemistry. They are bringing to the lunch table:

  • Openness
  • Re-use and sharing
  • Immediacy
  • Innovation
  • Linkedness
  • Semantics
  • Quality Control
  • Community

Not all of these are fully developed but they are part of the Linked Data of the future and they can grow quickly. CAS’s actions and perceived stance are uniting them in a common effort to make chemical data free. Antony and I are meeting at the end of this month and we’ll be seeing how our offerings fit together. Yes, we’ve had differences, but these have helped to re-orient both of us and we now have at least a common goal of liberating chemistry.

There are some simple approaches which can revolutionize the way chemistry is captured and aggregated. Our own approach, semantic publishing (e.g. Chem4Word), means that the tacky business of text- and image-mining could disappear. Yes, it needs a culture change in chemistry, but that is looking likelier all the time. Meanwhile, high prices and restrictive practices will only serve to drive more people (including those outside chemistry) to create Linked Open Chemical Data.

After all it’s OUR DATA.
