Why we need chemistry ontologies

Mat Todd is an example of the new generation of organic chemists who are concerned about the broader picture of information. Here’s a recent comment, which I address below:

Mat Todd says:

Peter, I think trying to pin down the exact nature of a substance and label it is important. I suspect it’s important because we need computers to be able to handle the data. But it reminds me of efforts to label vague concepts with names more generally. What is ‘British?’ How many hairs must I lose before I’m bald? At what wavelength does red become orange? To decide that such labels are important is half the battle.

Beyond the zeolite/clay examples above, there was an interesting episode in a recent synthesis of quinine from Robert Williams (10.1002/anie.200705421). To quote another site (http://tinyurl.com/dg8me8):

“following the old ways without the benefits of modern storage methods of reactive metals may have been critical in their success. Initially, their yield of quinine was very low. They suspected that the aluminium powder used as a reducing agent in the last step was the problem. It was too fresh! Leaving it in air for a short period leads to the formation of a coating of aluminium oxide. When the experiment was repeated with this powder, the yield matched that reported by Woodward and Doering.”

Even commercial reagents with the same labels can be a mixture of things in a time-dependent manner.

PMR: Exactly so.

And this is where ontologies come in. I’ve been keen on ontologies for over 10 years (and am credited with having used the term “ontological warfare”). I have been sceptical of using upper ontologies, as I thought they would be too general and too incompatible. However, Nico Adams has done a fantastic job of creating a broad and deep ontology (ChemAxiom) for mainstream chemistry, based on an upper ontology (BFO, the Basic Formal Ontology). I am deliberately not saying more here, as it’s his shout.

However, I can say that Nico addresses these problems: the changing nature of entities and concepts. The upper ontology is necessarily abstract and uses terms such as “Continuants” (things that persist through time) and “Occurrents” (processes that unfold in time), which can be applied to the decaying aluminium above: roughly, the powder is a continuant and its oxidation in air is an occurrent. I shall keep emphasizing that the reconciliation between variant names, structures, samples and substances can only be properly made through ontologies.
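To make the continuant/occurrent distinction concrete, here is a minimal Python sketch of the aluminium example. It is only an illustration under my own assumptions: the class names `Continuant` and `Occurrent` borrow the upper-ontology terms, but the fields, the `undergo` and `state` methods and the one-hour duration are hypothetical and are not taken from BFO or ChemAxiom.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrent:
    """A process that unfolds in time (hypothetical class, not BFO's own definition)."""
    name: str
    duration_hours: float  # arbitrary placeholder value, not a measured quantity


@dataclass
class Continuant:
    """An entity that persists through time while its properties may change."""
    label: str
    history: list = field(default_factory=list)

    def undergo(self, process: Occurrent) -> None:
        # Record that this entity took part in a process.
        self.history.append(process)

    def state(self) -> tuple:
        # The label alone does not identify the material; its history matters too.
        return (self.label, tuple(p.name for p in self.history))


# Two samples carrying the same reagent label.
fresh = Continuant("aluminium powder")
aged = Continuant("aluminium powder")
aged.undergo(Occurrent("air oxidation", duration_hours=1.0))

same_label = fresh.label == aged.label      # the labels agree
same_state = fresh.state() == aged.state()  # the recorded histories do not
```

The point of the sketch is only that two samples with identical labels stop being interchangeable once the processes they have taken part in are recorded alongside them.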

My own part is to create lower-level ontologies that harmonize with Nico’s. I’ve converted the CIF dictionary into an ontology which now acts to validate data instances, and created computational ontologies such as one for Gaussian (in our COST program). It’s clear that their time and their technology have arrived.

