#animalgarden Bottom-up Ontologies in Physical Science

On Thursday (2013-04-11) I was invited by Fiona McNeill to give a 5-minute talk on ontologies at Edinburgh (http://dream.inf.ed.ac.uk/events/ukont-13/2013_workshop_program.html ). The workshop aims included:

Amongst other areas of interest, there will be a particular focus on creating and using open data. The program and audience is intentionally very diverse; the aim is to cover areas from many disciplines. We are particularly interested in bringing together those creating and developing the technology with those using the technology in industry, government and public organisations.

A short talk requires special preparation. No point in trying to prove theorems in first-order logic. In fact I argue that this is far too complicated and unnecessary for physical science. So #animalgarden offered to make a presentation. (They didn’t have time to have a proper shoot so they have re-used old slides and there’s no music yet). The slides are at http://www.slideshare.net/petermurrayrust/ontologies-in-physical-science – there are a few snapshots here. (Conventional chemists can read the words – which are deadly serious – and ignore the animals L )

The problem is that much of physical science doesn’t even use common identifiers or vocabularies. So the problems are people-problems, not technical ones.

There are a very few chemical ontologies but few people use them and this is even more problematic in materials science. This domain is probably the easiest of all sciences to create ontologies for but paradoxically it hasn’t happened. Crystallography (www.iucr.org/cif) is a shining exception but computational chemistry has nothing.

So a number of us are joining together to create “bottom-up ontologies”. Firstly small coherent group systematize the description of what they do in semantic form. Computational chemistry is particularly well suited to this – the programs (codes) have implicit semantics (because the code works and gives the right answers)! Then the community looks at the resultant collection of ontologies and systematizes them where they have the same concepts. In these cases there is a common entry in a communal ontology.

When this isn’t possible the ontologies create machine-readable conventions.

But few computational codes have explicit ontologies. Some define a few of the terms in their manuals, but they aren’t linked to the programs. We’ve developed Chemical Markup Language a which does exactly this. Each code (NWChem, Hyperchem, DLPOLY…) creates their own ontology using a common syntax (CML) but their own identifiers.

There are immediate benefits – the program output becomes semantic and can be re-used for analysis, aggregation, etc. If two groups have ontologies they compare notes and create a toplevel dictionary. As more groups join, the top-level dictionary gains more knowledge and acceptance from the community. And everyone has a feeling of ownership.

We are delighted that Hyperchem http://www.hyper.com/ have recently offered to join in the communal effort. See http://blogs.ch.cam.ac.uk/pmr/2011/11/02/searchable-semantic-compchem-data-quixote-chempound-fox-and-jumbo/ for an overview of the collaboration with PNNL. And http://blogs.ch.cam.ac.uk/pmr/2013/02/03/topics-and-links-for-my-talk-on-semantic-web-for-materials/ for work with CSIRO. And some idea of the great contribution from Kitware http://blogs.ch.cam.ac.uk/pmr/2013/03/01/liberation-software/

The slides are CC-BY. I need to add this.

This entry was posted in Uncategorized. Bookmark the permalink.

One Response to #animalgarden Bottom-up Ontologies in Physical Science

  1. Caroline says:

    Hi!
    This is a great initiative. I’m moving to Cambridge before the end of the year (I need to finish my thesis first), and if the project is still ongoing, I would be interested to meet up and see how this works, and to see whether I can chip in in some way.
    Best wishes,
    Caroline

Leave a Reply

Your email address will not be published. Required fields are marked *