I was delighted to meet old friends from the MathML/OpenMath community last week at Mathematical Knowledge Management 2007 – Patrick Ion, Robert Miner, James Davenport and Michael Kohlhase (apologies to any I have omitted). OpenMath (1993) was one of the first non-textual markup languages and was based on SGML, while MathML came along later (1999). The languages are distinct but deliberately converging and (from WP):
OpenMath consists of the definition of “OpenMath Objects”, an abstract datatype for describing the logical structure of a mathematical formula, and the definition of “OpenMath Content Dictionaries”, or collections of names for mathematical concepts. The names available from the latter type of collections are specifically intended for use in extending MathML, and conversely, a basic set of such “Content Dictionaries” has been designed to be compatible with the small set of mathematical concepts defined in Content MathML, the non-presentational subset of MathML.
so I shall tend to use them interchangeably. Note, however, that MathML is an activity of the W3C, while (WP)
OpenMath has been developed in a long series of workshops and (mostly European) research projects that began in 1993 and continues through today.
MathML and CML have had a long history of association. We tend to present on the same platforms (e.g. NSF / NSDL Workshop on Scientific Markup Languages). Each has its particular growth points – they are accepted as formal means of scholarly publication by several major publishers and there are a variety of toolkits.
Here I want to emphasize that each is required not just in its own domain, but by neighbouring ones. Thus chemistry needs MathML, geology needs CML, etc. This requires a different mindset when developing tools – it isn’t necessary to address all the cutting-edge research in the mother subject, but it is important to make sure that you can solve a useful number of problems in everyday science and engineering.
As an example I asked the maths community whether I could search for a given differential equation, e.g.:
dx/dt = -k*x
You can, of course, type this directly into Google and get results like this but that only works when the variables are x and t. Thus
da/dt = -ka … or …
da/a + k dt = 0
or many other forms represent the same abstract equation.
So I was delighted to find that several people were actively working on this – it means we can search the world’s literature for given functional forms independently of how they are represented. It’s hard – in some cases very hard – and varies between countries. It’s similar to the chemist’s use of InChI (see Unofficial InChI FAQ) to normalize and canonicalize chemical structure (it doesn’t matter whether you write HCl or ClH – the InChI is the same). And Google is quite good at finding these forms.
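To make the easiest part of this concrete, here is a minimal sketch of how the variable-renaming problem might be handled once an equation is held as Content MathML rather than as text. It is not what any of the groups at the meeting actually do: the function names `decay_equation` and `canonical_form` and the `v0, v1, ...` naming scheme are my own assumptions for illustration. The sketch only deals with renamed variables; it does nothing for rearranged forms such as da/a + k dt = 0, which is where the hard work lies.

```python
import xml.etree.ElementTree as ET


def decay_equation(dependent, independent, constant):
    """Content MathML for d(dependent)/d(independent) = -constant * dependent.

    Hypothetical helper: it just fills identifier names into a fixed template.
    """
    return (
        "<apply><eq/>"
        f"<apply><diff/><bvar><ci>{independent}</ci></bvar>"
        f"<ci>{dependent}</ci></apply>"
        f"<apply><minus/><apply><times/><ci>{constant}</ci>"
        f"<ci>{dependent}</ci></apply></apply>"
        "</apply>"
    )


def canonical_form(mathml):
    """Rename every <ci> identifier to v0, v1, ... in order of first appearance."""
    root = ET.fromstring(mathml)
    names = {}
    for ci in root.iter("ci"):
        original = ci.text.strip()
        # Reuse the canonical name if this identifier has been seen already.
        ci.text = names.setdefault(original, f"v{len(names)}")
    return ET.tostring(root, encoding="unicode")


# dx/dt = -k*x and da/dt = -k*a reduce to the same string.
same = canonical_form(decay_equation("x", "t", "k")) == canonical_form(
    decay_equation("a", "t", "k")
)
print(same)  # True
```

A search engine could then index the canonical string instead of the text as typed. Swapping the operands of the product (x*k instead of k*x) already defeats this sketch, which gives some idea of why the general problem is hard.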
Even more fundamental is the use of dictionaries – OM has the content dictionaries and CML has CMLDict/Golem. They aren’t identical but close enough that it’s easy to convert between them. The dictionary concept is very powerful and allows languages to be extended almost indefinitely. It also allows different groups to develop their own systems – which may even be incompatible – you load in the appropriate dictionary. And the software is effectively written.
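As a rough illustration of why loading a dictionary means the software is effectively written, here is a small sketch. The symbol names `plus` and `times` are borrowed from OpenMath's `arith1` content dictionary, but everything else is assumed for the example: the nested-tuple representation, the `evaluate` function and the second, deliberately incompatible modulo-5 dictionary are hypothetical and are not part of OpenMath, CMLDict or Golem.

```python
import operator

# A dictionary maps symbol names to meanings; here a meaning is a Python function.
STANDARD = {"arith1": {"plus": operator.add, "times": operator.mul}}

# A second group's dictionary: same names, incompatible meaning (arithmetic mod 5).
MOD5 = {
    "arith1": {
        "plus": lambda a, b: (a + b) % 5,
        "times": lambda a, b: (a * b) % 5,
    }
}


def evaluate(expr, dictionaries):
    """Evaluate a nested tuple ((cd, name), arg1, arg2, ...) against the dictionaries."""
    if not isinstance(expr, tuple):
        return expr  # a plain number
    (cd, name), *args = expr
    func = dictionaries[cd][name]  # look the symbol up; no symbol is hard-coded
    return func(*(evaluate(arg, dictionaries) for arg in args))


# 2 + 3 * 4, written once, independently of any dictionary.
expr = (("arith1", "plus"), 2, (("arith1", "times"), 3, 4))

print(evaluate(expr, STANDARD))  # 14
print(evaluate(expr, MOD5))      # 4
```

The evaluator never mentions `plus` or `times`; extending the language is a matter of adding entries to a dictionary, not of changing the code.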
So there is now a strong bond between the MathML and CML communities. The mathematicians are starting to adopt the idea of blogging and social computing (chemistry has led the way here), while we shall adopt some of the formalities of OM in our representations of physical science.
We’re going to pursue the following (at least) and keep in touch through the blogosphere:
- mixed mathematics and chemistry (see next post)
- social computing, which could involve student projects, etc.
- combining forces in the advocacy of markup languages in scholarly scientific publications and the communal dissemination of data.
So – to show this isn’t just talk, MichaelK and I are starting to see how a “simple” formula in physical chemistry can be represented. We’ll show you shortly.