Chemistry in MathML and CML – comments?

[warning – WordPress is not very math/chem friendly so forgive formatting]
Michael Kohlhase and I are trying to come up with a synthesis of MathML and CML for representing the numerical aspects of chemistry. By chance we have started with reaction rates – mainly because I found a thesis which is well suited for markup. It contained the equation:
rgas = k0·[Ester]+kKat·[Ester·Kat]
(Actually it contained it in PDF, which didn’t transcribe, but this is the essence.) So how do we encode it in MathML and CML?
At one level – presentational – it’s quite easy. MathML has symbols for all the symbols above and you simply pick them. They will allow pretty typesetting (which is important). The problem is that they don’t mean anything. What does “+” mean? It’s obvious to a chemist – we add two quantities. But to a mathematician it can mean lots of things. And, now you think of it, it also means several things to a chemist – such as a positive charge. Well it obviously doesn’t mean that here, does it? Could it be a positively charged Ester? Not really, because it’s not a superscript, because Esters aren’t usually charged, and because addition makes more sense. But these are chemical judgments. Chemists make them easily. Mathematicians might not.
Then there is the “·” – not a period/fullstop, but “middot”, a midheight dot. What does it mean? Well it’s obvious to a mathematician that it could mean multiply. So we have three multiplications and we could use the MathML “times” construct. But hang on – Ester times Kat doesn’t make chemical sense. Here it means “reaction complex” of Ester and Kat (I told you the thesis was in German and this is an abbreviation for Katalysator – catalyst in English). So the symbols by themselves are meaningless to a non-domain expert. And, unfortunately, our chemical journal-eating robots are not yet very expert in equations.
Take a minute to think about how you would explain the complete chemistry in this equation to a mathematician and how you would explain the complete mathematics to a chemist.
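To make the middot ambiguity concrete, here is a minimal Python sketch. The rule it uses (a middot inside square brackets joins species into a complex, a middot outside them is multiplication) is my own assumption for illustration and is not part of MathML or CML; the function name `classify_middots` is hypothetical. It encodes exactly the kind of chemical judgment a robot would have to be given.

```python
# Hypothetical heuristic for illustration only (not part of MathML or CML):
# decide what each middot means from its bracket context.
MIDDOT = "\u00b7"


def classify_middots(expr: str) -> list[tuple[int, str]]:
    """Return (position, meaning) for every middot in expr.

    Assumption: inside [...] a middot joins species into a complex;
    outside brackets it is ordinary multiplication.
    """
    depth = 0
    meanings = []
    for pos, ch in enumerate(expr):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == MIDDOT:
            meanings.append((pos, "complex" if depth > 0 else "times"))
    return meanings


# The rate expression from the thesis, written with explicit middots.
expr = f"rgas = k0{MIDDOT}[Ester]+kKat{MIDDOT}[Ester{MIDDOT}Kat]"
labels = [meaning for _, meaning in classify_middots(expr)]
print(labels)
```

A rule this crude would misfire on other notations (for example a hydrate written with a middot outside brackets), which is why the meaning needs to be stated explicitly in the markup and not guessed.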
You’ve probably come up with something reasonable. But now try to explain it to a machine. That’s what we have to do in Content-MathML and CML.
Michael has come up with the following semantic maths expression (I hope WordPress preserves it)
(no it didn’t)
Try again…
<math class=”display”>
<csymbol cd=”foundations” name=”rgas”/>
<csymbol cd=”constants” name=”O”/>
<apply xml:id=”esterconst”>
<csymbol cd=”foundations” name=”squarebrackets”/>
<csymbol cd=”cml” name=”Ester”/>
<csymbol cd=”rateconstants” name=”Kat”/>
<csymbol cd=”foundations” name=”squarebrackets”/>
<csymbol cd=”cml” name=”Ester”/>
<csymbol cd=”cml” name=”middot”/>
<csymbol cd=”cml” name=”Katstar”/>
(this is about as pretty as it gets)
So this has captured the semantics of the maths, but none of the chemistry. It states (roughly) that you multiply something by something and add it to something times something.
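The “something times something plus something times something” shape can be sketched as a tiny expression tree in Python. This is only an illustration of the idea, not Michael’s encoding: the classes `Sym` and `Apply`, the symbol names and the numeric bindings are all invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """An opaque symbol: the maths does not know what it stands for."""
    name: str


@dataclass(frozen=True)
class Apply:
    """An operator applied to arguments, like a Content MathML apply."""
    op: str      # "plus" or "times" in this sketch
    args: tuple


def evaluate(expr, bindings):
    """Evaluate the tree; fails with KeyError if a symbol has no value."""
    if isinstance(expr, Sym):
        return bindings[expr.name]
    values = [evaluate(arg, bindings) for arg in expr.args]
    if expr.op == "plus":
        return sum(values)
    if expr.op == "times":
        product = 1.0
        for value in values:
            product *= value
        return product
    raise ValueError(f"unknown operator: {expr.op}")


# Right-hand side of the rate expression: k0*c_Ester + kKat*c_EsterKat.
rhs = Apply("plus", (
    Apply("times", (Sym("k0"), Sym("c_Ester"))),
    Apply("times", (Sym("kKat"), Sym("c_EsterKat"))),
))

# Invented numbers, purely to show that evaluation needs the bindings.
bindings = {"k0": 0.5, "c_Ester": 2.0, "kKat": 4.0, "c_EsterKat": 0.25}
print(evaluate(rhs, bindings))
```

The tree evaluates happily with any numbers at all, which is the point: nothing in it says that `c_EsterKat` is the concentration of a reaction complex and not a product of two concentrations.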
The “cd” attributes refer to OpenMath (OM) content dictionaries – you can look up the meaning (and the semantics) of the object in a dictionary. So we could find out what rgas means in the foundations dictionary. Of course we still have to write the dictionary entry – and that isn’t easy – it’s the sort of thing that Andrew Walkingshaw has been developing Golem to help with. But we make progress.
The content MathML is a big advance – a machine could evaluate the expression if it knew what the somethings were. That’s where chemistry comes in. And, be warned, if you want a machine to evaluate the chemistry in the above equation it may be harder than it looks. To start you off, here is Wikipedia’s version of the rate equation (and if you don’t agree, please update WP, that’s how it works)…

Formal definition of reaction rate

According to IUPAC’s Gold Book definition[1] the reaction rate v (also r or R) for the general chemical reaction aA + bB → pP + qQ, occurring in a closed system under constant-volume conditions, without a build-up of reaction intermediates, is defined as:

v = - \frac{1}{a} \frac{d[A]}{dt} = - \frac{1}{b} \frac{d[B]}{dt} = \frac{1}{p} \frac{d[P]}{dt} = \frac{1}{q} \frac{d[Q]}{dt}

The IUPAC[1] recommends that the unit of time should always be the second. In such a case the rate of reaction differs from the rate of increase of concentration of a product P by a constant factor (the reciprocal of its stoichiometric number) and for a reactant A by minus the reciprocal of the stoichiometric number. Reaction rate usually has the units of mol dm⁻³ s⁻¹. It is important to bear in mind that the previous definition is only valid for a single reaction, in a closed system of constant volume.
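As a small worked illustration of the definition, the following Python sketch converts between the reaction rate v and the rate of change of each species using signed stoichiometric numbers (negative for reactants, positive for products). The reaction A + 3B → 2P and the value of d[P]/dt are invented for the example, and the function names are hypothetical.

```python
def reaction_rate(dc_dt: float, stoich_number: float) -> float:
    """v = (1/nu_i) * d[i]/dt, where nu_i is negative for a reactant."""
    return dc_dt / stoich_number


def species_rate(v: float, stoich_number: float) -> float:
    """d[i]/dt = nu_i * v, the inverse of reaction_rate."""
    return stoich_number * v


# Invented example reaction: A + 3B -> 2P (signed stoichiometric numbers).
stoich = {"A": -1, "B": -3, "P": 2}

# Assumed measurement: P is formed at 0.5 mol dm^-3 s^-1.
v = reaction_rate(0.5, stoich["P"])
rates = {species: species_rate(v, nu) for species, nu in stoich.items()}
print(v, rates)
```

Every species gives the same v once its own stoichiometric number is divided out, which is what lets a single number describe the rate of the whole reaction.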

First-order reactions

A first-order reaction depends on the concentration of only one reactant (a unimolecular reaction). Other reactants can be present, but each will be zero-order. The rate law for a first-order reaction is

r = k[A]

k is the first-order rate constant, which has units of 1/time.
If, and only if, this first-order reaction 1) occurs in a closed system, 2) there is no net build-up of intermediates and 3) there are no other reactions occurring, it can be shown by solving a mass balance for the system that

-\frac{1}{a}\frac{d[A]}{dt} = k[A]

where a is the stoichiometric coefficient of the species A.
The integrated first-order rate law is

\ln{[A]} = -akt + \ln{[A]_0}
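The multiplier a in the integrated law comes straight from separating variables in the differential form. A short worked derivation, assuming k and a are constant and that [A] = [A]_0 at t = 0 (the primed integration variables are just notation introduced here):

```latex
\begin{align*}
-\frac{1}{a}\frac{d[A]}{dt} &= k[A] \\
\frac{d[A]}{[A]} &= -ak\,dt \\
\int_{[A]_0}^{[A]} \frac{d[A]'}{[A]'} &= -ak \int_0^t dt' \\
\ln{[A]} - \ln{[A]_0} &= -akt \\
\ln{[A]} &= -akt + \ln{[A]_0}
\end{align*}
```

Dropping the 1/a in the first line gives the more familiar \ln{[A]} = -kt + \ln{[A]_0}, which is only correct when a = 1.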

That’s enough for me to post at present. Have you thought of everything? (I personally forgot the multiplier “a” in the last equation – it’s easy to do).

This entry was posted in chemistry, mkm2007, programming for scientists, XML.

2 Responses to Chemistry in MathML and CML – comments?

  1. Hannah Barjat says:

    Rate equations, as given above, are OK for textbook examples, but for others the rate expression may be relevant for certain conditions only, e.g. over a temperature range or pressure range. I know that content MathML has ways of dealing with conditions and with domains of application. However, without real-world examples to follow I’m unsure how to apply these.

  2. pm286 says:

    Thanks Hannah. Real-world examples in legacy form or text are usually the best way to start. So if you have some examples of what you would like to do, that would be great, along with an indication of how general or consistent the requirement is. It may be that it’s straightforward to turn it into CML, or there may have to be some new language features.
