Datuments and the ACS Style Guide

I was delighted to receive a special book yesterday:
“The ACS Style Guide”
Effective Communication of Scientific Information
Anne Coghill and Lorrin Garson.
OUP ISBN-13: 978-0-8412-3999-9
It’s an attractively produced hardback volume and I’m torn as to whether I should keep it as pristine as possible or cover it with annotations. I think I’ll do the latter!
The editors did Henry Rzepa and me the honour of inviting us to contribute a chapter on Markup Languages, which we have called:
“Markup Languages and the Datument”.
In the Foreword Madeleine Jacobs, Executive Director/CEO of ACS writes:
“I fell in love with chemistry when I was 13. I fell in love with writing at the age of four…” and
“The goal of the [guide] is to help authors and editors achieve […] ease and grace in all of their communications”
So the editors asked Henry and me to look ahead and write about style in an environment that is still building itself. Obviously we shall be out of date in some respects very soon, but we have tried to anticipate the closer linkage of machines and humans in science – epitomised by Tim Berners-Lee’s Semantic Web. The scientific publication of the future will be very different from what we produce now – the younger generation may soon abandon pen and paper and will expect instant multichannel information. Science has to react.
So as a first step Henry and I have coined the term “datument” [1] – a portmanteau of “document” and “data”. This is a single compound (or hyper-) document representing the complete experimental and scientific environment of the researcher or scholar. The first step is to integrate multiple markup languages (e.g. MathML, XHTML and SVG, plus CML, AnIML and ThermoML in chemistry). Each language has an intelligent browser or other user agent which can understand the appropriate part of the document. And this is not just about creating something visual – an equation might say “integrate me” – a molecule might say “I can give you my molecular weight and you can calculate my logP”. When we have rich clients such as Bioclipse (more later) we shall be able to let our machines read the boring bits of the paper while we concentrate on the more complex results. Already our group can read a datument and send it off to calculate additional properties of the molecules. This takes a few minutes, so the human can read the text while the machine enhances the data.
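To make the idea a little more concrete, here is a minimal sketch in Python of a program reading a tiny compound document: the XHTML paragraphs are left for the human, while the CML molecule “gives” its molecular weight. It is an illustration only, not the code our group uses. The sample document, the function names and the four-element mass table are all assumptions made up for this example, and the CML fragment is deliberately simplified.

```python
import xml.etree.ElementTree as ET

# Namespaces that tell a user agent which language each element belongs to.
XHTML_NS = "http://www.w3.org/1999/xhtml"
CML_NS = "http://www.xml-cml.org/schema"

# Illustrative, deliberately incomplete table of standard atomic masses.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# A hypothetical miniature datument: prose in XHTML, a molecule in CML.
DATUMENT = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:cml="http://www.xml-cml.org/schema">
  <body>
    <p>We prepared methanol as follows.</p>
    <cml:molecule id="methanol">
      <cml:atomArray>
        <cml:atom id="a1" elementType="C"/>
        <cml:atom id="a2" elementType="O"/>
        <cml:atom id="a3" elementType="H"/>
        <cml:atom id="a4" elementType="H"/>
        <cml:atom id="a5" elementType="H"/>
        <cml:atom id="a6" elementType="H"/>
      </cml:atomArray>
    </cml:molecule>
  </body>
</html>
"""


def molecular_weight(molecule):
    """Sum the atomic masses of every CML atom inside one molecule element."""
    atoms = molecule.iter(f"{{{CML_NS}}}atom")
    return sum(ATOMIC_MASS[atom.get("elementType")] for atom in atoms)


def read_datument(xml_text):
    """Split a datument into human-readable text and machine-derived data."""
    root = ET.fromstring(xml_text)
    # The XHTML part: paragraphs intended for the human reader.
    paragraphs = [p.text for p in root.iter(f"{{{XHTML_NS}}}p")]
    # The CML part: each molecule is asked for its molecular weight.
    weights = {
        mol.get("id"): molecular_weight(mol)
        for mol in root.iter(f"{{{CML_NS}}}molecule")
    }
    return paragraphs, weights


if __name__ == "__main__":
    paragraphs, weights = read_datument(DATUMENT)
    print(paragraphs)
    print(weights)
```

The point of the sketch is the dispatch on namespace: the same file serves both readers, and a real user agent would hand each namespaced fragment to software that understands that language rather than to a hand-written lookup table.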
The previous style guide was published in 1997 and our contribution will look very strange in 2015! I hope that some of the ideas still make sense in that brave future. I gently predict that the Style Guide then will look very different from the book today. But I shall still need to be able to “write on it”!
I’ve been invited to the ACS on Thursday next week and hope to be able to meet some of the other authors. I’ll be taking the guide as my reading on the plane.
[1] This works in IE. It used to work in Firefox, but the upgrades have broken it. Since the datument is on the publishers’ site there isn’t much we can do (though perhaps we should take a copy and mend it ourselves). It is so frustrating to have to fight the browsers every few months…

