I think the cheminformatics community is seeing the value of semantics in chemical editing, and has understood that even closed-source products have shown serious evolution in this area. JChemPaint also followed the semantic path for a while, but does not have the advantage of tight integration in a production-phase editing tool like Chem4Word has. With the current market share of Word, this editor will see a quick uptake and bring semantic chemical editing to a new audience, that of organic chemists. This is positive, and anything drawn in this tool will be semantic and interoperate with other tools. That is positive too, even if many of us, like me, will not use the editor at all.
I agree (although prediction of a “quick uptake” is an inexact science). He is also right that he will not use the tool directly. However, there are immediate spinoffs for the whole open chemistry community regardless of platform:
The system is modular. That means that it does not have to be used in Word (although obviously the benefits of creating a compound document will be absent). There is an essentially standalone tool allowing chemical manipulation of objects (relies on WPF/XAML and C#). There is also a library of routines (.NUMBO) which are independent of anything except the C# language. To what extent C# will be a help or a hindrance in the Open chemical world I don’t know.
The APIs have been designed to be largely platform and language independent. It’s difficult to write completely independent APIs (as for example CORBA IDLs) but the following signature is characteristic of the CID interface between the UI and the .NUMBO library:
public static bool CanFlipAboutExternalAcyclicBond(
The contextObject holds the complete state in CML, so a generic library (such as JUMBO) can implement these routines relatively easily. That means, inter alia, that the system can be used for batch processing of data without the need for graphics.
Many of the components are declarative (in various flavours of XML) and hence language-independent. Thus the primary CML validation in import is done using a CML XML Schema and a Schematron validator. This means that the process could be trivially ported to any other language or platform simply through standard XML APIs.
XML is platform independent (you do not have to worry about line-endings, blank space, etc.).
The CML-Lite schema has been thoroughly refactored and fairly well tested, so that we have a good, proven foundation for semantic chemistry.
And, above all, it will be Open. That means that the community will be able to contribute and benefit.
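To make the language-independence point concrete, here is a minimal sketch of a declarative-style check run over CML with nothing but a standard XML API and no graphics. It is written in Python rather than C# precisely because nothing in it depends on the platform. The sample molecule, the function name `dangling_bond_refs` and the rule itself (every bond must point at atoms that exist) are my own illustrations, not the actual CML-Lite schema or the Schematron rules used in Chem4Word.

```python
import xml.etree.ElementTree as ET

# CML namespace, bound to a prefix for ElementTree path queries.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Hypothetical minimal CML document (illustrative, not taken from Chem4Word).
METHANOL_CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="m1">
  <atomArray>
    <atom id="a1" elementType="C" hydrogenCount="3"/>
    <atom id="a2" elementType="O" hydrogenCount="1"/>
  </atomArray>
  <bondArray>
    <bond id="b1" atomRefs2="a1 a2" order="1"/>
  </bondArray>
</molecule>"""


def dangling_bond_refs(cml_text):
    """Return (bond id, atom ref) pairs whose atom ref matches no atom id."""
    root = ET.fromstring(cml_text)
    atom_ids = {atom.get("id") for atom in root.iterfind(".//cml:atom", CML_NS)}
    problems = []
    for bond in root.iterfind(".//cml:bond", CML_NS):
        for ref in bond.get("atomRefs2", "").split():
            if ref not in atom_ids:
                problems.append((bond.get("id"), ref))
    return problems


if __name__ == "__main__":
    print(dangling_bond_refs(METHANOL_CML))
```

The same check could be written as a Schematron assertion or ported to any language with an XML parser, which is the sense in which the validation is portable.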
How can people benefit and contribute if they do not use Microsoft technology? To the extent that the chemical architecture is language-independent we should be able to develop and refine the chemical algorithms and semantics independently of C#. At present we are hotly debating what is meant by “add a positive charge to an atom” – which I hinted at before. Think about the effect (i.e. what is the formula and electron count) of the following:
add a “+” to the N in (CH3)N
add a “+” to CH4
add a “-” to CH4
add a “-” to N=O
add a “-” to C6H6 (benzene)
add a “-” to Na
add a “-” to Na+
add a “-” to B in BH3
add a “-” to F in HF
Now consider what would happen if you had the option “add a radical” (often denoted by “.”).
I doubt very much whether the chemistry community agrees completely on the results, other than that each probably contains a “-” and/or “+” and/or “.” glyph somewhere. But if we do not know how many electrons there are, or what the spin multiplicity is, we cannot submit this to a QM calculation.
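The electron bookkeeping itself is simple once the interpretation is fixed; the hard part is agreeing on the interpretation. A minimal sketch, assuming a formula given as element counts plus a net charge (the function names and the small element table are mine, and the multiplicity is only the lowest value allowed by parity, not a prediction of the ground state):

```python
# Atomic numbers for the elements used in the examples above.
ATOMIC_NUMBER = {"H": 1, "B": 5, "C": 6, "N": 7, "O": 8, "F": 9, "Na": 11}


def electron_count(formula, charge):
    """Total electrons for a formula such as {"C": 1, "H": 4} and a net charge."""
    protons = sum(ATOMIC_NUMBER[element] * n for element, n in formula.items())
    return protons - charge


def lowest_multiplicity(n_electrons):
    """Lowest spin multiplicity allowed by parity: singlet (1) or doublet (2).

    This is an assumption, not a rule of chemistry: real ground states can
    have higher multiplicity, so a QM job still needs an explicit choice.
    """
    return 1 if n_electrons % 2 == 0 else 2


if __name__ == "__main__":
    # Two readings of 'add a "+" to CH4' give different inputs to a QM code.
    for label, formula in [("remove an electron", {"C": 1, "H": 4}),
                           ("add a proton", {"C": 1, "H": 5})]:
        n = electron_count(formula, charge=1)
        print(label, n, lowest_multiplicity(n))
```

Reading the “+” as electron removal gives 9 electrons and at least a doublet; reading it as protonation gives 10 electrons and a possible singlet. The glyph on the page is the same in both cases.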
For this reason I think the Open Chemistry community (and especially the Blue Obelisk community) can help systematize these declarative processes. My current position is that there are no universal valence rules and that there needs to be a separate set of rules for each element, each with its own special cases. I suspect that much of this is implicit, and perhaps explicit, in Openbabel, CDK, JUMBO, Avogadro and other Open software. If we can extract these into a set of rules that are declarative (i.e. not expressed in a specific procedural language) then we can start to get semantic consistency in our tools.
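As a rough sketch of what “declarative, per element” might look like, the rules can live in a plain data table and be applied by one small generic function. The table below holds only familiar textbook valences and is purely illustrative: it has not been extracted from Openbabel, CDK, JUMBO or Avogadro, and the names `VALENCE_RULES` and `implicit_hydrogens` are hypothetical.

```python
# Illustrative rule table: element -> formal charge -> usual number of bonds.
# Pure data, so it could equally be stored as XML or JSON and shared by tools.
VALENCE_RULES = {
    "B": {0: 3, -1: 4},
    "C": {0: 4, 1: 3, -1: 3},
    "N": {0: 3, 1: 4, -1: 2},
    "O": {0: 2, 1: 3, -1: 1},
    "F": {0: 1, -1: 0},
    "Na": {0: 1, 1: 0},
}


def implicit_hydrogens(element, charge, bond_order_sum):
    """Hydrogens needed to satisfy the table, or None if no rule fits.

    bond_order_sum counts the bonds already drawn on the atom, including
    bonds to explicit hydrogens. None means the table has no entry for this
    element and charge, or the atom already has more bonds than the rule
    allows; in both cases a special case has to be decided by people.
    """
    by_charge = VALENCE_RULES.get(element)
    if by_charge is None or charge not in by_charge:
        return None
    remaining = by_charge[charge] - bond_order_sum
    return remaining if remaining >= 0 else None


if __name__ == "__main__":
    print(implicit_hydrogens("N", 1, 3))    # "+" on N with three single bonds
    print(implicit_hydrogens("B", -1, 3))   # "-" on B in BH3
    print(implicit_hydrogens("F", -1, 1))   # "-" on F in HF
    print(implicit_hydrogens("Na", -1, 0))  # "-" on Na
```

Under these assumed rules the first two cases each gain one hydrogen, while the last two return None: the table says a fluoride carries no bonds, so HF with a “-” on F needs a decision about what happens to the H, and there is simply no entry for a negative sodium. Those gaps are exactly the special cases that need community agreement.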
Here are two more. What’s the result of deleting one =O atom from:
and are there any general rules?