CML – semantic representation of molecular structure

I have been asked by Rich Apodaca to show how the various styles of representing ferrocene are possible within CML. Let me stress that these are different connection tables which the community variously uses to represent a single compound. There is no way that a connection table, per se, can indicate that there are alternative ways of representing the same information. At one level it’s like expecting the equation

x = 1;

to indicate that it’s semantically equivalent to

x - 1 = 0;

which requires normalization and ontology.
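To make the point about normalization concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the two fragments are deliberately tiny and hypothetical (not full ferrocene), and `composition` and `bond_set` are made-up helper names, not part of any CML toolkit. It shows that two connection tables can differ as written and only look alike once reduced to something cruder, here the element counts.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by the CML examples in this post.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Hypothetical fragment: Fe bonded to one carbon only.
FRAGMENT_A = """
<molecule xmlns="http://www.xml-cml.org/schema">
  <atomArray>
    <atom id="a0" elementType="Fe"/>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="C"/>
  </atomArray>
  <bondArray>
    <bond id="a1_a2" atomRefs2="a1 a2"/>
    <bond id="a0_a1" atomRefs2="a0 a1"/>
  </bondArray>
</molecule>
"""

# Same atoms, but Fe is bonded to both carbons.
FRAGMENT_B = """
<molecule xmlns="http://www.xml-cml.org/schema">
  <atomArray>
    <atom id="a0" elementType="Fe"/>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="C"/>
  </atomArray>
  <bondArray>
    <bond id="a1_a2" atomRefs2="a1 a2"/>
    <bond id="a0_a1" atomRefs2="a0 a1"/>
    <bond id="a0_a2" atomRefs2="a0 a2"/>
  </bondArray>
</molecule>
"""


def composition(cml_text):
    """Count atoms by element type, ignoring all bonds."""
    root = ET.fromstring(cml_text.strip())
    return Counter(a.get("elementType") for a in root.iterfind(".//cml:atom", CML_NS))


def bond_set(cml_text):
    """Return the bonds as unordered atom-id pairs."""
    root = ET.fromstring(cml_text.strip())
    return {
        frozenset(b.get("atomRefs2").split())
        for b in root.iterfind(".//cml:bond", CML_NS)
    }


same_atoms = composition(FRAGMENT_A) == composition(FRAGMENT_B)
same_bonds = bond_set(FRAGMENT_A) == bond_set(FRAGMENT_B)
print("same composition:", same_atoms)
print("same bonds:", same_bonds)
```

Comparing compositions is of course far too weak a test to establish that two tables describe the same compound (isomers would pass it); it only illustrates that any equivalence has to come from a rule applied outside the connection table.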

So here is how we represent two more (valid) representations of ferrocene. The first is effectively cp-Fe-cp where single bonds are used to link the iron to particular atoms of the cp. We’ll remove the sub-molecule structure and add bonds…

<molecule id="mol123456789" title="ferrocene" xmlns='http://www.xml-cml.org/schema'>
  <formula concise="C 10 H 10 Fe 1" inline="Fe(C_5_H_5)_2_"/>
    <atomArray>
      <atom id="a0" elementType="Fe"/>
      <atom id="a1" elementType="C"/>
      <atom id="a2" elementType="C"/>
      <atom id="a3" elementType="C"/>
      <atom id="a4" elementType="C"/>
      <atom id="a5" elementType="C"/>
      <atom id="a6" elementType="H"/>
      <atom id="a7" elementType="H"/>
      <atom id="a8" elementType="H"/>
      <atom id="a9" elementType="H"/>
      <atom id="a10" elementType="H"/>
      <atom id="a11" elementType="C"/>
      <atom id="a12" elementType="C"/>
      <atom id="a13" elementType="C"/>
      <atom id="a14" elementType="C"/>
      <atom id="a15" elementType="C"/>
      <atom id="a16" elementType="H"/>
      <atom id="a17" elementType="H"/>
      <atom id="a18" elementType="H"/>
      <atom id="a19" elementType="H"/>
      <atom id="a20" elementType="H"/>
    </atomArray>
    <bondArray>
      <bond id="a1_a2" atomRefs2="a1 a2"/>
      <bond id="a2_a3" atomRefs2="a2 a3" order="D"/>
      <bond id="a3_a4" atomRefs2="a3 a4"/>
      <bond id="a4_a5" atomRefs2="a4 a5" order="D"/>
      <bond id="a5_a1" atomRefs2="a5 a1"/>
      <bond id="a1_a6" atomRefs2="a1 a6"/>
      <bond id="a2_a7" atomRefs2="a2 a7"/>
      <bond id="a3_a8" atomRefs2="a3 a8"/>
      <bond id="a4_a9" atomRefs2="a4 a9"/>
      <bond id="a5_a10" atomRefs2="a5 a10"/>
      <bond id="a11_a12" atomRefs2="a11 a12"/>
      <bond id="a12_a13" atomRefs2="a12 a13" order="D"/>
      <bond id="a13_a14" atomRefs2="a13 a14"/>
      <bond id="a14_a15" atomRefs2="a14 a15" order="D"/>
      <bond id="a15_a11" atomRefs2="a15 a11"/>
      <bond id="a11_a16" atomRefs2="a11 a16"/>
      <bond id="a12_a17" atomRefs2="a12 a17"/>
      <bond id="a13_a18" atomRefs2="a13 a18"/>
      <bond id="a14_a19" atomRefs2="a14 a19"/>
      <bond id="a15_a20" atomRefs2="a15 a20"/>
      <bond id="a0_a1" atomRefs2="a0 a1"/>
      <bond id="a0_a11" atomRefs2="a0 a11"/>
    </bondArray>
</molecule>

That’s fairly straightforward and here I have added some bond orders. I don’t terribly like doing this as it’s a rather meaningless retrofitting, especially when H atoms are explicit.
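One quick way to see what a connection table of this kind actually asserts is to list the neighbours of the metal atom. The following is a minimal Python sketch, assuming a cut-down, hypothetical fragment (one C–H unit from each ring rather than the full molecule) and an illustrative helper name, `neighbours`, which is not a function from any CML library.

```python
import xml.etree.ElementTree as ET

# Namespace used by the CML examples in this post.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Hypothetical cut-down fragment: Fe plus one C-H unit from each ring.
FRAGMENT = """
<molecule xmlns="http://www.xml-cml.org/schema">
  <atomArray>
    <atom id="a0" elementType="Fe"/>
    <atom id="a1" elementType="C"/>
    <atom id="a6" elementType="H"/>
    <atom id="a11" elementType="C"/>
    <atom id="a16" elementType="H"/>
  </atomArray>
  <bondArray>
    <bond id="a1_a6" atomRefs2="a1 a6"/>
    <bond id="a11_a16" atomRefs2="a11 a16"/>
    <bond id="a0_a1" atomRefs2="a0 a1"/>
    <bond id="a0_a11" atomRefs2="a0 a11"/>
  </bondArray>
</molecule>
"""


def neighbours(cml_text, atom_id):
    """Return (id, elementType) for every atom bonded to atom_id."""
    root = ET.fromstring(cml_text.strip())
    # Map each atom id to its element so neighbours can be labelled.
    element_of = {
        a.get("id"): a.get("elementType")
        for a in root.iterfind(".//cml:atom", CML_NS)
    }
    found = []
    for bond in root.iterfind(".//cml:bond", CML_NS):
        first, second = bond.get("atomRefs2").split()
        if first == atom_id:
            found.append((second, element_of[second]))
        elif second == atom_id:
            found.append((first, element_of[first]))
    return found


fe_neighbours = neighbours(FRAGMENT, "a0")
print(fe_neighbours)
```

Run against a full connection table, a listing like this makes it easy to spot a bond that points at the wrong atom, for example the iron attached to a hydrogen instead of a ring carbon.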

Here’s the approach using explicit bonds from Fe to all carbons (sketch (a)).

<molecule id="mol123456789" title="ferrocene" xmlns='http://www.xml-cml.org/schema'>
  <formula concise="C 10 H 10 Fe 1" inline="Fe(C_5_H_5)_2_"/>
    <atomArray>
      <atom id="a0" elementType="Fe"/>
      <atom id="a1" elementType="C"/>
      <atom id="a2" elementType="C"/>
      <atom id="a3" elementType="C"/>
      <atom id="a4" elementType="C"/>
      <atom id="a5" elementType="C"/>
      <atom id="a6" elementType="H"/>
      <atom id="a7" elementType="H"/>
      <atom id="a8" elementType="H"/>
      <atom id="a9" elementType="H"/>
      <atom id="a10" elementType="H"/>
      <atom id="a11" elementType="C"/>
      <atom id="a12" elementType="C"/>
      <atom id="a13" elementType="C"/>
      <atom id="a14" elementType="C"/>
      <atom id="a15" elementType="C"/>
      <atom id="a16" elementType="H"/>
      <atom id="a17" elementType="H"/>
      <atom id="a18" elementType="H"/>
      <atom id="a19" elementType="H"/>
      <atom id="a20" elementType="H"/>
    </atomArray>
    <bondArray>
      <bond id="a1_a2" atomRefs2="a1 a2"/>
      <bond id="a2_a3" atomRefs2="a2 a3"/>
      <bond id="a3_a4" atomRefs2="a3 a4"/>
      <bond id="a4_a5" atomRefs2="a4 a5"/>
      <bond id="a5_a1" atomRefs2="a5 a1"/>
      <bond id="a1_a6" atomRefs2="a1 a6"/>
      <bond id="a2_a7" atomRefs2="a2 a7"/>
      <bond id="a3_a8" atomRefs2="a3 a8"/>
      <bond id="a4_a9" atomRefs2="a4 a9"/>
      <bond id="a5_a10" atomRefs2="a5 a10"/>
      <bond id="a11_a12" atomRefs2="a11 a12"/>
      <bond id="a12_a13" atomRefs2="a12 a13"/>
      <bond id="a13_a14" atomRefs2="a13 a14"/>
      <bond id="a14_a15" atomRefs2="a14 a15"/>
      <bond id="a15_a11" atomRefs2="a15 a11"/>
      <bond id="a11_a16" atomRefs2="a11 a16"/>
      <bond id="a12_a17" atomRefs2="a12 a17"/>
      <bond id="a13_a18" atomRefs2="a13 a18"/>
      <bond id="a14_a19" atomRefs2="a14 a19"/>
      <bond id="a15_a20" atomRefs2="a15 a20"/>
      <bond id="a0_a1" atomRefs2="a0 a1"/>
      <bond id="a0_a2" atomRefs2="a0 a2"/>
      <bond id="a0_a3" atomRefs2="a0 a3"/>
      <bond id="a0_a4" atomRefs2="a0 a4"/>
      <bond id="a0_a5" atomRefs2="a0 a5"/>
      <bond id="a0_a11" atomRefs2="a0 a11"/>
      <bond id="a0_a12" atomRefs2="a0 a12"/>
      <bond id="a0_a13" atomRefs2="a0 a13"/>
      <bond id="a0_a14" atomRefs2="a0 a14"/>
      <bond id="a0_a15" atomRefs2="a0 a15"/>
    </bondArray>
</molecule>

The bonds are not, of course, 2-electron bonds – but we haven’t said they are – that’s the strength of the semantic approach. If we really wanted to indicate that each bond had 4 / 5 electrons, CML would allow us to do it – see next example.
