Chemical Microformats arrived some time ago!

Egon writes and puts me to shame...

  1. Name: Egon Willighagen | E-mail: egon.willighagen@gmail.com | URI: http://chem-bla-ics.blogspot.com/ | IP: 134.95.200.25
     The use of microformats in chemistry has already begun: http://chem-bla-ics.blogspot.com/2006/12/including-smiles-cml-and-inchi-in.html

He suggested this over FOUR MONTHS ago! I probably missed it as I teleported out of the blogosphere at that time for 3 months to do some hacking. So sorry!

Anyway this is great. He writes:

Including SMILES, CAS and InChI in blogs

Including SMILES is much easier as it is plain text, and it has the advantage over InChI that it is much more readable. Chris wondered on the KinasePro blog how to tag SMILES, while Paul asked the same on ChemBark about CAS numbers.

Now, users of PostGenomic.com know how to add markup to their blogs so that PostGenomic indexes the literature, websites and conferences they discuss. Something similar is easily done for chemistry too, as I showed in Hacking InChI support into postgenomic.com (which was put on lower priority because I was finishing my PhD). PostGenomic.com basically uses microformats, which I blogged about just a few days ago in Chemo::Blogs #2, where I suggested the use of aspirin.

And this is how SMILES, CAS numbers and InChIs can be tagged on blogs. The <span> element is HTML markup that sets apart a run of related inline content, so that it can, among many other things, be formatted differently from the surrounding text. It can also be used to add semantics in a relatively cheap, but accepted, way. Microformats are formalized just by use, so whatever we, as chemistry bloggers, use will become the de facto standard. Here are my suggestions:

  • for SMILES: CCO
  • for CAS registry numbers: 50-00-0
  • for InChI: InChI=1/CH4/h1H4
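The markup around the three examples above is not visible in this copy of the post. As a rough sketch of the idea, assuming each identifier is wrapped in an HTML span whose class names its type, an indexer could pull the tagged values out of a blog post as shown below. The class names `smiles`, `casrn` and `inchi`, the sample sentence, and the helper names are illustrative assumptions, not necessarily the names Egon proposed.

```python
from html.parser import HTMLParser

# Hypothetical class names, chosen for illustration only.
CHEM_CLASSES = {"smiles", "casrn", "inchi"}


class ChemSpanParser(HTMLParser):
    """Collect the text of <span> elements whose class marks a chemical identifier."""

    def __init__(self):
        super().__init__()
        self.found = []    # (class_name, text) pairs, in document order
        self._stack = []   # one [matched_class_or_None, text_parts] entry per open <span>

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in CHEM_CLASSES), None)
        self._stack.append([match, []])

    def handle_data(self, data):
        # Text is attributed to the innermost open span, if that span is tagged.
        if self._stack and self._stack[-1][0]:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            kind, parts = self._stack.pop()
            if kind:
                self.found.append((kind, "".join(parts).strip()))


def extract_chem_spans(html):
    """Return every (class_name, text) pair tagged with one of CHEM_CLASSES."""
    parser = ChemSpanParser()
    parser.feed(html)
    parser.close()
    return parser.found


# A made-up blog sentence using the assumed markup.
microformat_post = (
    '<p>Ethanol is <span class="smiles">CCO</span>, formaldehyde has CAS number '
    '<span class="casrn">50-00-0</span>, and methane is '
    '<span class="inchi">InChI=1/CH4/h1H4</span>.</p>'
)

for kind, value in extract_chem_spans(microformat_post):
    print(kind, value)
```

Because the tagging lives in an ordinary class attribute, the post still renders normally in any browser, which is what makes this approach cheap for bloggers to adopt.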

The RDFa alternative

The future, however, might use RDFa over microformats, so here are the RDFa equivalents:

  • for SMILES: CCO
  • for CAS registry numbers: 50-00-0
  • for InChI: InChI=1/CH4/h1H4

These require you to declare the namespace xmlns:chem="http://www.blueobelisk.org/chemistryblogs/" somewhere in the page, though. The URI for this namespace still needs to be agreed on formally; Peter, would the Blue Obelisk be the platform to do this? BTW, this is more advanced, and currently does not have practical advantages over the use of microformats.
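To make the role of the namespace concrete, here is a minimal sketch of how a consumer might turn prefixed names into full URIs. It assumes the term is carried in a `property` attribute on a span and that the prefix is declared on an enclosing element; both are assumptions about the markup, which is not visible in this copy of the post. The local names `smiles`, `casrn` and `inchi` are likewise hypothetical, and for brevity the prefix is treated as global to the page rather than scoped to the declaring element.

```python
from html.parser import HTMLParser


class RdfaPropertyParser(HTMLParser):
    """Collect (expanded property URI, text) pairs from <span property="prefix:name">."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}   # prefix -> namespace URI, from xmlns:prefix attributes
        self.found = []      # (property_uri, text) pairs, in document order
        self._stack = []     # one [property_or_None, text_parts] entry per open <span>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Simplification: a declared prefix applies to the rest of the page.
        for name, value in attrs.items():
            if name.startswith("xmlns:") and value:
                self.prefixes[name[len("xmlns:"):]] = value
        if tag == "span":
            self._stack.append([attrs.get("property"), []])

    def handle_data(self, data):
        if self._stack and self._stack[-1][0]:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            prop, parts = self._stack.pop()
            if prop:
                self.found.append((self._expand(prop), "".join(parts).strip()))

    def _expand(self, curie):
        """Replace a known prefix with its namespace URI; leave unknown names as-is."""
        prefix, _, local = curie.partition(":")
        base = self.prefixes.get(prefix)
        return base + local if base else curie


def extract_rdfa_properties(html):
    parser = RdfaPropertyParser()
    parser.feed(html)
    parser.close()
    return parser.found


# A made-up fragment using the namespace URI quoted in the post.
rdfa_post = (
    '<div xmlns:chem="http://www.blueobelisk.org/chemistryblogs/">'
    'Ethanol is <span property="chem:smiles">CCO</span> and formaldehyde is '
    '<span property="chem:casrn">50-00-0</span>.'
    '</div>'
)

for uri, value in extract_rdfa_properties(rdfa_post):
    print(uri, value)
```

The expanded URIs are what distinguish this from the plain class-based tagging: two blogs using different prefixes but the same namespace would still produce identical property names.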

Talking with Dan Connolly, it seems that to make the best use of microformats we need to regularize the vocabulary - see the FOAF specification, for example. So - unless it has already been done and I have been sleeping - we should get this going on the BO Wiki.

This entry was posted in chemistry, semanticWeb.
