Joerg Wegner posts (Blogging chemistry means not blogging minable data) :
As posted by Peter more and more chemists are blogging. And I would appreciate if those blogs would contain more chemical minable information. I think especially Rich and Egon have given some nice examples on their blogs. And beside of those blogs Wiki’s can also use chemical information (and not images!).
Anyway, those blogs are just great:
And here are some ideas what is missing on those blogs:
- tags based on reaction names (with cross-links), reagents, products, …
- chemical data in a native and downloadable/minable form, e.g. CML or InChI. Preferably all those data entries should have unique identifiers and tagged reaction centers! So, I prefer here rather CML than InChI.
- more literature references using DOI or PMID
I obviously and absolutely agree. The question is what to do about it. I have been struggling with simple code on this blog (I have now cracked it), but even loading images isn’t absolutely trivial: you have to upload from a directory and then upload into the blog, whereas we would really like to cut and paste.
The commercial chemical tools are not much help. Apart from costing money, they are designed to be integrated into Word documents using (I think) ActiveX. Moreover, they aren’t designed to be semantic. In the Blue Obelisk we are developing tools which are XML-CML aware and which – ultimately – will be the simple solution to this. The main challenges are:
- discover, develop and enhance semantic wikis and blogs. Please post anything you know that actually works.
- find simple (almost automatic) ways of embedding InChIs and CML invisibly in text. I think this would be relatively simple if we can agree on a method.
- enhance the Blue Obelisk tools (especially CDK, JOELib and JChempaint) to provide simple chemical services for Wiki/blogs.
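To make the second challenge concrete, here is a minimal sketch of one way an InChI could ride invisibly along with ordinary blog text. It is only an illustration: the `data-inchi` attribute, the `chem` class and the helper names `embed_inchi` and `extract_inchis` are my assumptions, not an agreed convention, and the ethanol InChI is just sample data.

```python
from html import escape
from html.parser import HTMLParser


def embed_inchi(label: str, inchi: str) -> str:
    """Wrap the visible text in a span whose attribute carries the InChI.

    The attribute name "data-inchi" is a hypothetical convention.
    """
    return '<span class="chem" data-inchi="{}">{}</span>'.format(
        escape(inchi, quote=True), escape(label)
    )


class InchiExtractor(HTMLParser):
    """Collect every data-inchi attribute value found in a page."""

    def __init__(self):
        super().__init__()
        self.inchis = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-inchi" and value:
                self.inchis.append(value)


def extract_inchis(html_text: str) -> list:
    """What a mining robot might do: pull the hidden InChIs back out."""
    parser = InchiExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.inchis


# The reader sees only the word "ethanol"; a robot can recover the identifier.
post = (
    "We dissolved the product in "
    + embed_inchi("ethanol", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
    + "."
)
print(extract_inchis(post))
```

The point of the sketch is that the author types (or a tool generates) one extra attribute, the human-readable text is unchanged, and anything that can parse HTML can mine the chemistry.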
I came up with a novel way of doing this – what do you think? Since everything on blogs is public, we could use a communal authoring service. Let’s say we have a server – and this is something I’ll put to my colleagues – that provides a graphical authoring interface for semantic chemistry. (This would include reactions as well as compounds.) You would create the molecules you want (and we’ve got some simple ideas for accelerated graphics) on the server. It would create all the InChI and CML transparently. It would give you two URLs:
- An image of the chemical object (like the GIF we have at present). You could either link to this or cut and paste it or both.
- A link to semantic chemistry stored on the server. Since all our work is public and I think most of us use Creative Commons there is no loss of IPR – quite the reverse. Clicking on the link could bring up Jmol, JChempaint or whatever, without any need to add client-side functionality.
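A rough sketch of the bookkeeping such a server might do is below. Everything in it is hypothetical: the `ChemRepository` class, the `chem.example.org` address, the URL layout and the idea of keying entries on a hash of the InChI are assumptions for illustration, and the CML string is a placeholder rather than a complete document. No such service exists yet.

```python
import hashlib

BASE_URL = "https://chem.example.org"  # hypothetical server address


class ChemRepository:
    """In-memory stand-in for the communal store of drawn molecules."""

    def __init__(self):
        self._entries = {}

    def deposit(self, inchi: str, cml: str) -> dict:
        """Store a molecule and return the two URLs the author would get."""
        # Hashing the InChI means the same molecule always maps to the same key.
        key = hashlib.sha1(inchi.encode("utf-8")).hexdigest()[:12]
        self._entries.setdefault(key, {"inchi": inchi, "cml": cml})
        return {
            "image": f"{BASE_URL}/image/{key}.png",
            "semantic": f"{BASE_URL}/molecule/{key}",
        }

    def lookup(self, key: str) -> dict:
        """Return the stored semantic record for a key, or raise KeyError."""
        return self._entries[key]


repo = ChemRepository()
urls = repo.deposit("InChI=1S/CH4/h1H4", "<molecule id='methane'/>")  # placeholder CML
print(urls["image"])
print(urls["semantic"])
```

Because the key is derived from the InChI, two bloggers who draw the same molecule end up pointing at the same entry, which is what would make the repository communal.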
Note that if the server is unavailable you would still have the local PNG in the normal way. There are many advantages:
- there would be a communal repository for any molecule. This means that simply by linking to (say) ciclosporin it would give you a template (or many templates) which the community had already drawn.
- The molecule could be linked to other sources such as Wikipedia, Pubchem, etc. Conversely they could link to this resource. We build a communal knowledge base.
- The server can provide services (e.g. logP from CDK or JOELib).
- The server would have search facilities (CDK, JOELib, OpenBabel).
and we can all think of many more.
If this excites you – it may need altering – let us know.