Tagging molecules

Another of last night’s ideas that I have been beaten to – that’s the power of the blogosphere! Nick Day and I are trying to find a social mechanism for commenting on data – specifically his CrystalEye collection of 60 000 crystal structures. And can we do this without writing any software? So I’ll first propose my idea, then comment on Egon’s.
The problem is to annotate molecules on someone else’s site. We don’t care about being Open – in fact we want everything to be public. So my model is to use a public tagging system like del.icio.us – the only barrier is that to use it you have to register with del.icio.us. But most young people did that months ago. So when you find a molecule in Nick’s knowledge base that you want to comment on, you simply tag its HTML page. It’s probably useful to have a small tag vocabulary like “serious error”, “wrong space group” and then comment in free text. The comments would accumulate on del.icio.us pages and Nick’s robot could scrape them from time to time and add them to the relevant entries.
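To make the robot's job concrete, here is a minimal sketch of the step after scraping: grouping tagged bookmarks by the page they annotate. It assumes the bookmarks have already been fetched into plain dicts with `url`, `tags` and `comment` keys. That record shape, the hyphenated tag spellings, the function name and the example URLs are all hypothetical, and nothing here calls a real del.icio.us or Connotea API.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary, based on the tags suggested in the post.
VOCABULARY = {"serious-error", "wrong-space-group"}


def group_annotations(bookmarks, base_url):
    """Group bookmark records by the entry page they annotate.

    Each bookmark is assumed to be a dict with "url", "tags" and
    (optionally) "comment" keys. Bookmarks outside base_url, or without
    any tag from the vocabulary, are ignored.
    """
    by_entry = defaultdict(list)
    for bookmark in bookmarks:
        if not bookmark["url"].startswith(base_url):
            continue
        tags = sorted(set(bookmark["tags"]) & VOCABULARY)
        if tags:
            by_entry[bookmark["url"]].append(
                {"tags": tags, "comment": bookmark.get("comment", "")}
            )
    return dict(by_entry)


# Example with made-up records and a made-up base URL.
bookmarks = [
    {"url": "http://example.org/crystaleye/entry1.html",
     "tags": ["wrong-space-group", "chemistry"],
     "comment": "Should be P21/c?"},
    {"url": "http://example.org/elsewhere.html",
     "tags": ["serious-error"]},
]
annotations = group_annotations(bookmarks, "http://example.org/crystaleye/")
```

Keeping the vocabulary small means the robot can filter on tags alone and treat the free-text comment as opaque.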
In practice I would like to use Connotea as it’s been developed specifically for scientists, because it’s Open Source and most importantly we have had a great collaboration with the folks at Nature Publishing Group (New technologies). I don’t know whether Connotea has been used for data tagging – if not we might have ideas that we’d like to see incorporated. I’m sure Timo, Ben, Tony will pick up this post!
The only downside is that we have to scrape the Connotea site at regular intervals.
Egon has a different model…

Cb comments for InChI’s

About a year ago Pedro wrote a Greasemonkey script to add comments from PostGenomic.com to the tables of contents of scientific journals. Noel extended it with support for Chemical blogspace (see also this earlier item). Now, the latter website is maintained by me, and I extended the aggregator software with molecule support, for example to show hot molecules on the front page (at some point my patches will be backported into mainstream. Euan, why not invite me to London HQ in, say, June?). So, when we can show comments from the blogosphere for journal articles, why can’t we do that for molecules too? Sure we can. It just needs some hacking. Right, and I did that today. The script works for PubChem:

It works for any element with a URL to PubChem like http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=pccompound&term=%22InChI=1/CH4/h1H4%22[InChI]. BTW, while the URL is not very readable, this might actually be a good way to hide InChI’s, though I am sure Google will not index this InChI either.
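As an illustration of the URL pattern quoted above, the following sketch builds such a link from an InChI and recovers the InChI from a link. The function names are hypothetical, the URL template is copied from the example in the text, and no claim is made about how PubChem itself parses the query.

```python
import re
from urllib.parse import parse_qs, quote, urlsplit

# URL prefix copied from the example in the post.
PUBCHEM_SEARCH = (
    "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
    "?CMD=search&DB=pccompound&term="
)


def pubchem_search_url(inchi):
    """Build a search URL of the quoted form for one InChI string."""
    # Percent-encode the quoted InChI, keeping "=" and "/" readable.
    term = quote('"' + inchi + '"', safe="=/")
    return PUBCHEM_SEARCH + term + "[InChI]"


def inchi_from_url(url):
    """Return the InChI embedded in such a URL, or None if there is none."""
    for term in parse_qs(urlsplit(url).query).get("term", []):
        match = re.search(r'"(InChI=[^"]+)"', term)
        if match:
            return match.group(1)
    return None


url = pubchem_search_url("InChI=1/CH4/h1H4")
recovered = inchi_from_url(url)
```

A script that decorates links could use something like `inchi_from_url` to decide which anchors on a page point at a molecule.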
And it also works for semantically marked up InChI’s (using either microformats or RDFa):
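The exact markup is not shown here, so the following is only a sketch under stated assumptions: that an InChI sits as plain text directly inside an element whose `class` (microformat style) or `property` (RDFa style) attribute contains `inchi` or `chem:inchi`. Both marker names and the class name `InChIExtractor` are hypothetical.

```python
from html.parser import HTMLParser


class InChIExtractor(HTMLParser):
    """Collect InChI strings from elements marked up as InChIs.

    Assumes the InChI is plain text directly inside the marked element,
    with no nested tags.
    """

    MARKERS = {"inchi", "chem:inchi"}  # assumed marker names

    def __init__(self):
        super().__init__()
        self.inchis = []
        self._capturing = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        names = set((attr.get("class") or "").split())
        names |= set((attr.get("property") or "").split())
        if names & self.MARKERS:
            self._capturing = True
            self._buffer = []

    def handle_data(self, data):
        if self._capturing:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._capturing:
            text = "".join(self._buffer).strip()
            if text.startswith("InChI="):
                self.inchis.append(text)
            self._capturing = False


parser = InChIExtractor()
parser.feed(
    '<p>Methane <span class="chem:inchi">InChI=1/CH4/h1H4</span> and water '
    '<span property="chem:inchi">InChI=1/H2O/h1H2</span>.</p>'
)
found = parser.inchis
```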

The downside of this is that users have to install Greasemonkey to use it, but that’s not too much of a problem since I predict that soon Blue Obelisk Greasemonkey will be widely distributed. The data have to be accumulated somewhere, and that requires maintenance. The Greasemonkey group have already done that – at Koeln, I think.
Perhaps they could be combined? The Greasemonkey script could save the comments at Connotea. That’s an almost maintenance-free system. It would also start to create an implicit mashup – data from different sources would be linked through the Connotea site.

This entry was posted in chemistry, semanticWeb.

3 Responses to Tagging molecules

  1. alf says:

    Wouldn’t it be easier just to build commenting/annotation into CrystalEye?

  2. Egon says:

    Peter, the code is just a piece of JavaScript, so you can make it ‘server side’ too. I have done that earlier for Sechemtic, and it can easily be done with this code. I would love to help set it up, it’s really easy. See my blog on ‘server side Greasemonkey’:
    http://chem-bla-ics.blogspot.com/2007/01/chemistry-in-html-javascript-from.html

  3. pm286 says:

    (3) Suggest you contact Jim and Nick. We’ll be discussing this in group meeting today. Do you have any thoughts on what might be added to CrystalEye? Substructure search is obvious, as is data search.
