Noel O’Blog has suggested that I should use Rajarshi Guha’s CDK service to lay out the Diazonamide structure (see my post Finding chemical structures – InChIs et al., an amusement).
PMR: so here it is:
PMR: I think it’s correct. Interpretable. I’d put it on the same level as the Daylight one. One message is that it is difficult for software to lay out structures with a 10-ring nucleus.
The point is that CDK is Open Source and can therefore be enhanced by the community. Daylight and the software that Pubchem uses (?Cactus?, ?Openeye?) are not. CDK is joint leader, and we can improve it.
A complementary approach is to start making collections of human-drawn images. The intelligible Chemspider image was hand-drawn by the PNAS authors; I don’t know how it got to Chemspider. (Personally I think it’s pretty awful. I do not like stereo bonds which are rectangular rather than wedges. Why do people use them? You only have to scale the image to corrupt this info.) So we need an Open collection of chemical structures.
This is not technically difficult but is lathered with copyright madness. Can I reproduce a chemical structure from Nature without permission? I’ve asked but they haven’t got back to me. Can I reproduce a chemical structure diagram from Wiley? I’ve asked but… … they haven’t got back to me.
It has to be fully Open. Every structure diagram has to be copyright-free and accompanied by metadata that gives provenance and alternative descriptions (names, InChIs, etc.). Is there anywhere that has chemical images that I can download and that meets all these requirements?
I’ve found one (sorry for the layout). Here’s taxol:
Paclitaxel
β-(benzoylamino)-α-hydroxy-,6,12b-bis
(acetyloxy)-12-(benzoyloxy)-2a,3,4,4a,
5,6,9,10,11,12,12a,12b-dodecahydro-4,11-
dihydroxy-4a,8,13,13-tetramethyl-5-oxo-
7,11-methano-1H-cyclodeca(3,4)benz(1,2-b)
oxet-9-ylester,(2aR-(2a-α,4-β,4a-β,6-β,9-α
(α-R*,β-S*),11-α,12-α,12a-α,2b-α))-
benzenepropanoic acid
And there’s lots of data with it that looks like this:
I’ll leave you to guess where this is. Clues: It’s Open, re-usable, very highly curated, and the first place that students look. That – or a derivative – is where the world’s chemistry should reside.
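To make the requirement concrete, here is a minimal sketch of what one entry in such an Open collection might carry: the image, its licence, its provenance, and the alternative descriptions (names, InChI, SMILES). Everything in it is an assumption for illustration: the `StructureRecord` class, the `missing_requirements` helper, the field names, and the set of licence labels are hypothetical and do not come from any existing collection. Benzene is used only because its identifiers are short.

```python
from dataclasses import dataclass, field

# Hypothetical licence labels treated as "fully Open" here (an assumption,
# not a standard list).
OPEN_LICENCES = {"public-domain", "CC0"}


@dataclass
class StructureRecord:
    """One structure diagram plus the metadata the post asks for."""
    image_file: str                            # path to the drawn diagram
    licence: str                               # licence label for the image
    provenance: str                            # who drew it, where it came from
    names: list = field(default_factory=list)  # alternative names
    inchi: str = ""                            # InChI string, if known
    smiles: str = ""                           # SMILES string, if known


def missing_requirements(record):
    """Return a list of reasons the record is not yet reusable (empty if fine)."""
    problems = []
    if record.licence not in OPEN_LICENCES:
        problems.append("licence is not open")
    if not record.provenance.strip():
        problems.append("no provenance")
    if not (record.names or record.inchi or record.smiles):
        problems.append("no alternative description")
    return problems


# Illustrative record; the file name and provenance text are made up.
benzene = StructureRecord(
    image_file="benzene.png",
    licence="CC0",
    provenance="hand-drawn for this example",
    names=["benzene"],
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    smiles="c1ccccc1",
)

print(missing_requirements(benzene))
```

A record that fails any check (say, a diagram copied from a journal with no stated licence) would come back with the reasons listed, which is the sort of gatekeeping an Open collection would need before accepting an image.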
September 24th, 2007 at 7:59 am
For the record, you can compare with CDK’s SMILES to 2D at:
http://cheminfo.informatics.indiana.edu/~rguha/code/java/cdkws/cdkws.html#sdg