Yesterday I received a request from a publisher. I won’t name them, but I don’t think the material is sensitive, and I need your help anyway. It’s very simple:
We're in the process of producing new [High School] Chemistry teaching and learning materials. We are looking at producing a number of rotatable molecular models as part of our digital publishing offer. [We would like someone] who can write CML files for us.
The molecules are apparently simple:
- neopentane and 4 other simple organic molecules
- buckminsterfullerene
- diamond
- graphite
- sodium
- sodium chloride
The organic molecules are easy (the publisher wasn’t concerned about what conformation was provided). I actually used ChemAxon’s Marvin – which emits CML – and it took about 5 minutes. So thank you ChemAxon, who make the software available for free, though it’s not Open.
In the near future it will be different. We shall have extracted all the molecules from CrystalEye and put them in an Open Repository. We shall have added the 250,000 molecules from the NCI which we computed with MOPAC and so it’s certain we would find the molecules we wanted in there. Moreover I would expect that the Blue Obelisk will soon have a complete workflow for drawing molecules and creating 3D structures.
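For readers who have not seen CML, here is a minimal sketch of what "writing a CML file" for a small molecule might involve. It is not what Marvin emits; it simply builds a `molecule` element with an `atomArray` and a `bondArray` using Python's standard library. The methane coordinates are idealised values chosen for illustration (a C-H distance of roughly 1.09 Å is assumed), and the function name `molecule_to_cml` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by CML documents.
CML_NS = "http://www.xml-cml.org/schema"

# Idealised tetrahedral methane; illustrative coordinates in angstroms,
# not taken from any database.
D = 0.6293
METHANE_ATOMS = [
    ("a1", "C", (0.0, 0.0, 0.0)),
    ("a2", "H", (D, D, D)),
    ("a3", "H", (D, -D, -D)),
    ("a4", "H", (-D, D, -D)),
    ("a5", "H", (-D, -D, D)),
]
METHANE_BONDS = [("a1", "a2"), ("a1", "a3"), ("a1", "a4"), ("a1", "a5")]


def molecule_to_cml(mol_id, atoms, bonds):
    """Return a CML string for a molecule with 3D Cartesian coordinates."""
    molecule = ET.Element("molecule", {"xmlns": CML_NS, "id": mol_id})

    atom_array = ET.SubElement(molecule, "atomArray")
    for atom_id, element, (x, y, z) in atoms:
        ET.SubElement(atom_array, "atom", {
            "id": atom_id,
            "elementType": element,
            "x3": f"{x:.4f}",
            "y3": f"{y:.4f}",
            "z3": f"{z:.4f}",
        })

    bond_array = ET.SubElement(molecule, "bondArray")
    for first, second in bonds:
        # All bonds are written as single bonds in this sketch.
        ET.SubElement(bond_array, "bond", {
            "atomRefs2": f"{first} {second}",
            "order": "1",
        })

    return ET.tostring(molecule, encoding="unicode")


if __name__ == "__main__":
    print(molecule_to_cml("methane", METHANE_ATOMS, METHANE_BONDS))
```

A real workflow would take the coordinates from a structure editor or a repository rather than typing them in by hand.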
But where can I get a 3D structure of sodium chloride? No, it’s not in Wikipedia (which is an encyclopedia, not a data- or knowledge-base). How long would it take you? And before you think it’s simple, remember it’s for commercial use. You will have to negotiate with the supplier of the information to determine whether you are allowed to redistribute the derivative work. Yes, it’s only data, and data shouldn’t be copyrighted, should it?
So my question to the blogosphere is:
“Where can I get redistributable coordinates for the last 5 substances? And how long did it take you to get them and assure me that they can be re-used for commercial purposes?”
I expect buckminsterfullerene to be fairly easy. I think I have an answer for the solids. It took me about 10 minutes of half-remembered browsing by a strange route. I’ll accept coordinates in CIF, PDB or CML (no other format I know of supports crystallography, and we need the space group and cell dimensions). As a second best I’d accept a filled-out unit cell with Cartesian coordinates. But the coordinates aren’t the problem. Finding the structures is. Please reply and tell me how long it took. Remember I’m not interested in pictures, only coordinates. And I didn’t get any joy with the ICSD database of inorganic crystal structures (http://icsd.ccp14.ac.uk/icsd/icsd_help.html). I may not have navigated the complex interface correctly, but all I got was:
Access forbidden!
You don’t have permission to access the requested object. It is either read-protected or not readable by the server.
The sodium chloride structure was determined about 100 years ago. It’s in the public domain. But where can I get it?
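To make the "filled-out unit cell with Cartesian coordinates" fallback concrete, here is a rough sketch of how one could be generated for the rock-salt structure type. The numbers are assumptions for illustration, not redistributable data from any source: the cubic cell edge is taken as roughly 5.64 Å, Na is placed at (0, 0, 0) and Cl at (½, ½, ½), and only the four face-centring translations are applied, which is enough for these two special positions. A real file should take its cell dimensions and positions from a properly licensed CIF. The function name `filled_unit_cell` is hypothetical.

```python
import math

# Assumed, approximate cubic cell edge for NaCl in angstroms (illustrative).
CELL_EDGE = 5.64

# Translations of a face-centred (F) lattice, in fractional coordinates.
F_CENTRING = [
    (0.0, 0.0, 0.0),
    (0.0, 0.5, 0.5),
    (0.5, 0.0, 0.5),
    (0.5, 0.5, 0.0),
]

# One Na and one Cl in fractional coordinates (rock-salt arrangement).
ASYMMETRIC_UNIT = [
    ("Na", (0.0, 0.0, 0.0)),
    ("Cl", (0.5, 0.5, 0.5)),
]


def filled_unit_cell(a=CELL_EDGE):
    """Return (element, (x, y, z)) tuples in angstroms for one cubic cell."""
    atoms = []
    for element, (fx, fy, fz) in ASYMMETRIC_UNIT:
        for tx, ty, tz in F_CENTRING:
            # Wrap each translated position back into the cell [0, 1).
            frac = ((fx + tx) % 1.0, (fy + ty) % 1.0, (fz + tz) % 1.0)
            # For a cubic cell, Cartesian = cell edge * fractional.
            cart = tuple(round(a * f, 4) for f in frac)
            atoms.append((element, cart))
    return atoms


if __name__ == "__main__":
    for element, (x, y, z) in filled_unit_cell():
        print(f"{element:2s} {x:8.4f} {y:8.4f} {z:8.4f}")
```

This gives four Na and four Cl positions per cell. It does not address the real difficulty, which is finding a source for the cell dimensions that may be redistributed commercially.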
The publisher has offered me a fee. I don’t know how much, but I will suggest they donate it to support education in Africa unless anyone has a better idea.
OK, you can get elemental Na and graphite CIF files from
http://rruff.geo.arizona.edu/AMS/amcsd.php
and the rest from here
http://www.chem.lsu.edu/lucid/maverick/file-lst.htm
time taken = 1 cup of herbal tea
(1) Many thanks. I was aware of the Am. Mineral site as they donate a lot of structures to the Crystallography Open Database.