RSC Meeting: Open Access to Crystallographic Data

Yesterday I received a request from a publisher. I won’t name them but I don’t think the material is sensitive, and I need your help anyway. It’s very simple:

We're in the process of producing new [High School] Chemistry teaching and learning materials. We are looking at producing a number of rotatable molecular models as part of our digital publishing offer. [We would like someone] who can write CML files for us.

The molecules are apparently simple:

  • neopentane and 4 other simple organic molecules
  • buckminsterfullerene
  • diamond
  • graphite
  • sodium
  • sodium chloride

The organic molecules are easy (the publisher wasn’t concerned about what conformation was provided). I actually used ChemAxon’s Marvin – which emits CML – and it took about 5 minutes. So thank you ChemAxon, who make the software available for free, though it’s not Open.
In the near future it will be different. We shall have extracted all the molecules from CrystalEye and put them in an Open Repository. We shall have added the 250,000 molecules from the NCI which we computed with MOPAC and so it’s certain we would find the molecules we wanted in there. Moreover I would expect that the Blue Obelisk will soon have a complete workflow for drawing molecules and creating 3D structures.
But where can I get a 3D structure of sodium chloride? No, it’s not in Wikipedia (which is an encyclopedia, not a data- or knowledge-base). How long would it take you? And before you think it’s simple, remember it’s for commercial use. You will have to negotiate with the supplier of the information to determine whether you are allowed to redistribute the derivative work. Yes, it’s only data and data shouldn’t be copyright, should it?
So my question to the blogosphere is:

“Where can I get redistributable coordinates for the last 5 substances, and how long did it take you to get them and assure me that they can be re-used for commercial purposes?”

I expect buckminsterfullerene to be fairly easy. I think I have an answer for the solids. It took me about 10 minutes of half-remembered browsing by a strange route. I’ll accept coordinates in CIF, PDB or CML (no other format I know of supports crystallography, and we need the space group and cell dimensions). As a second best I’d accept a filled-out unit cell with Cartesian coordinates. But the coordinates aren’t the problem. Finding the structures is. Please reply and tell me how long it took. Remember I’m not interested in pictures, only coordinates. And I didn’t get any joy with the ICSD database of inorganic crystal structures (http://icsd.ccp14.ac.uk/icsd/icsd_help.html). I may not have navigated the complex interface correctly, but all I got was:

Access forbidden!

You don’t have permission to access the requested object. It is either read-protected or not readable by the server.

The sodium chloride structure was determined about 100 years ago. It’s in the public domain. But where can I get it?
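To make "a filled-out unit cell with Cartesian coordinates" concrete, here is a minimal Python sketch for the rock-salt arrangement. It is an illustration of the format only, not a substitute for a sourced, redistributable structure: the lattice parameter (about 5.64 Å) is an approximate textbook value that I am assuming, the function names are hypothetical, and the simple XYZ-style output carries neither the space group nor the cell dimensions.

```python
from itertools import product

# ASSUMPTION: approximate cubic lattice parameter of NaCl in angstroms
# (textbook value, not taken from a licensed or redistributable source).
A_NACL = 5.64

# Face-centred lattice translations in fractional coordinates (space group Fm-3m).
FCC_SHIFTS = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]

# One Na and one Cl site; the face-centring generates the rest of the cell.
BASIS = [("Na", (0.0, 0.0, 0.0)), ("Cl", (0.5, 0.5, 0.5))]


def conventional_cell(a=A_NACL):
    """Return the 8 atoms of the conventional cell as (element, (x, y, z))."""
    atoms = []
    for element, site in BASIS:
        for shift in FCC_SHIFTS:
            # Wrap fractional coordinates back into [0, 1).
            frac = tuple((s + t) % 1.0 for s, t in zip(site, shift))
            atoms.append((element, tuple(a * f for f in frac)))
    return atoms


def filled_cell(a=A_NACL):
    """Add the periodic images on faces, edges and corners for display."""
    filled = set()
    for element, coords in conventional_cell(a):
        # A coordinate of 0 also appears on the opposite face at distance a.
        options = [(c, c + a) if c == 0.0 else (c,) for c in coords]
        for position in product(*options):
            filled.add((element, position))
    return sorted(filled)


def to_xyz(atoms, comment=""):
    """Format atoms as a simple XYZ-style listing (count, comment, atoms)."""
    lines = [str(len(atoms)), comment]
    lines += [f"{el} {x:.3f} {y:.3f} {z:.3f}" for el, (x, y, z) in atoms]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_xyz(filled_cell(), "NaCl, rock-salt type, assumed a = 5.64 A"))
```

The filled-out cell holds 27 atoms (14 Na and 13 Cl) because atoms on corners, edges and faces are repeated for display, whereas the conventional cell itself contains only 8. That is why this form is second best: the symmetry and cell information have to be supplied separately.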
The publisher has offered me a fee. I don’t know how much, but I will suggest they donate it to support education in Africa unless anyone has a better idea.


											

2 Responses to RSC Meeting: Open Access to Crystallographic Data

  1. justme says:

    OK, you can get elemental Na and graphite CIF files from
    http://rruff.geo.arizona.edu/AMS/amcsd.php
    and the rest from here
    http://www.chem.lsu.edu/lucid/maverick/file-lst.htm
    time taken = 1 cup of herbal tea

  2. pm286 says:

    (1) Many thanks. I was aware of the Am. Mineral site as they donate a lot of structures to the Crystallography Open Database.
