There’s now a growing movement towards publishing crystallography directly into the Open. The threads include:
- The Crystallography Open Database which pioneered the idea of collecting crystallographic data and making them Openly available.
- Nick Day’s CrystalEye – aggregation of published Open structures (from journals which don’t appropriate facts)
- the eCrystals collection at Southampton, initially the repository for the National Crystallographic Service and now a JISC-sponsored project to federate crystallographic repositories.
- Other collaborative groups including Reciprocal Net and STaRBURSTT
- the Microsoft eChemistry Project and molecular repositories (see blog)
- we are getting an increasing number of queries about our SPECTRa project.
… so it was no great surprise when Jean Claude blogged:
20:41 20/12/2007, Useful Chemistry
We have another collaborator who is comfortable with working openly: Matthias Zeller from Youngstown State University.
With the fastest turnaround for any crystal structure analysis I’ve ever submitted, we now have the structure for the Ugi product UC-150D. For a nice picture of the crystals see here.
PMR: J-C also mailed us and asked how w/he could archive and disseminate the crystallography. So here’s a rough overview.
Crystallography is a microcosm of chemistry and we encounter many different challenges:
- not all structures are Open (some not initially, some never). Managing the differential access is harder than it looks. It has to be owned by the Department or Institution. So you probably need access control, and probably an embargo system.
- Institutional repositories are not generally oriented towards data. Some may, indeed, only accept “fulltext”. So there may be nowhere obvious to go.
- The raw data (CIF) contains metadata, but not in a form where search engines can find it. That’s an important part of what SPECTRa does: it extracts metadata and repurposes it.
- The CIF can, but almost universally does not, contain chemical metadata. So part of JUMBO is devoted to trying to extract chemistry out of atomic positions. Needs a fair amount of heuristic code.
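To make the metadata point concrete, here is a minimal sketch of pulling simple `_tag value` pairs out of CIF text so they can be indexed or republished. It is not the SPECTRa or JUMBO code: the function name `extract_cif_metadata` is hypothetical, and the parser assumes single-line items only, skipping `loop_` tables, multi-line text fields and values that sit on the line after their tag.

```python
def extract_cif_metadata(cif_text):
    """Collect single-line `_tag value` items from CIF text into a dict.

    Illustrative only: loop_ tables, semicolon-delimited text fields and
    values placed on the line after their tag are skipped, not parsed.
    """
    metadata = {}
    state = "top"  # "top", "loop_header" or "loop_body"
    for raw in cif_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.lower() == "loop_":
            state = "loop_header"
            continue
        if line.startswith("_"):
            if state == "loop_header":
                continue  # a column name inside a loop, not a key/value item
            state = "top"  # a tag after loop rows ends the loop
            parts = line.split(None, 1)
            if len(parts) == 2:
                tag, value = parts[0], parts[1].strip()
                # Strip matching single or double quotes around the value.
                if len(value) >= 2 and value[0] in "'\"" and value[-1] == value[0]:
                    value = value[1:-1]
                metadata[tag] = value
        elif state == "loop_header":
            state = "loop_body"  # first data row of the loop
    return metadata


sample_cif = """data_example
_cell_length_a 10.123(4)
_symmetry_space_group_name_H-M 'P 21/c'
_chemical_formula_sum 'C12 H14 N2 O'
loop_
_atom_site_label
_atom_site_fract_x
C1 0.1234
N1 0.5678
_cell_length_b 11.5
"""

for tag, value in extract_cif_metadata(sample_cif).items():
    print(tag, "=", value)
```

The resulting dict could then be written out as HTML meta tags, RDF or whatever the repository exposes to search engines. A real repository would want a full CIF parser, since the atom coordinates needed for extracting chemistry live in exactly the loops this sketch skips.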
So in conjunction with eChemistry and eCrystals and in the momentum of SPECTRa we are continuing to develop software for crystallographic repositories. There are several reasons why people want such repositories:
- as a high-quality lab companion – somewhere to put your data and get it back later.
- as somewhere to provide knowledge for data-driven science (e.g. CrystalEye)
- as somewhere to save your data for publication and dissemination
- as somewhere to archive your data for posterity (e.g. an IR)
These put different stresses on the software, so Jim and I are developing context-independent tools that can be used in any of them. I’m hacking the JUMBO software (CrystalTool) and he is hacking CrystalEye so it becomes a true repository.
This is our relaxation over the holiday.