#quixote #xmlcml
I am delighted to announce the first Quixote Conference (http://quixote.wikispot.org/First_Quixote_Conference_-_22nd-23rd_March_2010) at Daresbury Laboratory. This is the outcome of all the work put in by the Quixote community, and it is:
A meeting to create the first Open distributed repository for electronic simulations
To explain a bit further: there are zillions – probably at least 10 million – of computational chemistry calculations “published” each year (i.e. referred to in scholarly publications), but almost no data is publicly available. Comp chem is 50+ years old and very well understood, yet almost none of its data is published. [There are some collections of log files and derived data – including our own DSpace @ cam – but they amount to << 1% of what is published.]
So Quixote intends to change this. We’ve been building the components, and now we intend to bolt them together. Essentially we have the following components:
- Lensfield/Quixote – a tool to crawl your disks for compchem
- JUMBOConverters – tools to transform the legacy files into XML-CML
- CMLDictionary – a formal semantic method of describing the data
- Chempound – a repository for indexed numeric and chemical data
- Avogadro – a flexible GUI for navigating and transforming the system
- CompChemPub [vapourware] – a tool to collect the results into a scholarly publication, to be created during the coming hackfest
The strategy is on a per-code basis. So let’s say your code is called Foochem. Its input is something like:
- Molecular/crystal/surface atoms and coordinates
- Basis sets and/or pseudopotentials
- Parameterisation (level of theory, accuracy, etc.)
- Physical constraints (pressure, field, etc.)
- Strategy – what to calculate (energies, frequencies, wavefunctions…) and how to do it (algorithms)
And its output should retain all this and also include:
- History of calculation (e.g. optimisation)
- Final calculated coordinates and electronic properties
- Other properties
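To make the shape of this information concrete, here is a minimal Python sketch of the input and output records listed above. Foochem is the made-up code from this post, and every class name, field name and value below is my own assumption for illustration, not part of any Quixote tool or of CML.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# One atom: element symbol plus x, y, z coordinates (units assumed, e.g. angstrom).
Atom = Tuple[str, float, float, float]


@dataclass
class FoochemInput:
    """Hypothetical record of what goes into a Foochem job."""
    atoms: List[Atom]                                             # molecular/crystal/surface atoms
    basis_set: str                                                # basis set or pseudopotential label
    parameters: Dict[str, str] = field(default_factory=dict)     # level of theory, accuracy, ...
    constraints: Dict[str, float] = field(default_factory=dict)  # pressure, field, ...
    strategy: List[str] = field(default_factory=list)            # what to calculate and how


@dataclass
class FoochemOutput:
    """Hypothetical record of a finished job; it keeps the whole input."""
    job_input: FoochemInput
    history: List[List[Atom]] = field(default_factory=list)      # e.g. optimisation steps
    final_atoms: List[Atom] = field(default_factory=list)        # final calculated coordinates
    properties: Dict[str, float] = field(default_factory=dict)   # energies and other properties


# Illustrative geometry only; not taken from a real calculation.
water = FoochemInput(
    atoms=[("O", 0.0, 0.0, 0.0), ("H", 0.0, 0.757, 0.587), ("H", 0.0, -0.757, 0.587)],
    basis_set="6-31G*",
    parameters={"level_of_theory": "HF"},
    strategy=["optimise", "energy"],
)
result = FoochemOutput(job_input=water, final_atoms=list(water.atoms))
```

The one design point worth noting is that the output object holds the input object, so nothing about how the calculation was set up gets lost when only the results are archived.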
Creating this information needs (at least):
- A Foochem dictionary
- A Foochem output parser
- (possibly) a Foochem input parser
- Some Foochem examples
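As a rough sketch of how a dictionary, an output parser and an example fit together, here is a toy version in Python. The log format, the dictionary entries and the `foochem:` term names are all invented for this illustration; the XML is only loosely modelled on CML (`property`, `scalar`, `dictRef`, `units`), omits the CML namespace, and has not been validated against the CML schema. The real JUMBOConverters work differently.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical Foochem dictionary: log-file label -> term id and units.
FOOCHEM_DICT = {
    "Total energy": {"id": "foochem:totalEnergy", "units": "hartree"},
    "Dipole moment": {"id": "foochem:dipole", "units": "debye"},
}

# Matches invented log lines of the form "Label = number".
LINE = re.compile(
    r"^\s*(?P<label>[A-Za-z ]+?)\s*=\s*"
    r"(?P<value>[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
)


def parse_foochem_log(text):
    """Turn recognised 'Label = value' lines into CML-like property elements."""
    module = ET.Element("module", {"title": "foochem-output"})
    for line in text.splitlines():
        match = LINE.match(line)
        if not match or match.group("label") not in FOOCHEM_DICT:
            continue  # skip anything the dictionary does not describe
        entry = FOOCHEM_DICT[match.group("label")]
        prop = ET.SubElement(module, "property", {"dictRef": entry["id"]})
        scalar = ET.SubElement(prop, "scalar", {"units": entry["units"]})
        scalar.text = match.group("value")
    return module


# A made-up example log; the numbers are placeholders, not real results.
EXAMPLE_LOG = """\
Foochem made-up example output
Total energy = -1.2345
Dipole moment = 0.5
Unknown quantity = 42
"""

xml_text = ET.tostring(parse_foochem_log(EXAMPLE_LOG), encoding="unicode")
print(xml_text)
```

Because the parser only emits what the dictionary knows about, it can start small and grow: adding a new quantity means adding one dictionary entry, which fits the "does not need to be complete" approach.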
So we are inviting experts in various codes. So far we have NWChem, QuantumEspresso, GAMESS-UK, GAMESS-US, DALTON, Turbomole and Gaussian. We hope to create dictionaries, parsers and documentation for each of them. These do not need to be complete – the parsers and XML-CML can be expanded when people have time and energy, or a really boring cricket match.
It’s a hands-on meeting. You need to be reasonably proficient at running the software (i.e. you may need a few days’ preparation in advance). If you are interested, let Jens Thomas know. I think there are some places, but it’s up to Jens and colleagues at STFC.
Lots of thanks to lots of people.