#jiscxyz #quixote
Every day brings more interest and excitement in Quixote. Groups are coming to us who want to use it for (a) education – managing student experiments and projects and creating Open educational resources – and (b) publishing. Here I talk about (b).
Many disciplines require publication of supplemental data. Computational chemistry is somewhat half-hearted about this, but there is a reasonable amount published in the Society journals. I’m talking with Quixote members about how the system can help the publication process. In principle Quixote can do much of the local management of the input and output files of compchem. That will be supplemented by the EmMa-Chem# (“chem-pound”) repository system that we (Sam) are developing in JISC-CLARION. So Quixote will be able to use this RSN. The challenge comes when the material is published.
Some publishers like repositories such as PDB, GenBank, Tranche, DRYAD, etc. But there is nothing comparable in compchem (that’s the reason for starting Quixote, after all). So the publishers require PDFs. It makes them feel happy. But publishing data as PDF has several drawbacks:
- It takes quite a lot of work to create the PDFs. That has to be done by unpaid slaves (graduate students).
- It introduces errors, which corrupt the data.
- It makes the result unusable and therefore uncheckable.
So here is your homework. I asked one of the collaborators to send me 6 DOIs with supporting info to get some idea of what we would have to create. This is the first one – I expect the others to be of similar or worse quality. Use the URL http://pubs.acs.org/doi/suppl/10.1021/ol1002384/suppl_file/ol1002384_si_001.pdf (it’s freely visible and I claim it’s Openly reusable without permission).
- What paper does it relate to? If you saved this file on your hard disk would you be able to answer the question in 6 months? How?
- What are molecules 2a, 2b, 2c, 2d? How would you find out?
- What compchem program was used to create the data? How did you find out?
- Which table has had its numeric information corrupted so seriously during the cut-and-paste that it requires careful hand-editing to recreate the correct version?
- Which scientific units have been corrupted by the cut-and-paste?
- Which numeric/scientific quantities can only be extracted by retyping them?
- How long do you think it took the authors to create this document? In Quixote we will attempt to be considerably quicker as well as more accurate.
- Would it be possible to re-run the calculation from the material present? If so, how long would it take you to prepare the jobs?
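A hint for the first question: one piece of provenance that lives outside the PDF is the download URL, which appears to embed a DOI. Here is a minimal Python sketch that records it at download time. It assumes the layout `/doi/suppl/<DOI>/suppl_file/<filename>`, which is inferred only from the single URL above and may not hold for other papers or publishers. The function name `provenance_from_url` and the record fields are hypothetical and not part of Quixote.

```python
import re
from urllib.parse import urlparse

# Assumed path layout (inferred from one example URL, not from any publisher documentation):
#   /doi/suppl/<DOI prefix>/<DOI suffix>/suppl_file/<filename>
SUPPL_PATH = re.compile(
    r"^/doi/suppl/(?P<doi>10\.\d{4,9}/[^/]+)/suppl_file/(?P<name>[^/]+)$"
)


def provenance_from_url(url):
    """Return a small provenance record, or None if the URL does not fit the assumed layout."""
    match = SUPPL_PATH.match(urlparse(url).path)
    if match is None:
        return None
    return {
        "doi": match.group("doi"),        # identifier of the paper the file belongs to
        "filename": match.group("name"),  # name the file would have on disk
        "source_url": url,                # where the file was fetched from
    }


url = "http://pubs.acs.org/doi/suppl/10.1021/ol1002384/suppl_file/ol1002384_si_001.pdf"
record = provenance_from_url(url)
print(record)
```

Saving a record like this next to the file would at least let you answer "what paper is this?" in 6 months. It does nothing for the other questions, which is rather the point.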