We were very pleased to be told recently that we had been awarded a grant from JISC for repository enhancement. The project is CLARION (Chemical Laboratory repository In/Organic Notebooks) and the JISC page is https://pims.jisc.ac.uk/projects/view/1276. We’re in the process of uploading a project description to JISC but here’s some more informal background…
We believe that most chemistry data in most departments are valuable to science. That sounds like a platitude, but it’s a tribute to the standards of training in chemistry and to the standards of those who develop chemical protocols and instrumentation. Chemists care about quality, and a single spectrum or crystal structure can be the ugly fact that slays a beautiful hypothesis (Huxley). These facts are – largely – reproducible, so that the same substance in different laboratories will give “the same analytical data” (crystal structure, spectra, composition). Of course there are exceptions, but by and large this works extremely well. So much so that the publication process increasingly requires these data to be made available to reviewers and to readers.
And the data are born-digital. They come out of machines as reproducible numbers. The semantics are not always explicit, but the author can usually add them. But all too often the data are emitted as unsemantic PDF, printed on paper, scribbled on with pencil, covered with coffee-mug rings and then published as some ugly bitmap. The poor reader then has to measure the peaks with a ruler.
I repeat. We are in the twenty-first century and we still use rulers.
That’s because the data publication process is not yet developed. Perhaps I should say data publication culture. Because the tools are all there. We’ve done this for the whole of the department’s crystal structures and put them in a repository (C3DER).
The structures are not yet all exposed, as we need agreement with the researchers. I’m sure this will be readily forthcoming – many have said it gives them a warm fuzzy feeling to make their data available. Usually it has to be done after publication (we don’t expect everyone to adopt Open Notebook yet) and this needs culture and process.
So an important part of CLARION will be developing the means for working with scientists to expose their data at the appropriate time. CLARION will expand to include a variety of spectral data, both from central analytical services and from individual labs.
Another key aspect of CLARION is that we shall be integrating it with a commercial electronic laboratory notebook (eLNb). We’re in the process of evaluating offerings and expect to make an announcement soon. This will be a key opportunity to see how feasible it is to integrate a standard system with the needs of a departmental repository. The protocols may be harder, but we’ll have the experience from the crystallography and spectroscopy.
An important aspect is that we are keen to develop the Open Data idea globally, and we’d be very interested to hear from other groups who are doing – or thinking of doing – similar things.
This blogpost was prepared with ICE+OpenOffice.