I have sent the following email to Elsevier’s Director of Universal Access (“very passionate about expanding access to information”). In summary I request that Elsevier publish all supplementary information (past, present, and future) and that by 27th Feb she gives me an unequivocal commitment that this has happened.
Dear Director of Universal Access,
I have been invited by Columbia University, NY, to give an opening keynote at their “Managing Research Data” symposium on Feb 27th (http://library.columbia.edu/news/libraries/2013/2013-1-31_Research_Data_Sympsosium_Announced.html). Elsevier is among the sponsors, though (at my request) not of me. Among the recommendations I shall be making is that all primary research data should be published under a CC0 (or equivalent) licence, which allows anyone anywhere to do anything with it for any legal purpose without permission or negotiation: to re-use, modify, copy and repost. In my mind this is what “Universal Access” means.
This letter is to ask Elsevier, through your department, to make all supplemental data accompanying Elsevier publications, retrospective and future, available under CC0. I will treat all mail from you as public and announce your reply or replies at the symposium.
I will restrict my examples to small-molecule crystallography, though the argument extends to all primary scientific data (observations, instruments, computation, etc. in all disciplines). Crystallography, through its International Union (IUCr), has pioneered the imperative to publish all primary data (diffraction, cleaning, structural solution and refinement, and more). FWIW I am privileged to sit on the IUCr’s COMCIFS committee, which creates the protocols for this.
Note that other major publishers (Nature, Acta Crystallographica, ACS, RSC, etc.) have no problem making their data available in the way I have described.
This publication enables many things including:
The verification/validation of the experiment being reported. There are many ways of doing this including reprocessing the data with new algorithms, comparison with other data sets, recomputation, etc.
The re-use of the data to build knowledgebases both in and outside the domain. Crystallography has a century of showing the value of the re-use of data and its interpretation.
The creation of specialist services for alerting scientists to the publication of data.
As an example, Nick Day in our laboratory collected 200,000 structures from the primary literature in http://wwmm.ch.cam.ac.uk/crystaleye. This resource, published under PDDL (equivalent to CC0), contains several features not found elsewhere, including bondlength browsing and fragment browsing. In particular it has a unique feature of linking back to the original literature.
There are no Elsevier data in this, because Elsevier makes it impossible. Elsevier currently hides these data either behind a 42 USD paywall (Polyhedron) or within a closed agreement with The Cambridge Crystallographic Data Centre (CCDC). I have no details of this agreement (CCDC refused to respond to my FOI request) but it gives CCDC a monopoly right to be the holder of these data. CCDC sell a derivative product and release only minuscule amounts of the data (ca. 25 structures per year) on request. This is completely inadequate for what modern information-based scientists wish to do. It leads to bad science, as the primary data cannot be reviewed and cannot be incorporated in new artifacts (CCDC forbid re-use of the data even though it is the primary scientific record).
I am therefore asking you to do the following:
Announce that all supplemental data accompanying Elsevier papers IS licensed as CC0.
Require the CCDC to make all primary CIF data from Elsevier publications CC0. (The author’s raw deposition, not CCDC’s derivative works)
Extend this policy to all other experimental data published in Elsevier journals (in chemistry this would be records of synthesis, spectra, analytical data, computational chemistry, etc.). When you agree to this I can give public advice as to the best way to achieve this.
I assume your division has effective power to do this on the timescale I have indicated. Note that in our past discussions you have used phrases such as “let’s talk to your librarians”, “we are reviewing this internally”, etc. Any phrases of this sort will be interpreted as a refusal to make data CC0. Only a clear public commitment to make raw author data CC0 with target dates (e.g. within a month) and an unequivocal public letter to CCDC requiring CC0 for raw CIFs can be regarded as Universal Access to raw author data.
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
CB2 1EW, UK