Open Crystallography: The Hargreaves report can help make CCDC data Open

I have had a really useful suggestion about how to make data deposited with the Cambridge Crystallographic Data Centre (CCDC) Open. The Hargreaves report has recommended that text- and data-mining of the scientific literature should be allowed and the government agrees [see below]. It is therefore likely that the data in CCDC fall under data-mining. Since a major user of the CCDC data is the pharmaceutical industry, they clearly fall under “medical”. I give Pete Carroll’s suggestion in full, and add my comments [emphasis is mine].

Pete Carroll says:

August 29, 2011 at 11:26 pm

I wonder if the Government response to the Hargreaves Review regarding data/text mining for research might be relevant?

“Nor does the Government regard it as appropriate for certain activities of public benefit
such as medical research obtained through text mining to be in effect subject to veto by the owners
of copyrights in the reports of such research, where access to the reports was obtained lawfully. We
recognise that some publishers view licensing of text mining as a legitimate commercial opportunity;
however we are not persuaded that restricting this transformative use of copyright material is
necessary or in the UK’s overall economic interest…

the Government agrees with the Review’s central thesis that the widest possible
exceptions to copyright within the existing EU framework are likely to be beneficial to the UK

The UK government can therefore be argued to be in favour of our obtaining Open data from the CCDC.

…subject to three important factors:

• That the amount of harm to rights holders that would result in “fair compensation”
under EU law is minimal, and hence the amount of fair compensation provided
would be zero. This avoids market distortion and the need for a copyright levy
system, which the Government opposes on the basis that it is likely to have adverse
impacts on growth and inconsistent with its wider policy on tax.

The CCDC advanced two main arguments for non-release. One was economic (it would hurt their business); the other was that third parties had rights over the data. On the first, I believe that harm is minimal, as the raw data have not had value added by CCDC and their income comes from added value and independent products.

• Adherence with EU law and international treaties.

• That unnecessary restrictions removed by copyright exceptions are not re-imposed by
other means, such as contractual terms, in such a way as to undermine the benefits of
the exception.

The Government will therefore bring forward proposals in autumn 2011 for a substantial
opening up of the UK’s copyright exceptions regime on this basis. This will include
proposals for a limited private copying exception; to widen the exception for non-commercial
research, which should also cover both text- and data-mining to the extent permissible under EU law.”


The parliamentary select committee for the dept of Business Innovation & Skills is holding an inquiry on the Hargreaves Review and the Government’s response to the review. Closing date for submissions 5th September. See:

I know time is short but it could be worth yourself or someone from the research community bringing this problem of “closed CIFs” to their attention as exemplary evidence of problems with access to data.

PS good luck with your FOI request. You might find

useful to you if they try a S41 or S43 exemption over release of the contracts.

Thanks. I am hoping that they will not “try any exemptions”. They are part of our wider community and it is my hope that they will see the positive value of opening their data and that this will raise their public esteem. I do not want a battle – I would much rather see genuine reorientation of approach. But the solutions must be Open and they must be rapid. If there are problems I shall certainly approach the ICO (Information Commissioner’s Office) who have been very unsympathetic to scientists hanging on to data which should be in the public domain. I think that in practice any prolonged refusal to provide Open data will be tried in the court of public opinion both within the scientific community and beyond. But I hope there will not be a “Crystalgate”.

If they take heed of this and wish to make their data Open, then the only barriers will be from contracts imposed by third parties. Until they provide this information we do not know whether it is a problem. If it is, it may be that Hargreaves is a useful weapon.

