Government data policy puts scientists (and publishers) to shame

I am sure we all moan about governments, how difficult it is to find information, and how they are filled with Sir Humphreys who want to fudge everything. But there’s a real spirit of making public government data OPEN. One of the refreshing things about working with the OKF is that you see how other sectors behave, and believe me, governments are making scientists and academia look like Luddites. The UK government is up with the best. Read Chris Taggart’s blog post – he’s worked with government departments and knows the issues, bureaucracy, etc. He finishes:


Open data is no silver bullet, and won’t on its own solve these problems, but it is an essential requirement for a ‘more open, more fair and more prosperous‘ society.

Fortunately the consultation provides such a set in Annex 2 of the consultation (The Public Sector Data Principles). These should be issued to every government department, quango, health authority and public sector body (including the PDC), with the order to follow them in letter and spirit. Backing these up, we also need an independent body to be appointed with the power and resources to enforce them. With these two things – good public principles, and an effective enforcer – we have a chance to achieve the innovation and fairer society we need.


Chris Taggart

CEO & Co-Founder OpenCorporates

Founder OpenlyLocal

Member of Local Public Data Panel

Member of Mayor of London’s Digital Advisor


The annex referred to is Annex 2 of the consultation (The Public Sector Data Principles). When you read it, think also of the Panton Principles. Similar! That’s because in all fields the first thing to do is state what you want to happen, and then see how you can make it happen. Same as in the Open Bibliographic Principles. So what I have done is take the Open Data principles from government and change “public” to “science”, with similar changes elsewhere. I don’t have a strikethrough, so I’ve kept the original alongside in square brackets: wherever it said Public data I changed it to [Science] [Public] data or similar.

Writing this made me weep. That I have been urging this for 2 decades. Look at the stuff about W3C standards. Look at the stuff about linked Open Data. About Machine understandable export. About… (but I cannot go on).

That I belong to a community – scientists and academia – who care so little about the culture of the twenty-first century that governments put them to shame.

It is really uncanny how I can change “Public” to “Science” and everything makes sense. Yes, the naysayers will say there is no money and scientists shouldn’t be bothered with this – they should be doing real science and not pratting around with Linked Data.

But unless scientists and academia in general agree that this is a good thing to do, then we shan’t get anywhere. We’ll live with the “all your data are belong to us” definition of “[Science] [Public] Data”.

“[Scientific] [Public] Data” is the objective, factual, non-personal data on which [science] [public services] run and are assessed, and on which policy decisions are based, or which is collected or generated in the course of [scientific research] [public service delivery].

Draft [Scientific] [Public] Data Principles

  1. [Science] [Public] data policy and practice will be clearly driven by the public and businesses who want and use the data, including what data is released when and in what form – and in addition to the legal Right to Data itself this overriding principle should apply to the implementation of all the other principles.
  2. [Science] [Public] data will be published in reusable, machine-readable form – publication alone is only part of transparency – the data needs to be reusable, and to make it reusable it needs to be machine-readable. At the moment a lot of [science] [government] information is locked into PDFs or other unprocessable formats.
  3. [Science] [Public] data will be released under the same open licence which enables free re-use, including commercial re-use – all data should be under the same easy to understand licence. Data released [as supporting info or embedded in text] [under the Freedom of Information Act or the new Right to Data] should be automatically released under that licence.
  4. [Science] [Public] data will be available and easy to find through a single easy to use online access point [???] [(] – the [science] [public] sector has a myriad of different websites, and search does not work well across them. It’s important to have a well-known single point where people can find the data.
  5. [Science] [Public] data will be published using open standards, and following relevant recommendations of the World Wide Web Consortium. Open, standardised formats are essential. However to increase reusability and the ability to compare data it also means openness and standardisation of the content as well as the format.
  6. [Science] [Public] data underlying [scientists] [the Government’s own websites] will be published in reusable form for others to use – anything published on [scientific] [government] websites should be available as data for others to re-use. [Scientific] [Public] bodies should not require people to come to their websites to obtain information.
  7. [Science] [Public] data will be timely and fine grained – Data will be released as quickly as possible after its collection and in as fine a detail as is possible. Speed may mean that the first release may have inaccuracies; more accurate versions will be released when available.
  8. Release data quickly, and then re-publish it in linked data form – Linked data standards allow the most powerful and easiest re-use of data. However most existing internal public sector data is not in linked data form. Rather than delay any release of the data, our recommendation is to release it ‘as is’ as soon as possible, and then work to convert it to a better format.
  9. [Science] [Public] data will be freely available to use in any lawful way – raw [scientific] [public] data should be available without registration, although for API-based services a developer key may be needed. Applications should be able to use the data in any lawful way without having to inform or obtain the permission of the [science] [public] body concerned.
  10. [Science] [Public] bodies should actively encourage the re-use of their [science] [public] data – in addition to publishing the data itself, [scientific] [public] bodies should provide information and support to enable it to be re-used easily and effectively. [Science] [The Government] should also encourage and assist those using [science] [public] data to share knowledge and applications, and should work with business to help grow new, innovative uses of data and to generate economic benefit.
  11. [Science] [Public] bodies should maintain and publish inventories of their data holdings – accurate and up-to-date records of data collected and held, including their format, accuracy and availability.

Available at [nowhere] []
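
To make principles 2 and 8 a little more concrete, here is a minimal sketch of what “release it as is, then re-publish as linked data” might look like for a tiny scientific table. Everything specific in it is an assumption of mine for illustration: the CSV content, the column names, the `http://example.org/` namespace and the property names are all made up, and a real release would use a vocabulary agreed by the community rather than one invented on the spot.

```python
import csv
import io
import json

# Hypothetical "as is" release: a small CSV with illustrative values.
RAW_CSV = """compound,melting_point_c
benzene,5.5
naphthalene,80.3
"""

# Hypothetical namespace; a real release would use a shared vocabulary.
BASE = "http://example.org/"


def csv_to_jsonld(text):
    """Convert CSV rows into a JSON-LD document with one node per row."""
    graph = []
    for row in csv.DictReader(io.StringIO(text)):
        graph.append({
            "@id": BASE + "compound/" + row["compound"],
            "@type": "Compound",
            "name": row["compound"],
            "meltingPointCelsius": float(row["melting_point_c"]),
        })
    return {
        # @vocab maps the plain property names onto the made-up namespace.
        "@context": {"@vocab": BASE + "vocab/"},
        "@graph": graph,
    }


doc = csv_to_jsonld(RAW_CSV)
print(json.dumps(doc, indent=2))
```

The point is the order of operations, not the particular format: the CSV can go out on day one, and the linked-data version can follow once someone has agreed identifiers and property names.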
