I spent the day yesterday in Porto with the COST initiative on sharing HLA data (which exchanges data on human antigen typing, valuable for – say – pharmacogenomics or immunogenetics, including European migration). They’d invited three speakers – 2 on ethics and me on Open Data. It was a great meeting and as always it’s great to see (17) European countries collaborating rather than bombing each other.
I took the opportunity to rehearse ideas of Open Data (as something orthogonal to ethics – e.g. human privacy) that apply to any science. And I stressed we should JUST concentrate on science – not a wider vision of knowledge or creative works.
When I give a presentation like this I ask those present about their understanding of the background. There were about 25, and I asked how many present:
- had published an Open Access paper (3)
- had contributed to an Open Source program (1)
- had heard of Open Street Map (1)
- had used a Creative Commons licence (2)
I am not disappointed by these figures – the group clearly wanted to do the best thing. They wanted to develop interchange standards. They wanted to build open databases. They wanted to create data-only journals. They wanted to mashup population genetics on interactive maps.
I therefore gave a presentation which stressed simplicity. That’s what scientists want. So I presented the Panton Principles (and I am DELIGHTED to see John Wilbanks giving the idea full support):
The idea now is to rework the simple statements so that the principles are trivially understandable (just as anyone can understand the Budapest Open Access declaration immediately). The following words should not occur anywhere:
- licence
- contract
- share-alike
- public domain (because no one knows what it means)
Simply, the Principles should state that scientists wish to donate their data to the world community for any purpose and with no requirement other than attribution. That further use in a domain is regulated by the Community Norms in the domain (which will vary widely). That funders should mandate this. That anyone who offends against the spirit of this (as it is the spirit, not the letter) will have to answer the court of Community Opinion in their domain. That there will be a simple act of stating this intention in an electronic document, hopefully provided by software.
That’s all. We have to craft some words but they should be simple enough to fit in a single paragraph. It won’t be easy, but John, Rufus and others have been discussing this for many months. If you have an insight, join the discussion.
@Physchim62 thanks – are you suggesting this should be added?
I have also drafted the following “Protocol X” for discussion:
All data MUST be obtained in such a way that it can be communicated to other people if necessary, even if, by its nature or by local legal restrictions, it cannot be published.
All data, including “negative” data, MUST be recorded in a permanent form as soon as is reasonably possible. The record SHOULD, as a minimum, include the date on which the data was collected and (if different) the date on which it was recorded.
All data MUST be made available to external assessors, so long as any necessary conditions of confidentiality are met, and subject to local legal restrictions.
Publication of a scientific paper implies making all the data needed to prepare the paper available to external assessors, under conditions of confidentiality where necessary. It also implies making all non-confidential data available to the general public for any use.
It is implicit that scientific practice forbids the publication of partial data (and the concurrent interpretation) when the authors have data which would contradict that interpretation.
I was suggesting that my contribution might be your “one paragraph”. On rereading, I think I mistook your idea, and so I propose Protocol X instead. My two sentences would still serve to mark data which is published according to my “Protocol X”.
@Physchim62 Thanks very much for proposing these. I suggest you post them to the okfn-discuss mailing list. My immediate comment is that you have linked these closely to the act of publication – which is fine; our principles are meant to apply to any openly visible publicly funded data.