We (mainly Cameron Neylon and I) ran a session this morning on Open Data. These are un-sessions which need preparation but not a strict agenda. Certainly not a lecture. So we kicked off by very briefly setting the scene and moved to the Panton Principles on what scientists want to do in publishing data for the benefit of the community.
In very simple terms:
scientists want their data to be available to anyone and re-usable for any purpose without explicit permission.
The only requirement is that the source of the data be acknowledged.
Any further “constraints” are set by community norms in the particular domain. Those might involve human data, need for validation and data integrity, etc. Adherence might be a condition of funding. But they are set by the community, not by the author through a licence.
We’d anticipated that there would be some suggestions that commercial use could be forbidden. In fact there were none and we take great heart from this. We are all convinced that “non-commercial” restrictions (e.g. CC-NC) cause enormous problems. They propagate through the data chain. They are unclear (what is commercial – teaching? Books? It’s impossible to say).
People sometimes say “don’t you risk getting ripped off by someone who takes your Open Source code or Open Data and sells it?” The answer is emphatically NO. The whole of the Blue Obelisk will agree with this stance. To reiterate:
Someone can take my Open Source and incorporate it into a commercial program. I am quite prepared for this to happen. The condition is simply that they must acknowledge the source. They must not pass off the work as their own (I have had this happen and it made me very angry). But commercialisation is – in principle – a good development. It leads to a successful economy – we need the revenue streams. It may convince those who evaluate my work that it has additional merit (it may not, of course). Similarly, if the data is valuable then products may be built on top of it. Again the developer must honour the source of the data. And in all cases there can be no backwards restrictions on the freedom of anyone to use the Open Source and Open Data in whatever directions they wish.
We got hung up a bit on “what is data?”. I think this will work itself out, so long as commercially interested parties are not allowed to draw the line. It’s critical that academia and funders and learned societies (limited to those without financial interests) evolve practices that create workable boundaries.
Of course it will become much easier when everything is Open Access. That’s my personal motivation – I spent too long today discussing with people what is data, because they have to defend their business.
And a splendid surprise. Creative Commons were here and John Wilbanks joined us for lunch. John’s talking in London on 22nd and coming to the Panton the next day. Watch this blog…