John Marks (ESF) introduced our session and set the scene on the need for Open Data and sharing. He stated strongly that it was essential that we had discipline-specific repositories for different branches of science. I share this view and blogged it recently (berlin5 : how to progress Open Data?).
My stance comes from meetings this year where I have talked to many people about institutional repositories. I ask them “why are you setting up an IR?” I have had about 8 distinct answers. Very few of them mention data.
Some of us addressed these issues at ETD2007. There are hundreds of different types of biological data, tens of types of chemistry data, hundreds of types of geoscience data, etc. There is no way that repository managers, with the best will in the world, will know how to manage them all. So I wrote:
although there is quite a lot of activity in institutional digital repositories they won’t (and shouldn’t) address Data. It’s subject-specific and too complex for the average repository manager.
PMR: Dorothea Salo (who has run the Caveat Lector blog for some years and has a strong following) responded to this.
PMR: I haven’t met Dorothea but I’d like to – her blog is insightful and entertaining and she is unafraid to speak out. She’s also technically proficient in the IT skills required – XML, etc. And the last thing I want to do is upset and antagonize people like Dorothea.
But… There is no single human on the planet who knows how to reposit all of protein structures, variable stars, ice sheets and chemical structures. It needs much more than metadata. So what can a repository manager do? Putting the raw data into the repository without understanding it is not an option. It has to go into a system devised by experts in the discipline. And, for me, that means subject repositories. Maybe each university has a different one. Maybe they are national. Some, like the bioscience ones, will be international.