Welcome to a new blog (Research Remix) from Heather Piwowar, currently doing her PhD in Biomedical Informatics at the University of Pittsburgh. Heather is encountering first-hand how hard it is to do her research when she cannot get access to data. So she’s taking a very systematic approach to analysing the problem. Here’s a typical post:
Don’t you love to experiment? Me too.
This blog is an experiment. I’m starting my PhD literature review on the topic of biomedical data sharing and reuse, and thought it would be appropriate to do it out in the open.
Not quite sure how it will work: I’m new to this blogging thing. Please send me suggestions, questions, and especially links to related work.
Thanks, and happy experimenting… with your own data or that of others 🙂
One of the key tools we must have in fighting for Open Data is agreed metrics. That is hard work. It includes much disappointment – in other posts Heather mentions that many researchers don’t reply to requests for data, and many of those that do cannot (or will not) supply it. (To be fair it’s often because it is a lot more work than it might seem – among the first customers for Repositories we often find scientists who have lost their own data!).
It’s also important to realise that this data has cost money. There seems to be an assumption that once the “science” has been published the data are then worthless. That’s usually not true, but even if it were I think it’s useful to enumerate the actual cost of collecting the data. A useful metric is to work out what the data would cost at commercial rates – if a chemistry department generates (say) 500 crystal structures at a commercial cost of (say) USD 3000 each (and that’s probably an underestimate) – that’s 1.5 million dollars. Does it become worthless after publication?
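For anyone who wants to run the same back-of-envelope estimate for their own department, here is a minimal Python sketch. The function name `replacement_cost` is made up for illustration, and the figures are the "say" numbers from the paragraph above, not measured costs.

```python
def replacement_cost(n_datasets: int, unit_cost_usd: float) -> float:
    """Estimate what a set of datasets would cost to produce at commercial rates."""
    return n_datasets * unit_cost_usd


# Illustrative figures only: 500 crystal structures at an assumed USD 3000 each.
total = replacement_cost(500, 3000)
print(f"USD {total:,.0f}")
```

Swapping in a real count of datasets and a quoted commercial price gives a rough lower bound on what the data cost to collect.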
So we need metrics. It’s not exciting, but it’s necessary. I would like to know how many chemistry papers are available under “Open Access/Choice” or whatever name – where the author is invited to pay the publisher so that people can read the article Openly. And I am interested in the publishers’ policies on Open Data – is supplemental data Openly available? This is a sizeable task. But with modern Web 2.0 tools it should be easier to aggregate the responses (or non-responses) from the publishers. Suggestions and offers welcome.