I normally try to blog some of my presentations before the event so that there is at least some sort of record. It also allows for feedback from readers. So: I’m talking on Open Data in Chemistry. I’ve been working very hard to create a new demo of the future that chemistry could have if it wished, and I think it’s working. I believe – not surprisingly – that every publisher of chemistry should look at it carefully, because it helps to change the shape of technical chemical publishing.
The theme is “Open Data”. I’ve recently written a review of this for Elsevier’s Serials Review, and it’s coming out RSN in a special issue on Open Access. It’s already on Nature Precedings. So if you want the detailed aspects, a few months out of date, they are there.
Some bullet points:
- Data are different from text. Open Access generally does not support data well. (I make exceptions for ultra-strong OA such as CC-BY and BBB-compliant, of the sort that PLoS and BMC provide.) Green Open Access is irrelevant to Open Data (I think it makes it harder; others disagree).
- Data matter. Chemistry is a data-rich science. We throw away over 90% of our data. We are all part of the problem, but publishers are one of the worst places for data loss.
- Data must be made available by the authors. It’s now simple to do this. There is no technical excuse for not publishing chemistry data.
- Data must be Open. It’s that simple. It can be done independently of Open Access.
- Data should be semantic. That’s harder but it’s happening. In our group we are producing the next generation of semantic tools for chemistry. I had hoped to announce a very important new project but the legal details haven’t been completely signed off. You’ll see it first on this blog.
- Graduate students can – if they wish – provide semantic theses that can be checked and enhanced by machines, free of many common errors.
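To make “checked by machines” a little more concrete, here is a minimal sketch in Python. It is not our group’s tooling, just an illustration under stated assumptions: the record layout (`formula`, `reported_mass`), the function names and the tolerance are all hypothetical, and the atomic masses are approximate values for a handful of elements. The idea is that once a thesis exposes its data semantically, a machine can recompute a molar mass from the formula and flag a transcription error.

```python
import re

# Approximate standard atomic masses in g/mol; deliberately incomplete (assumption).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06, "Cl": 35.45}


def formula_mass(formula):
    """Molar mass of a simple formula such as 'C6H12O6' (no brackets, no charges)."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject empty input and anything the simple pattern did not consume completely.
    if not tokens or "".join(sym + count for sym, count in tokens) != formula:
        raise ValueError(f"cannot parse formula: {formula!r}")
    total = 0.0
    for sym, count in tokens:
        if sym not in ATOMIC_MASS:
            raise ValueError(f"no atomic mass for element: {sym!r}")
        total += ATOMIC_MASS[sym] * (int(count) if count else 1)
    return total


def check_record(record, tolerance=0.05):
    """True if the reported mass agrees with the mass computed from the formula."""
    computed = formula_mass(record["formula"])
    return abs(computed - record["reported_mass"]) <= tolerance


# Hypothetical records: the second has two digits transposed in the reported mass.
good = {"formula": "C6H12O6", "reported_mass": 180.16}
typo = {"formula": "C6H12O6", "reported_mass": 108.16}
print(check_record(good), check_record(typo))
```

A check like this only works if the formula and the mass are published as data rather than buried in the text of a PDF, which is the point of the bullets above.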
What of the future? I’m not going to talk about business models or the rights and wrongs of Junk Science through government mandates. But I should make it clear that:
- Closed access is harmful to chemical data. That’s a fact, not a political stance. We are 10+ years behind other data-rich sciences because we protect data in archaic silos.
- Publishers have to choose, one way or the other. “Mumble” is no good. Either you are an enthusiastic publisher of Open Data or you are a closed publisher. Your choice.
- The formal aggregators (Chemical Abstracts, Inorganic Crystal Structure Database, Cambridge Crystallographic Data Base) will see their market and importance steadily decline. I predict that in 5 years’ time there will be no role for ICSD in its current form. The CSD may follow. Chem Abs will survive, but in a form marginalised from the main web. Unfortunately, at the moment several publishers (Wiley, Elsevier, Springer) do not expose crystallographic data and instead send it to the data centres, where we have to pay to get it out. This type of restrictive practice harms chemistry – I shall show how – and will be increasingly difficult to defend. When I write to these publishers they simply don’t reply.
So that’s what I intend to say. Roughly. And show some demos of the publication of the future. The datument.
Enthusiastic publishers can make it happen, and chemistry is the best subject. Or you can delay it. You’ll succeed in delaying it, but eventually it will happen.