I’m going to an important and exciting meeting in January which looks at new ways of scholarly communication (https://sites.google.com/site/beyondthepdf/). Unlike some of the communities I interact with, this one has already had a huge amount of discussion (160 messages). Many of these are concerned with the technicalities of formats. They are completely missing the point. (I am going to contribute very shortly, and may well do it through this blog so as to gather more opinion.)
Here’s the REAL problem.
I was reading the PNAS author guidelines and I came across this gem:
Datasets: Supply Excel (.xls), RTF, or PDF files. This file type will be published in raw format and will not be edited or composed.
Did I read those last two file formats correctly? I have actually come across a dataset in supplementary information that was several dozen pages of PDF. It was effectively impossible to extract the data from this document. (I can dig it up if pressed, probably.) I had no idea that the authors may have been encouraged to submit their data like that.
Does a premier scientific journal actually request data to be in PDF format?
I can think of dozens of other formats that would be more fitting. They are summarized here:
What is the scholarly equivalent to a torch and pitchfork march and how can we organize such a march to encourage journals to require proper serialization formats for datasets in supplementary info? [PMR’s emphasis]
The last sentence has it absolutely right. It’s not about formats. It’s about control of the scientific process by organizations outside our control. The very fabric of this mail shows our serfdom.
We do not own our scholarship. The Antaran Stellar Society runs the communication of scholarship for its own gain and that of its officers. The Sirius Cybernetics Library Corporation has copyrighted the Library of the Galaxy cataloguing system. It, too, runs it for itself and its officers. The motto of these organizations is:
The only way forward for scientific publishing is to reclaim it. That’s not easy when scientific societies have sold their journals to Whitehole publishing. Major societies have abandoned their role as stewards of scholarship and turned to maximising income.
What can we do? Currently we use publishers:
- To legitimise our reported work. But do we really need them for that any more?
- To establish priority. But the web (public or private) carries date stamps.
- To moderate and correct our work. Given the appalling state of journal copy-editing and the complete lack of interest in data, this one is on the way out.
- To announce our work. But do we need publishers to do it? This blog reaches more people than the huge amount of effort I put into an invited paper for Serials Research (or something similar) that is behind a paywall and no-one reads. BTW it’s on Nature Precedings.
- To get career, grant, institution brownie points. This seems to be a fact of modern life. But publishing PDFs is a stupid way to run it. My software is evaluated by how many people compile and run it, not by who reads the source code.
- To preserve our work. Journals are not good at this. They destroy data-rich science because it increases their costs and anyway they haven’t a clue what data is. Librarians do understand this.
I except a few publishers from this and there are probably many more. The ones I am familiar with are society and community based, such as Int. Union of Cryst., Eur. Geoscience U, Am. Soc. Biol. Mol. Biol. But most publishers could not care less about the author, the reader or the scientific community.
So where are the pitchforks? Yes – we should and must revolt.
We can now run scientific publishing ourselves. That’s not such a difficult concept – after all it’s what we did when I started science and before the scholarly digital gold rush. And we can do it again.
In the Quixote project (http://quixote.wikispot.org/Front_Page) we are managing all our computational chemistry ourselves: creation of the experiment, calculations, archival, analysis, and bundling for publication. We can write a paper ourselves without the barrier of a plutocratic publisher. We have enough e-presence that the world knows what we are doing and we can spread it. Google will index our work. It will index it BETTER than publishers because we know how to prepare it for machines. Why do we need Table of Contents of Galactic Science? We don’t – we can build crawlers and feeds ourselves, BETTER than anything out there.
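To give a feel for how little is needed for the feed part, here is a minimal sketch in Python that turns a list of records into an Atom feed using only the standard library. The function name build_atom_feed, the record fields and the example.org addresses are all hypothetical, invented for this illustration; this is not Quixote's actual code and it has not been checked against every requirement of the Atom specification.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def _atom(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def build_atom_feed(feed_id, title, author, entries):
    """Return an Atom feed as a string.

    `entries` is a list of dicts with the (hypothetical) keys
    'id', 'title', 'link' and 'updated' (a timezone-aware datetime).
    """
    # Serialise Atom as the default namespace rather than as "ns0:".
    ET.register_namespace("", ATOM_NS)

    feed = ET.Element(_atom("feed"))
    ET.SubElement(feed, _atom("id")).text = feed_id
    ET.SubElement(feed, _atom("title")).text = title
    author_el = ET.SubElement(feed, _atom("author"))
    ET.SubElement(author_el, _atom("name")).text = author

    # The feed-level timestamp is that of the newest entry
    # (or "now" if there are no entries yet).
    newest = max((e["updated"] for e in entries),
                 default=datetime.now(timezone.utc))
    ET.SubElement(feed, _atom("updated")).text = newest.isoformat()

    for e in entries:
        entry = ET.SubElement(feed, _atom("entry"))
        ET.SubElement(entry, _atom("id")).text = e["id"]
        ET.SubElement(entry, _atom("title")).text = e["title"]
        ET.SubElement(entry, _atom("updated")).text = e["updated"].isoformat()
        ET.SubElement(entry, _atom("link"), href=e["link"])

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    # Made-up records standing in for archived calculations.
    records = [
        {
            "id": "urn:example:calc-0001",
            "title": "Geometry optimisation of a test molecule",
            "link": "https://example.org/calc/0001",
            "updated": datetime(2010, 11, 20, 12, 0, tzinfo=timezone.utc),
        },
    ]
    print(build_atom_feed("urn:example:feed", "Our calculations",
                          "A. Scientist", records))
```

Anyone with a web server can publish the resulting file, and any feed reader or crawler can pick it up.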
So it comes down to a single requirement:
An independent body awarding merit for pieces of science. That’s hard. It seems to be necessary. The current commercial and pseudo-commercial publishers do it very badly because their main motivation is not to evaluate work but to brand it to sell journals.
Everything else we can do ourselves.
And we should.
UPDATE: Lots of discussion of all flavours on: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind1011&L=CCP4BB#88