Although repeatability has always been a key part of formal scientific procedure, we are now finding several new tools to help us. In principle we can capture every moment of the scientific process and “replay” it for others. Here is Richard Akerman picking up on my summary of many voices in the (chemical) blogosphere and asking whether we can add better metadata about reproducibility.
the peer review logo
In the session “Reinventing scientific publication (Web 2.0, 3.0, and their impact on science)” led by James Hendler at SciFoo, one of the items was an idea from Geoffrey Bilder for publishers to provide a “peer review logo” that could be attached to, e.g., blog postings (at this point I am interpreting based on my own understanding): some sort of digital signature to indicate peer-reviewed content. (I know the list well since, I’m afraid, my major contribution to the evening, despite having thought about this topic a lot, was transcribing the list.)
2) ID, logoing, review status tag, trust mechanisms
– other peer review status
I wonder if we should make a wiki where we list all of the grand (and not so grand) challenges of web science communication and discovery, and then people can pick off projects. The SciFoo prototypes list is one angle on this. Of course, in the perpetual-beta web world, it’s probably faster to just create a wiki than to try to start a discussion about whether one should be created. It’s in that “just do it” spirit that I’m pleased to find there is already a peer review logo initiative in the works, although the angle is to indicate that you’re writing about a reviewed work, not that your work itself has been reviewed. From Planet SciFoo:
Cognitive Daily – A better way for bloggers to identify peer-reviewed research, by Dave Munger: “[we] have decided to work together to develop such an icon, along with a web site where we can link to bloggers who’ve pledged to use it following the guidelines we develop”
via Bora Zivkovic, via Peter Murray-Rust
(it’s strange and also good to be blogging now about people that I’ve finally met)
UPDATE: I do have a vague idea in a similar space, which would be a “repeatability counter”.
As I have learned more about peer review, I have understood that it has many aspects, but preventing fraud is not one of them. Peer review can help to create a paper that is well-written and has “reasonable” science, but it can’t stop a determined fraudster. (This isn’t my insight, but comes from a presentation I saw by Andrew Mulligan of Elsevier – “Perceptions and Misperceptions – Attitudes to Peer Review”.) What does address fraud, and keep science progressing, is falsifiability: someone else does the experiment and sees if they get the same results. Now I realise there are many different classes of results, but it’s interesting that many of these are not publishable, and may not be captured in the current system:
- We tried to repeat the experiment, but it failed because we didn’t have enough information on the protocol
- We tried to repeat the experiment, but it failed and we think the paper is in error
- We successfully repeated the experiment
- (probably more scenarios I haven’t considered)
So I think it would be interesting to have a sort of “results linking service”: you would click and get links to all the people who had tried to reproduce the results, with an indication of whether or not they succeeded. We use citation count as a sort of proxy for this, but it’s imperfect, not least because there is no semantic tagging of citations, so you don’t know whether a paper was cited for being correct or incorrect. I think this kind of experiment linking might add a lot of value to Open Notebook Science and to protocols reporting (whether in the literature, like Nature Protocols, or in a web system like myExperiment). Otherwise I worry that the sheer amount of raw information in a lab notebook makes it hard to extract much value from it.
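Purely as a sketch of the kind of record such a linking service might keep; every class name, field, and outcome label below is invented for illustration, not taken from any existing system:

```python
# Hypothetical sketch of a "results linking service" record; nothing here
# corresponds to an existing system, and all names are invented.
from collections import Counter
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    REPRODUCED = "successfully repeated"
    FAILED_PROTOCOL = "failed: not enough information on the protocol"
    FAILED_PAPER_ERROR = "failed: we think the paper is in error"

@dataclass
class ReplicationReport:
    target_doi: str    # the paper whose result was re-attempted
    reporter_url: str  # e.g. an Open Notebook Science page describing the attempt
    outcome: Outcome
    notes: str = ""

def repeatability_counter(reports, doi):
    """Summarise all recorded attempts to reproduce a given paper."""
    return Counter(r.outcome for r in reports if r.target_doi == doi)

# Two attempts logged against one fictitious DOI.
reports = [
    ReplicationReport("10.9999/example.123", "http://notebook.example/run1",
                      Outcome.REPRODUCED),
    ReplicationReport("10.9999/example.123", "http://notebook.example/run2",
                      Outcome.FAILED_PROTOCOL,
                      notes="step 4 of the published protocol is ambiguous"),
]
print(repeatability_counter(reports, "10.9999/example.123"))
```

The outcomes map directly onto the result classes listed above, which is what would let a link carry some semantics rather than just a raw count.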
I have also had a similar idea, specifically in computation-based science. Too many papers read as:
- We took the following /molecules/data/ (but we can’t tell you exactly what they are, as they are confidential/licensed from a possessive publisher/in a binary format/etc.)
- We tried all sorts of methods until we found the one that worked best for this particular data set. We don’t bother to tell you about the ones that didn’t work.
- We generated features for machine learning using /softwareX/our magic recipe/a recorded procedure which we modified without telling anyone the details/.
- We used /our own version/an expensive commercial package/ of a /naive Bayes/support vector/Monte Carlo/genetic algorithm/ant colony/adaptive learning/other impressive-sounding/ algorithm.
- We plotted the following graphs/clusters/histograms (in PDF, so you can’t get the data points).
- We compared them with our competitors’ results – and – wow! Ours are better.
There are hundreds of papers like this. They are not repeatable.
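Much of that list could be fixed with a small amount of discipline. As a rough sketch only (all file names, URLs, and values below are made up by me for illustration), a computational paper could ship something like this alongside the PDF:

```python
# Illustrative only: the sort of minimal record a computational paper could
# publish alongside the PDF so the steps above become repeatable.
# All file names, URLs, and numbers are made up.
import csv
import json
import platform
import random

random.seed(42)  # pin the "magic recipe": the run is now deterministic

# 1. Say exactly what data went in and where a reader can obtain it,
#    and admit which methods were tried and discarded.
manifest = {
    "dataset": "training_molecules.sdf",         # hypothetical file
    "source": "http://data.example/molecules",   # hypothetical URL
    "methods_tried": ["naive Bayes", "SVM"],     # including the ones that lost
    "software": {"python": platform.python_version()},
    "random_seed": 42,
}
with open("manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)

# 2. Publish the numbers behind the graph, not just a picture of it.
points = [(x, 0.8 * x + random.random()) for x in range(10)]  # stand-in results
with open("figure1_points.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["x", "y"])
    writer.writerows(points)
```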
So I had a plan to survey a few journals and come up with an index-of-potential-reproducibility. It would indicate what couldn’t be repeated. Things like:
- how easy is it to get the original data (access, format, cost, etc.)
- how easy is it to get the software (cost, platform, installation)
- are the plotted data available?
That’s a simple index to compute. I expect the idea holds for many fields (substitute mouse for code, etc.). For more experimental fields, the recording-based ideas of JoVE: Journal of Visualized Experiments and Useful Chemistry are obviously valuable. For software we just need a “push to re-run” button.
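To show just how simple, here is a toy scoring of the index; the three criteria are the ones in the list above, while the 0-to-1 scores and the equal weighting are arbitrary choices of mine:

```python
# Toy scoring of the index-of-potential-reproducibility sketched above.
# The three criteria come from the list; the 0-to-1 scores and the equal
# weighting are arbitrary choices for illustration only.

CRITERIA = ("data_accessible", "software_obtainable", "plotted_data_available")

def reproducibility_index(scores):
    """Mean of per-criterion scores in [0, 1]; 1.0 means repeatable on paper."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)

# A fictitious paper: open data, commercial software, figures only as PDF.
paper = {"data_accessible": 1.0,
         "software_obtainable": 0.5,
         "plotted_data_available": 0.0}
print(round(reproducibility_index(paper), 2))  # 0.5
```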
There was someone at the conference (once again I wish I had been better at catching names and websites) who said they already had a site where they required(?) open data and accompanying software, so that you could indeed “push to re-run”.