Dictated into Arcturus
I'm waiting until the rain stops before I cycle in to work, so here is an update on The Green Chain Reaction. It's going incredibly well. The energy of those who have already volunteered is enormous, and so is the speed with which they've picked up the ideas and the competence and initiative that they have shown. Absolutely incredible. An important byproduct of this experiment is to show how universal the ideas of collaboration are, and how the tools have developed over the last few years so that they are straightforward to use.
Mark and Dan have made spectacular progress on the code. We have produced a system in-house which we use regularly for downloading and analysing publicly visible and Open scholarly material. We have a build system which involves about 30 different projects and libraries and is quite complex. I should pay tribute to Jim Downing, Sam Adams, Nick Day, and others in our group for having set this up. I don't think we could have done it without the infrastructure that we have built, and we'd be delighted to talk with other people who are interested in managing large and varied amounts of scholarly information. What is really exciting is that Mark and Dan have understood what we're doing, have written some very nice documentation on the science online wiki, and have robustified the procedures. Because they are working on other systems, including other operating systems, this is an excellent test of portability. Quite shortly we should be able to create a package which many of you will be able to use in this project.
Heather Piwowar has made wonderful progress on the IsItOpenData resource. We are shortly going to be creating template letters to send to those people who expose data on the web. We will be sending these to the owners of most of the sources. It may seem strange to enquire from an open access publisher whether their data is open, but this letter will give us a chance to thank them and also an opportunity for them to respond to the project. So if you're a publisher, or a site, which exposes open data then don't be surprised if you get a request asking IsItOpen. And thank you.
Mat Todd and Jean-Claude Bradley have already posted their notebooks and Lab reports. I'll be looking at these in detail later this morning, I hope. Mat has also posted a list of potential sources of Open chemistry, and we will be using these where it is clear that they are open. Some of those sources are not explicitly open, and Heather and I (and anyone else who wishes to help out) will be composing letters and sending them through IsItOpen. We hope that we will get a quick enough response to allow us to use these sources for the project, and if so this will be fantastic.
Because these volunteers have made such rapid progress, much of this is in a state where you can join in. This is part of the purpose of the experiment: allowing anyone to get a feel for what data-rich science is like. So, for example, you will be able to install the package for downloading and analysing Open Data.
ONE CAVEAT: SOME OF THE TOOLS CARRY OUT AUTOMATIC DOWNLOADING. WHERE POSSIBLE WE WISH TO DO THIS ONLY ONCE, TO AVOID DENIAL-OF-SERVICE AND MESSING UP ACCESS COUNTS. SO WE WOULD HOPE TO CACHE DATA AFTER IT HAS BEEN DOWNLOADED. ANY SUGGESTIONS ON HOW AND WHERE WE CAN DO THIS (AND OFFERS OF HELP) WOULD BE WELCOMED. NOTE THAT THE DATA WILL ALL BE EXPLICITLY OKD-COMPLIANT, SO IT CANNOT BE SERVED FROM A LESS-THAN-OPEN REPOSITORY.
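To make the "download only once" idea concrete, here is a minimal sketch in Python of a local cache. It is not the code Mark and Dan are working on. The function names (`fetch_cached`, `_download`) and the cache layout (one file per URL, named by a hash of the URL) are my own assumptions for illustration. It only covers the "how"; where a shared cache should live is still an open question.

```python
import hashlib
import urllib.request
from pathlib import Path


def _download(url):
    """Fetch a URL over the network. This is the only place a request is made."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_cached(url, cache_dir, downloader=_download):
    """Return the content at `url`, contacting the source at most once.

    Hypothetical helper: the first call downloads and stores the bytes under
    `cache_dir`; later calls for the same URL read the stored copy instead.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Name the cache file by a hash of the URL so any URL maps to a safe filename.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cached_file = cache_dir / key

    if cached_file.exists():
        # Already downloaded: no request, so no extra load or access count.
        return cached_file.read_bytes()

    data = downloader(url)

    # Write to a temporary name first so an interrupted run never leaves
    # a half-written file that looks like a valid cache entry.
    partial = cached_file.with_suffix(".part")
    partial.write_bytes(data)
    partial.replace(cached_file)
    return data
```

Calling `fetch_cached(url, "cache")` twice with the same URL should hit the publisher's server only the first time. The `downloader` argument is there so the cache can be tried out with a stand-in function and no network access at all.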
And, yes, I goofed on the tag. It should be #solo10