I’m currently at a meeting on computational chemistry (http://neptuno.unizar.es/events/qcdatabases2010/program.html) where we are looking at how to store, search and disseminate our results. The problem is a very general one:
A community creates results and wants to make the raw results available under Open licence on the web. The results don’t all have to be in the same place. Value can be added later.
One solution is to publish this as supplemental data for publications. (The crystallographers require this and it has worked for 30 years.) But the comp chem people have somewhat larger results – perhaps 1–100 TB per year. And they don’t want the hassle (particularly in the US) of hosting it themselves, because they are worried about security (being hacked).
So where can we find a few terabytes of storage? Can university repositories provide this? Would they host data from other universities? Could domain-specific repositories (e.g. Tranche, Dryad) manage this scale of data?
Last time I asked for help on this blog I got no replies, and we had to build our own virtual machine and run a webserver. We shouldn’t have to do this. Surely there is a general academic solution – or do we have to buy resources from Amazon? If so, how much does it cost per TB-year?
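For anyone wanting to answer the TB-year question, the arithmetic is simple enough to sketch. Cloud providers typically bill storage per GB per month, so the conversion is just GB × months. The price used below is a made-up placeholder, not a quoted Amazon rate – plug in the current S3 price list before relying on the numbers.

```python
# Back-of-envelope conversion from per-GB-month pricing to per-TB-year cost.
# The $0.15/GB-month figure is an illustrative assumption, NOT a real price.

GB_PER_TB = 1024
MONTHS_PER_YEAR = 12

def cost_per_tb_year(price_per_gb_month: float) -> float:
    """Cost of holding one terabyte for one year, same currency as input."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR

# One TB for a year at the hypothetical rate:
print(f"${cost_per_tb_year(0.15):,.2f} per TB-year")
# A community producing 100 TB/year would pay, for the first year alone:
print(f"${100 * cost_per_tb_year(0.15):,.2f} for 100 TB")
```

Note that this only covers storage at rest; real bills also include data-transfer and request charges, which matter for an openly downloadable archive.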
If we can solve this simple problem then we can make rapid advances in comp chem.
Simple web pages, no repository, no RDB, no nothing.
Paul Miller has tweeted a really exciting possibility:
At first sight this looks very much like what we want. It’s public, it draws the community together, and it’s Open. Any downside?