Deepak Singh highlights one of the emerging approaches to global data, Freebase. Recall that at SciFoo we also heard about Google’s offer to host scientific data:
2 Responses to “Freebase at Scifoo”
- 1 Aug 12th, 2007 at 4:56 pm
A lot of the Semantic Web vision is based on exactly what you are asking for: something like MetaWeb, but open and distributed. It is like the difference between a great ebook and the Web: each has its place, but an open, distributed store as a way of linking things seems to be important. Check out the W3C’s Semantic Web Activity (http://www.w3.org/2001/sw).
I am attracted by Freebase/Metaweb and also DBpedia/OpenLink. These are technologies which build ontologically supported repositories where large amounts of metadata can be centrally stored. I talked with some of the people involved at the WWW2007 meeting and some of them have the vision of vast central stores of metadata – loosely tera-triplestores or larger. I think that technology now allows this.
However, I also picked up on this centralist approach. There was also a view that the whole of the world’s information could be given unique IDs. This won’t work generally, as there are many concepts which are important but too fuzzy to label. Copies, containers, addresses, versions etc. all cause major problems.
And I think Deepak is right for bioscience – it can’t be centralised and the semantic web has to be distributed.
But chemistry is smaller. I have already suggested that a year’s core information on new published compounds could be squeezed into a few terabytes. Not everything, perhaps, but enough to make it worthwhile. And, in chemistry, most concepts can be given unique labels. So, as always, it’s discipline-dependent.
Did I mention that such a repository has to be completely Open Data?
The differences between fields were evident at SciFoo. It appears that some fields like cosmology require a great deal of time to analyze raw data, while in organic synthesis, once we are able to visualize the raw data in the form of spectra, we can quickly come to a decision about the support for the claim that a certain compound was isolated. My guess is that what is expected of the peer review system will vary greatly between fields as well.