Mendeley, Scopus, Talis – will you be making your data Open?

Scraped/typed into Arcturus

Ian Mulvany (VP New Product Development, Mendeley)

Has blogged about the excitement of connecting scientific data: (http://directedgraph.net/2010/08/27/connecting-scientific-data/ )


I’m going to be hosting a session at Science Online London next weekend, and I’m excited. I’ve been interested in the issues of connecting scientific data for a long time. In the last six months I’ve become particularly excited about the potential of web-based tools like Yahoo Query Language. I was hoping to talk a little about that, but I’ve been lucky to get some amazing people to come and share their experiences about linking data, so I’m going to cede the floor to them. I might be able to get some YQL hackery into one of the unconference slots that will be knocking around. Science Online is shaping up to be a pretty awesome event, and you can check out the conference program to see what you will be missing out on!

Here is the spiel and speaker bios for the section that I’m going to be running:

Connecting Scientific Resources

Do you have data? Have you decided that you want to publish that data in a friendly way? Then this session is for you. Allowing your data to be linked to other data sets is an obvious way to make your data more useful, and to contribute back to the data community that you are a part of, but the mechanics of how you do that are not always so clear cut. This session will discuss just that. With experts from the publishing world, the linked data community, and scientific data services, this is a unique opportunity to get an insight into how to create linked scientific data, and what you can do with it once you have created it.

The other speakers are:

Michael Habib, Product Manager, Scopus UX + Workflow

Richard Wallis, Technology Evangelist, Talis

Chris Taylor, Senior Software Engineer for Proteomics Service, EBI

This looks like a very exciting session, and I’ll be going. Linking scientific data will transform the way we do science. Tony Hey and colleagues have published “The Fourth Paradigm” about data-driven science (some people call it discovery science), where the analysis of the data, especially from many fields, comes up with radical new insights.

There’s only one requirement.

The data must be Open (ideally they should be semantic as well). Open as in libre. Free as in speech, not as in beer. Compliant with the Open Knowledge Definition (http://www.opendefinition.org/ ). And if the data are provided through a service, that must also be Open (http://www.opendefinition.org/ossd/ ).

This is not a religious stance. Pragmatically, Open is the only way that linked science data will work. It must be possible to access complete datasets, not just a query interface. We must know that, in principle, we have the absolute right to download and re-purpose the data. “Freely available” is not good enough (people can be sued for re-using free services).

There are problems. Data management costs money even if the content is free. Traditionally this has been a user-pays model. But this cannot work for Linked Open Science. The data have to be as free as the text and images in CC-BY articles. Freely re-usable. Freely resellable.

And it’s true that companies are often better at managing data than academic researchers. They can invest in people and processes that are not dedicated to the holy grail of getting citations.

But how can a company create an income stream from Open Scientific content?

That’s the question for me for this decade. If we can solve it we can transform the world. If, however, the linked Open data are all going to be behind paywalls, portals and query engines, then we regress into the feudal information possession of the past. I hope the companies presenting in this session can help solve this. It won’t be easy but it has to be done.

So I now ask Mendeley, Elsevier/Scopus, Talis:

Are your data Openly available for re-use?

I asked this of Elsevier about a year ago, when they promoted “mash up everything” in a public session on text-mining for science. I asked if we could mine their journal content for DATA and make it Openly available. They informally said yes. Informal isn’t good enough in a world where lawyers abound, and I’m still following this up. That’s why we’ve set up the OKF’s isItOpen service.

I hope enough publishers of information can see the scope for Open Knowledge.

Then the world really will change.
