Mendeley API Binary Battle: IsItOpenData?

Mendeley have invited me to enter their “API Binary Battle” and win 10,001 USD.

http://dev.mendeley.com/api-binary-battle

Here’s the blurb:

Build an application with our data, make science more open and win $10,001!

What’s it all about?

At Mendeley we love science. We also love tech. And we’ve built the world’s largest crowdsourced research database, with 70 million documents, usage statistics and reader demographics, social tags, and related research recommendations, all available under a Creative Commons license.

We want to see a world in which science is mashed up… with anything. So, we are really excited to announce the Mendeley API Binary Battle. For you, this means: Build an application with this data, make science more open, win $10,001!

This could be very exciting. If we can really have a database of bibliographic data for 70 million documents AND if that is truly Open then it’s a major step forward. Of course it depends on what the documents are and what the quality of the data is, but that’s minor.

My excitement will depend on the answers to the following questions, which I put to Mendeley last year but to which I have not yet had a reply:

What is the data?

And is it OKD-compliant?
http://www.opendefinition.org/
“A piece of content or data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.”.

The positive indications are the use of “open” several times (though, if not defined, “open” is about as useful as “healthy”), the mention of “a Creative Commons licence” (though not all of these are OKD-compliant), and the presence of John Wilbanks on the team.
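To make the point about Creative Commons licences concrete, here is a minimal sketch in Python. It assumes licences are named by the usual short codes (for example "CC-BY" or "CC-BY-NC-SA"); the function name `is_okd_compliant` is my own invention for illustration, not part of any OKF or Creative Commons tool. The rule it encodes is that NonCommercial (NC) and NoDerivatives (ND) terms go beyond "attribute and share-alike" and so fail the Open Definition.

```python
# Licence elements that restrict use or reuse beyond what the Open
# Definition allows: "nc" = NonCommercial, "nd" = NoDerivatives.
NON_OPEN_ELEMENTS = {"nc", "nd"}


def is_okd_compliant(licence_code: str) -> bool:
    """Return True if a CC short code imposes at most attribution
    and share-alike (hypothetical helper, for illustration only)."""
    code = licence_code.strip().lower()
    if code == "cc0":
        # Public-domain dedication: no conditions at all.
        return True
    parts = code.split("-")
    if parts[0] != "cc" or len(parts) < 2:
        raise ValueError(f"not a Creative Commons short code: {licence_code!r}")
    # Open only if none of the restricting elements is present.
    return NON_OPEN_ELEMENTS.isdisjoint(parts[1:])


for code in ["CC0", "CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-ND", "CC-BY-NC-SA"]:
    print(code, is_okd_compliant(code))
```

So a bare statement that data is under “a Creative Commons licence” does not settle the question; it matters which one.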

My acid test is: can I systematically download all of the data in the Mendeley database, transform it to my own format, and re-use and redistribute it, subject only to acknowledging that it came from Mendeley?

If so this is a valuable resource.
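In code, the acid test amounts to something like the sketch below. Everything in it is an assumption made for illustration: `fetch_page` stands in for whatever paged API call would return a batch of records (I am not describing the real Mendeley API, and no real endpoint is used), and `to_my_format` is a made-up transformation. The test is simply whether the licence lets me run a loop like this over the whole database and redistribute the result.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]


def harvest(fetch_page: Callable[[int], List[Record]]) -> Iterator[Record]:
    """Walk through every page until an empty page signals the end."""
    page = 0
    while True:
        records = fetch_page(page)
        if not records:
            return
        yield from records
        page += 1


def to_my_format(record: Record, source: str = "Mendeley") -> Record:
    """Convert a record to my own format, keeping the attribution."""
    return {
        "title": record.get("title"),
        "year": record.get("year"),
        "attribution": f"Data from {source}",
    }


def fake_fetch_page(page: int) -> List[Record]:
    """Stand-in for a real API call: two pages of invented records."""
    pages = [
        [{"title": "Paper A", "year": 2009}],
        [{"title": "Paper B", "year": 2010}],
    ]
    return pages[page] if page < len(pages) else []


converted = [to_my_format(r) for r in harvest(fake_fetch_page)]
print(converted)
```

If any step in that loop (bulk download, transformation, redistribution) is forbidden or impractical, the data fails the test.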

I should really be asking this through the OKF IsItOpenData service. So I will, and I will report back on what the service is for and what it looks like.

4 Responses to Mendeley API Binary Battle: IsItOpenData?

  1. Jason Hoyt says:

    Hi Peter,
    All very good and valid questions raised there. To answer your acid test: yes, you could download and mash up our data set in any way you see fit. It is currently covered by a CC-BY license.
    As you pointed out with the addition of John Wilbanks from Creative Commons to the API judging panel, we are very serious about making data accessible and agreeable to all parties.
    That said, currently there is no bulk data dump option available to all. That option is available to academic researchers who want to work closely with us. The current process of using API methods is the more appropriate tool for developers desiring to build various applications for this contest.
    We see the creation and usage of basic developer-friendly APIs as one of the key solutions to making science more open and more digestible by the general public. Large, raw data sets can serve a different purpose.
    For serious research, i.e. not consumer-facing apps to make science more accessible, we currently have a data set suited for collaborative filtering algorithm development (http://dev.mendeley.com/datachallenge/). We are also working on a few other large data sets that would be suited to other types of algorithm development and general research.
    We will also be taking in feedback from all relevant stakeholders, including yourself, as we go forward in our agenda of making science more open.
    Best,
    Jason

    Jason Hoyt, PhD
    Chief Scientist and VP for R&D
    Mendeley

  2. Euan says:

    IANAL, but the data includes (or did last time I checked) many, many abstracts that definitely haven’t been licensed for use in this way, scraped from PDFs or online sources of metadata like PubMed.
    Though
    1) I’m sure most publishers look favourably on their article metadata being spread around as many ways as possible (I’m speaking personally, not for NPG my employer whom I do not represent in this matter in any way)
    2) Abstracts do exist in a kind of weird grey area where nobody is sure exactly what’s fair use and what isn’t, and some people seem to believe that they’re public domain
    … it *doesn’t* seem to me that this means that anybody can package them up with a bunch of homegrown content under the same CC-BY license and say that the whole thing is “open”.
    Obvious example: some publishers sell their abstracts and associated metadata to commercial literature databases. The current Mendeley API license implies to me that I could put together my own, identical dataset with the same content from that source and sell it for half the price, thus cutting the original publisher out of the loop. This makes me think that a blanket one-line “all data is made available under CC-BY” is insufficient.
    At the very least the attribution for abstracts should be to the copyright holders – preferably the authors, otherwise the publisher – not Mendeley (Again, IANAL, I may be wrong. If somebody wants to tell me so I’ll be very happy).
    I’ve mentioned this issue on Twitter a couple of times and know that Mendeley are perfectly aware of it, but haven’t ever had a response and nothing has ever changed on the license page. Jason…? Just having somebody say “we’ve checked it out with our lawyers and it’s all fine” would be good to hear. If it is then I’m off to build my own abstract dataset to sell for $$$. 😉
    The Mendeley API is awesome and the intentions noble, but you can’t cut legal corners. It won’t do anybody good in the long run (at some point it’ll become a problem) and, at worst, could potentially land people who’ve used the API in legal trouble with the *actual* intellectual property owners.
    I’d like to see a little extra effort in living up to the “open data” claim by securing the relevant permissions from copyright holders, or clarifying exactly what attribution should be used for what, or separating out abstracts to be delivered under a different license… whatever would work.
    Alternatively John Wilbanks saying “it’s not a problem because x” would work for me too.

  3. Pingback: Unilever Centre for Molecular Informatics, Cambridge - Mendeley (and other Bib Data): WHAT is Open? « petermr's blog
