[Quote in title is from Mark Hahnel, see below]
I have been meaning to write on this theme for some time, and more generally on DigitalScience's growing influence in parts of the academic infrastructure. This post is sparked by a Twitter exchange (follow backwards from https://twitter.com/petermurrayrust/status/591197043579813888 ) in the last few hours, which addresses the question of whether "Figshare is Open".
This is not an easy question and I will try to be objective. First let me say - as I have said in public - that I have huge respect and admiration for how Mark Hahnel created Figshare while a PhD student. It's a great idea and I am delighted - in the abstract - that it gained so much traction so rapidly.
Mark and I have discussed issues of Figshare on more than one occasion and he's done me the honour of creating a "Peter Murray-Rust" slide (http://www.slideshare.net/repofringe/figshare-repository-fringe-2013 ) where he addresses some (but not all) of my concerns about Figshare after its "acquisition" by Macmillan Digital Science (I use this term, although there are rumours of a demerger or merger). I put "acquisition" in quotes because I have no knowledge of the formal position of Figshare as a legal entity (I assume it *is* one? See the Figshare FAQs) and that's one of the questions to be addressed here.
From the FAQs:
figshare is an independent body that receives support from Digital Science. "Digital Science's relationship with figshare represents the first of its kind in the company's history: a community based, open science project that will retain its autonomy whilst receiving support from the division."
However http://www.digital-science.com/products/ lists Figshare among "our products" and brands it as if it is a DigitalScience division or company. Figshare appears to have no corporate address other than Macmillan and I assume trades through them.
So this post has been catalysed by a tweet reporting a statement by a DS employee(?), Dan Valen.
John Hammersley @DrHammersley tweeted:
Such a key message: "APIs are essential (for #opendata and #openscience)" - Dan Valen of @figshare at #shakingitup15 pic.twitter.com/HDyYEaXJRn
This generated a Twitter exchange about whether APIs were or were not essential. I shan't explore that in detail, but my primary point is that:
If the only access to data is through a controlled API, then the data as a whole cannot be open, regardless of the openness of individual components.
There is no doubt that some traditional publishers see APIs as a way of enforcing control over the user community. Readers will remember that I had a robust discussion with Gemma Hirsh of Elsevier, who stated that I could not legally mine Elsevier's data without going through their API. She was wrong, categorically wrong, but it was clear that she and Elsevier saw, and probably still see, APIs as a control mechanism. Note that Elsevier's Mendeley never exposed their whole data - only an API.
An API is the software contract with a webserver offering a defined service. It is often accompanied by a legal contract for the user (with some reciprocity). The definition of that service is completely in the hands of the provider. The control of that service is entirely in the hands of the provider. This leads to the following technical possibilities:
- control: The provider can decide what to offer, when, to whom, and on what basis. They can vary this by date, geography or IP of user, and I have no doubt that many publishers do exactly this. In particular, there is no guarantee that the user is able to see the whole data and no guarantee that it is not modified in some way from the "original". This is not, per se, reprehensible but it is a strong technical likelihood.
- monitoring: ("snooping") The provider can monitor all incoming traffic: IP addresses, dwell times, number of revisits, quite apart from any cached information. I believe that a smart webserver, when coupled to other data about individuals, can deduce who the user is, where they are calling from and, with the sale of information between companies, what they have been doing elsewhere.
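To make these two possibilities concrete, here is a minimal sketch in Python of a toy provider. Everything in it is hypothetical and invented for illustration: the class `ToyApi`, the record ids, the country code "XX" and the IP addresses. It does not describe Figshare's API or that of any publisher; it only shows how little code is needed for a provider to decide what each caller sees and to keep a record of who asked for what.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical "whole data" held by the provider. Users never see this
# directly; they only see what the API chooses to return.
FULL_DATASET = {
    "rec-1": "open record",
    "rec-2": "withheld record",
    "rec-3": "another open record",
}


@dataclass
class ToyApi:
    """A toy provider that controls and monitors every request."""

    # Assumed policy settings, chosen only for illustration.
    blocked_countries: set = field(default_factory=lambda: {"XX"})
    hidden_records: set = field(default_factory=lambda: {"rec-2"})
    access_log: list = field(default_factory=list)

    def get(self, record_id: str, client_ip: str, country: str):
        # Monitoring: every request is logged before any decision is made.
        self.access_log.append((client_ip, country, record_id))
        # Control: the provider can refuse callers by geography ...
        if country in self.blocked_countries:
            return None
        # ... or silently withhold part of the data from everyone.
        if record_id in self.hidden_records:
            return None
        return FULL_DATASET.get(record_id)

    def requests_per_ip(self) -> Counter:
        # Count how many requests each IP address has made.
        return Counter(ip for ip, _country, _record in self.access_log)


if __name__ == "__main__":
    api = ToyApi()
    print(api.get("rec-1", "10.0.0.1", "GB"))
    print(api.get("rec-2", "10.0.0.1", "GB"))
    print(api.get("rec-1", "10.0.0.2", "XX"))
    print(api.requests_per_ip())
```

From the caller's side a withheld record and a blocked caller look the same as a record that does not exist, so the user cannot tell whether they are seeing the whole data. The provider, meanwhile, has a complete log of requests.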
By default companies will do both of these. They could lead to increased revenue (e.g. Figshare could sell user data to other organizations) and increased lock-in of users. Because Figshare is one of several Digital Science products (DS words, not mine) they could know about a user's publication record, their altmetric activity, what manuscripts they are writing, what they have submitted to the REF, what they are reading in their browser, etc. I am not asserting this is happening but I have no evidence it is not.
Mark says, in his slides,
"it is not just about open or closed, it is about control"
and I agree. But for me the question is who controls Figshare? and is Figshare controlling us?
Figshare appears to be one of the less transparent organizations I have encountered. I cannot find a corporate structure, and the company's address is:
C/o Macmillan Publishers Limited, Brunel Road, Basingstoke, Hampshire, RG21 6XS
I can't find a board of directors or any advisory or governing board. So in practice Figshare is legally responsible to no-one other than UK corporate law.
You may think I am being unfair to an excellent (and I agree it's excellent) service. But history inexorably shows that these beginnings become closed, mutating into commercial control and confidentiality. Let's say Mark moves on? Who runs Figshare then? Or Springer buys Digital Science? What contract has Mark signed with DS? Maybe it binds Figshare to being completely run by the purchaser?
I have additional concerns about the growing influence of DigitalScience products, especially such as ReadCube, which amplify the potential for "snoop and control" - I'll leave those to another blogpost.
Mark has been good enough to answer some of my original concerns, so here are some others to which I think an "open" ("community-based") organization should be able to provide answers.
- Who owns Figshare?
- Who runs Figshare?
- Is there any governance process from outside Macmillan/DS? An advisory board?
- How tightly bound is Figshare into Macmillan/DS? Could Figshare walk away tomorrow?
- What could and what would happen to Figshare if Mark Hahnel left?
- What could and what would happen to Figshare if either/both of Macmillan / DS were acquired?
- Where are the company accounts for the last trading year?
- How, in practice, is Figshare "a community based, open science project that will retain its autonomy whilst receiving support from the (DS) division."?
I very much hope that the answers will allay any concerns I may have had.