It is becoming critical that we (everyone?) define what is meant by "open API" and what it means operationally. This post introduces the problem – the next will suggest some ways forward.
Why does this matter? Isn't "open" an indication of goodwill towards others? A general philosophy that we'd like to share things and work together? That things should be free?
The problem is that "open" is used in so many contexts, often without thought, that it can become almost meaningless. And if you take it lightly it will cost you money or land you in court.
What does "Open Access" mean? I am sure all readers know.
Except they don't. If you are asked to pay 3000 USD as an author of an Open Access scholarly article, what are you getting? And what are you offering to the rest of the world? Often it is seriously unclear. Why pay 3000 USD if you can post your article as "Green Open Access"? Are you allowed to post your article? Can you re-use it?
In fact, don't you have to read the small print of every single publisher (if you can find it, which I usually cannot)? And make sure that what you do isn't going to land you in court? Yes, I'm serious. If you post a single image from a Wiley journal you are still in danger of being sued or having your subscription cut off (http://scienceblogs.com/retrospectacle/2007/04/when_fair_use_isnt_fair_1.php ). Claiming that you had some nebulous "open" right or "fair use" isn't going to remove the lawyers. Wiley still require you to ask permission to re-use "their" material (even if you wrote it or drew the pictures).
In short I believe "Open" is only useful as an operational term if it is clearly defined as something that frees us from the threat of lawyers.
Many people use "open" like Humpty Dumpty uses "glory":
"There's glory for you!"
"I don't know what you mean by 'glory,' " Alice said.
Humpty Dumpty smiled contemptuously. "Of course you don't—till I tell you. I meant 'there's a nice knock-down argument for you!' "
"But 'glory' doesn't mean 'a nice knock-down argument,' " Alice objected.
"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean—neither more nor less."
Here's a conversation I had with a vendor of information systems about 2 years ago at a JISC meeting:
V: "We have an Open API" (implying this was a GOOD THING)
Me: "Can you let me have a copy of the spec?"
V: "No, it's confidential to customers"
Me: "If I purchased your system could I share the API with others?"
V: "No, that's a breach of contract" (i.e. I might be sued).
I have had this conversation with other vendors. When I questioned them I was told that I had a different idea of Open from them. (True). This use of "open" seems to be as useful as "healthy". The most charitable interpretation is that they have actually documented their API. "open" is frequently a marketing word, or a word to make you feel good (about the "open"ers) or just fuzz to show the heart is in the right place.
And as such I shall replace it by Humpty's word, "glorious".
V: We have a glorious API.
Me: no quibble. Meaningless marketspeak but I'm used to that
So whenever you hear "open", substitute "glorious" and see if you have lost any information.
Open Source does this well. I know that Sourceforge, Outercurve, Apache, Bitbucket, Git contain Open Source programs. And if I look this up on OSI I find: http://www.opensource.org/docs/osd
Open source doesn't just mean access to the source code. The distribution terms of open-source software must comply with the following criteria. They are simple (fit on a page) and crystal clear to English speakers (and have of course been translated). I'm just giving the headings here, but READ them.
1. Free Redistribution
2. Source Code The program must include source code, and must allow distribution in source code
3. Derived Works The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
4. Integrity of The Author's Source Code
5. No Discrimination Against Persons or Groups
6. No Discrimination Against Fields of Endeavor
7. Distribution of License The rights attached to the program must apply to all to whom the program is redistributed
8. License Must Not Be Specific to a Product
9. License Must Not Restrict Other Software
10. License Must Be Technology-Neutral
This doesn't stop you running a business on Open Source (Redhat, Kitware ++). Or having moral control – as long as you can exercise it through e-charisma. But, in principle and usually in practice, anyone has the right to copy and fork your code. It may be frowned upon, but it will not bring the lawyers.
Whereas if you fork copyrighted material, even if it is "freely" available on the web and even if it was not created by the copyright holder, lock the door or leave the country. (The original idea that academics signed over their copyright to publishers so that publishers could protect academics from pirates seems tragically distant now. Publishers "own" OUR material for their own ends).
So the Open Access declaration (I use Budapest http://www.soros.org/openaccess/read ) had the same noble principles:
By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.
These are great principles, and COULD have been crafted into a legal framework that ensured that readers could re-use Open material without fear. But in practice the community did not address this, with the result that until recently no one knew what "open access" meant in practice. If I post a "green open access" copy of a "publisher's article" and I re-use this for any purpose I can still be sued. There is no legal gift, no legal guarantee.
The major progress in this has been the emergence of "Open Access" publishers. These are, in the main, characterised by their use of CC-BY licences: documents which EXPLICITLY give the reader/user rights. With Open Access publishers you can sleep soundly.
Note that if a document is not completely Open its status is effectively closed in legal terms. This is not a quibble – ask the lawyers when they come after you.
Sadly, Institutional Repositories have almost completely failed in promoting Open Access. Almost no content carries explicit rights, and without those rights you can only assume that the content is closed. If you doubt this, try to find more than 5% of the content of any IR that is explicitly marked as Open/CC-BY. And how did you search for it? By hand, as repositories generally don't provide search-by-legal-rights. So almost all content in IRs is "glorious".
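To make the missing feature concrete, here is a minimal sketch in Python of what search-by-legal-rights could look like. The record layout, the "rights" field and the sample entries are all hypothetical, invented for illustration; no actual repository software or API is implied.

```python
# Hypothetical repository records. The "rights" field is an assumption:
# the point of the post is that most real items carry no such marking.
records = [
    {"id": "item-1", "title": "Thesis A", "rights": "CC-BY"},
    {"id": "item-2", "title": "Preprint B", "rights": None},
    {"id": "item-3", "title": "Article C", "rights": "All rights reserved"},
]


def search_by_rights(records, licence):
    """Return only the records explicitly marked with the given licence."""
    return [r for r in records if r.get("rights") == licence]


def explicit_fraction(records, licence):
    """Fraction of records that explicitly carry the given licence."""
    if not records:
        return 0.0
    return len(search_by_rights(records, licence)) / len(records)
```

With something like this, the "more than 5%" challenge becomes a one-line query, explicit_fraction(records, "CC-BY") > 0.05, rather than a search by hand.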
The Open Knowledge Foundation has defined "Open Knowledge" very clearly (http://www.opendefinition.org/ ):
"A piece of content or data is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.".
AND the OKF has spent time cataloguing those licences which are OKD-compliant and those which are not. For example, CC-BY is compliant and CC-NC is not. OSI licences are compliant. A document without a licence is not, de facto, compliant. So if something is OKD-compliant, you can sleep. Otherwise you can't.
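The "can I sleep?" test can be written down as a simple lookup. This is only a sketch in Python: the two sets contain just the examples named in this post, the function names are made up, and a real check would consult the OKF's own catalogue of licences rather than a hard-coded list.

```python
# Only the examples given in the post; NOT the OKF's full catalogue.
OKD_COMPLIANT = {"CC-BY"}
NOT_OKD_COMPLIANT = {"CC-NC"}


def okd_status(licence):
    """Classify a licence label as 'compliant', 'not compliant' or 'unknown'.

    A missing licence (None or empty string) counts as not compliant.
    """
    if not licence:
        return "not compliant"
    name = licence.strip().upper()
    if name in OKD_COMPLIANT:
        return "compliant"
    if name in NOT_OKD_COMPLIANT:
        return "not compliant"
    return "unknown"


def can_sleep(licence):
    """True only when the licence is explicitly known to be OKD-compliant."""
    return okd_status(licence) == "compliant"
```

Note the deliberate asymmetry: an unknown licence does not let you sleep any more than a missing one does.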
All of this leads to a recent discussion (http://lists.okfn.org/pipermail/open-bibliography/2011-September/thread.html ) on OKF's Open-bibliography list about the use of "Open" (http://lists.okfn.org/pipermail/open-bibliography/2011-September/001141.html).
David Weinberger <self at evident.com> wrote:
> LibraryCloud is a metadata server gathering
> library metadata (circ data, user reviews, etc.) and making it openly
> available via APIs and Linked Open Data.
> We are on the verge of making it accessible to a limited public (API key
> required, daily queries limited to 3,152). We're interested in
> contributing what we can as we can. (No, we cannot make its catalog
> available in its entirety. We wish.)
These two paragraphs contradict each other.
Is LibraryCloud an open data provider, or not?
Either data is open, and it is possible to get hold of the entire
dataset with a clear open license for what you can do with it. Or it is not open.
It is wrong to call data open if it is subject to arbitrary access restrictions like 3K entries a day.
A lively and fruitful discussion followed, with some supporting "open" == OKD-compliant and others arguing that "open" was an arbitrary point on a spectrum. For which I read "glorious". Here are two typical passages:
isn't "open" according to [OKD] strict standards it isn't open at all. This
completely misses the fact that "open" as in the Harvard API may be
completely fine and useful for nearly all real world purposes.
PMR: The good intention is clear, but "open" gives no other information. It does not keep the lawyers away.
we cannot make all our catalog
data available for bulk download. That is a limitation we all regret
but there it is. I would argue that because the data we make
available we make available without restriction, it is reasonable to
use "open" as a modifier.
PMR: This shows the complexity. Perhaps the individual items *are* Open. In which case good, and in which case give them each a licence.
The sad fact is that in many cases "open" == "glorious".
If we are to operate in a legally useful way, the only practicable approach is to use the Open Definition. Everything else may be interesting points on a political spectrum, but only the OKD makes us safe.
And it brings in the promised land of infinite re-use of knowledge.
There *are* technical concerns with OKD-Open APIs and I'll discuss them in the next post.