"open access" is not good enough

I have ranted at regular intervals about the use of “Open Access” or often “open access” as a term implying more than it delivers. My current concern is that although there are tens of thousands of theses described as “open access” I have only discovered 3 (and possibly another 15 today) which actually comply with the BOAI definition of Open Access.
The key point is that unless a thesis (or any publication) explicitly carries a license (or possibly a site meta-license) actually stating that it is BOAI compliant, then I cannot re-use it. I shall use “OpenAccess” to denote BOAI-compliant in this post and “open access” to mean some undefined access which may only allow humans to read but not re-use the information.
I do not wish to disparage the important efforts to make scholarly information more widely available, and I applaud the general direction and achievement of the groups below. I appreciate that the copyright of historical content normally is held by the student author and it’s certainly very valuable to have “access” to it. But it is not OpenAccess. And unless specific policies are put in place to add specific BOAI-compliant licenses then future theses will also be non-compliant.
Here are typical statements:

  • “EThOS will make UK theses available on open access for global use”. Having spoken to EThOS colleagues last week it is clear that “open access” does not automatically mean OpenAccess.  (Electronic theses in the UK: the open access future : JISC).
  • MIT theses: “Regardless of whether copyright is held by the student or the Institute, the MIT Libraries publish the thesis electronically allowing open access viewing and limited downloading/printing. See http://dspace.mit.edu.” The term “open access viewing” might suggest the theses are BOAI-compliant and is therefore potentially misleading. I found that the “public” thesis had been mounted with “printing disabled”, which means that it cannot technically be re-used (as well as being legally non-reusable).
  • ECS Soton: A well-known thesis is in e-form: (ECS EPrints Service – Evaluating Research Impact through Open Access to Scholarly Communication) (Brody, T). It offers two sorts of download: “(i) PDF – and (ii) Other (Latex) Access restricted to members of ECS [i.e. Soton only]”. This is a differential distribution of scholarship. I also could not find any license or copyright statement.

By contrast let’s look at “Open Source” which applies to software and has been highly successful in liberating the field. It’s very widely used in academia and elsewhere. The Open Source Definition states

Open source doesn’t just mean access to the source code. [PMR’s emphasis] The
distribution terms of open-source software must comply with
the following criteria [PMR’s elisions]:
1. Free Redistribution
The license shall not restrict any party from selling or
giving away the software as a component of an aggregate
software distribution containing programs from several
different sources. The license shall not require a
royalty or other fee for such sale.
2. Source Code
The program must include source code […]
3. Derived Works
The license must allow modifications and derived works, and must
allow them to be distributed under the same terms as the license
of the original software.
4. Integrity of The Author’s Source Code
The license may restrict source-code from being distributed in
modified form only if the license allows […]
7. Distribution of License
The rights attached to the program must apply to all to whom
the program is redistributed without the need for execution of
an additional license by those parties.
*10. License Must Be Technology-Neutral
No provision of the license may be predicated on any individual
technology or style of interface.

In general the term “Open Source” is completely self-explanatory within a large community. I can describe my software as OS and everyone understands what I mean. There are some licenses (e.g. GPL) which require additional freedoms but they don’t invalidate the above.
By contrast if someone describes something as “open access” it simply means that I may – as a human – and at some arbitrary time in human history – read the document. It does not guarantee that

  • I can save my own copy (the MIT site suggests you cannot print it and may not be allowed to save it)
  • that it will be available next week
  • that it will be unaltered in the future or that versions will be tracked
  • that I can create derivative works
  • that I can use machines to text- or data-mine it

So I believe that “open access” should be recast as “toll-free” – i.e. you do not have to pay for it but there are no other guarantees. We should restrict the use of “Open Access” to documents which explicitly carry licenses compliant with BOAI. [A weaker (and much more fragile) approach is that a site license applies to all content. The problem here is that documents then get decoupled from the site and their OpenAccess position is unknown.]
If the community wishes to continue to use “open access” to describe documents which do not comply with BOAI then I suggest the use of suffixes/qualifiers to clarify. For example:

  • “open access (CC-BY)” – explicitly carries CC-BY license
  • “open access (BOAI)” – author/site wishes to assert BOAI-nature of document(s) without specific license
  • “open access (FUZZY)” – fuzzy licence (or more commonly absence of licence) for document or site without any guarantee of anything other than human visibility at current time. Note that “Green” open access falls into this category. It might even be that we replace the word FUZZY by GREEN, though the first is more descriptive.

However there is no value in “Green open access” for theses. Let’s make sure they are all BOAI compliant.
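
The qualifier scheme suggested above could be applied mechanically by a repository. What follows is only a minimal sketch in Python, not anything a repository actually runs: the `Document` record, its field names, the `access_label` function and the set of licences treated as BOAI-compliant are all assumptions made for illustration. Only the three labels themselves come from this post.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: licence identifiers treated as BOAI-compliant in this sketch.
# The post names only CC-BY; a real policy would have to decide this list.
BOAI_COMPLIANT_LICENCES = {"CC-BY"}


@dataclass
class Document:
    """Hypothetical metadata record for a thesis or other publication."""
    title: str
    licence: Optional[str] = None  # licence carried by the document itself
    asserts_boai: bool = False     # author/site asserts BOAI without a licence


def access_label(doc: Document) -> str:
    """Return the qualified "open access" label for a document."""
    if doc.licence is not None:
        licence = doc.licence.strip().upper()
        if licence in BOAI_COMPLIANT_LICENCES:
            # The document explicitly carries a compliant licence.
            return f"open access ({licence})"
    if doc.asserts_boai:
        # BOAI nature asserted, but no specific compliant licence attached.
        return "open access (BOAI)"
    # No licence, or a licence outside the compliant set: nothing is
    # guaranteed beyond human visibility at the current time.
    return "open access (FUZZY)"


if __name__ == "__main__":
    examples = [
        Document("Thesis A", licence="cc-by"),
        Document("Thesis B", asserts_boai=True),
        Document("Thesis C"),
    ]
    for doc in examples:
        print(doc.title, "->", access_label(doc))
```

The label is computed from metadata held with the document rather than from the site it sits on, because a document that gets decoupled from its site should still carry its own status.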

This entry was posted in etd2007, open issues.

7 Responses to "open access" is not good enough

  1. Pingback: “open access” is not good enough | Talk Utopia

  2. Bill says:

    I absolutely agree with you that Free (to human eyeballs, one pair at a time) does not equal Open. I have said publicly that, so far as I can tell, there are currently no — or very, very few — existing, functioning Open Access repositories.
    As I’m sure you’re aware, though, some prominent OA advocates do not agree, and insist that there is no access barrier in the absence of explicit machine-readability provisions. (See, e.g., comments on the post I linked.)
    I do not quite know what to do about this. The insistence on Green OA with no thought for BOAI-level OA seems to me to be dangerously short-sighted. At the very least it will lead to an immense Backlog Problem in the future, when the community realizes: “oh bugger, PMR was right and we need machine-readability, now what do we do with all these Green archives?”.

  3. pm286 says:

    (2) Thanks for all your support Bill. I think we just have to bang on about this until people get the point. I suspect that many of the people managing “open repositories” have no idea how scientists work. They are used to interlibrary loans that take weeks or writing off for permissions. They don’t understand that our OSCAR software can process 1 million abstracts a day (unless it has to ask for permission, in which case it can do 1 million abstracts in 1 million days).
    I shall be evangelising this at Uppsala next week in ETD2007. I expect to get through to a few delegates. In fact it is one of the last real chances we have to assert academic independence.

  4. Open Access: What Comes With the Territory
    Peter Murray-Rust’s worries about OA are groundless. Peter worries he can’t be sure that:

    “I can save my own copy (the MIT [site] suggests you cannot print it and may not be allowed to save it)”

    Pay no attention. Download, print, save and crunch (just as you could have done if you had keyed in the text from reading the pages of a paper book)! [Free Access vs. Open Access (Dec 2003)]

    “that it will be available next week”

    It will. The University OA IRs all see to that. That’s why they’re making it OA. [Proposed update of BOAI definition of OA: Immediate and Permanent (Mar 2005)]

    “that it will be unaltered in the future or that versions will be tracked”

    Versions are tracked by the IR software, and updated versions are tagged as such. Versions can even be DIFFed.

    “that I can create derivative works”

    You may not create derivative works. We are talking about someone’s own writing, not an audio for remix. And that is as it should be. The contents (meaning) are yours to data-mine and reuse, with attribution. The words, however, are the author’s (apart from attributed fair-use quotes). Link to them if you need to re-use them verbatim (or ask for permission).

    “that I can use machines to text- or data-mine it”

    Yes, you can. Download and crunch away.
    This is all common sense, and all comes with the OA territory when the author makes his full-text freely accessible for all, online. The rest seems to be based on some conflation between (1) the text of research articles and (2a) the raw research data on which the text is based, and with (2b) software, and with (2c) multimedia (all the wrong stuff and irrelevant to OA).
    Stevan Harnad
    American Scientist Open Access Forum

  5. Bill says:

    Stevan Harnad chimes in here.

  6. Les Carr says:

    Peter, you seem to have arbitrarily (?) imposed a self-restriction on your work: “unless a thesis (or any publication) explicitly carries a license (or possibly a site meta-license) actually stating that it is BOAI compliant, then I cannot re-use it”. Why did you do that? More to the point, why hasn’t it stopped you before and why hasn’t it stopped any other OAI or datamining service?

  7. pm286 says:

    (6) Les, I have answered this in a reply to Stevan. It is not ME who has imposed a restriction – it is the repository managers I talk to who make it clear that nothing can be done without the permission of the authors.
    LC: More to the point, why hasn’t it stopped you before and why hasn’t it stopped any other OAI or datamining service?
    It hasn’t stopped me before because I haven’t had any access to data so I haven’t done anything. Now that the content is available I thought I could use all these theses but was told I couldn’t.
    If you know of services that mine the text (not the metadata) of thesis collections I would be very pleased to know of them.
