"open access" – some central questions

I am grateful for the recent correspondence from Peter Suber and Stevan Harnad as it helps me get my thoughts in order for ETD2007. In response to Stevan:
Open Access: What Comes With the Territory,
Peter has analysed the central question very clearly (as always).

I expect that all of us will agree with the analysis below. The position each of us takes may vary:

Summary [of Stevan’s post]: Downloading, printing, saving and data-crunching come with the territory if you make your paper freely accessible online (Open Access). You may not, however, create derivative works out of the words of that text. It is the author’s own writing, not an audio for remix. And that is as it should be. Its contents (meaning) are yours to data-mine and reuse, with attribution. The words themselves, however, are the author’s (apart from attributed fair-use quotes). The frequent misunderstanding that what comes with the OA territory is somehow not enough seems to be based on conflating (1) the text of research articles with (2a) the raw research data on which the text is based, or with (2b) software, or with (2c) multimedia: all the wrong stuff and irrelevant to OA.


  • Stevan is responding to Peter Murray-Rust’s blog post from June 10. But since I agreed with most of what Peter MR wrote, I’ll jump in.
  • Stevan isn’t saying that OA doesn’t or shouldn’t remove permission barriers. He’s saying that removing price barriers (making work accessible online free of charge) already does most or all of the work of removing permission barriers and therefore that no extra steps are needed.
  • The chief problem with this view is the law. If a work is online without a special license or permission statement, then either it stands or appears to stand under an all-rights-reserved copyright. The only assured rights for users are those collected under fair use or fair dealing. These rights are far fewer and less adequate than OA contemplates, and in any case the boundaries of fair use and fair dealing are vague and contestable.
  • This legal problem leads to a practical problem: conscientious users will feel obliged to err on the side of asking permission and sometimes even paying permission fees (hurdles that OA is designed to remove) or to err on the side of non-use (further damaging research and scholarship). Either that, or conscientious users will feel pressure to become less conscientious. This may be happening, but it cannot be a strategy for a movement which claims that its central practices are lawful.
  • This doesn’t mean that articles in OA repositories without special licenses or permission statements may not be read or used. It means that users have access free of charge (a significant breakthrough) but are limited to fair use.

PMR: “The chief problem with this view is the law”. That puts it precisely, and that’s where Stevan and I differ. At the moment I think we have to work within the law, and I think the law debars me from crunching. There may come a time when we feel that civil disobedience is unavoidable but it hasn’t arrived yet – if it does I shall be there.
And some comments on other parts of Stevan’s post:

Get the Institutional Repository Managers Out of the Decision Loop

The trouble with many Institutional Repositories (IRs) (besides the fact that they don’t have a deposit mandate) is that they are not run by researchers but by “permissions professionals,” accustomed to being mired in institutional author IP protection issues and institutional library 3rd-party usage rights rather than institutional author research give-aways.

PMR: I have had similar thoughts. I got the distinct impression that some IRs are run like Victorian museums – look but don’t touch. The very word “repository” suggests a funereal process – it’s no surprise that having put much of my stuff into DSpace I find it’s an enormous effort to get it out. Why don’t we build “disseminatories” instead?
[Stevan’s analysis of how we should deposit papers omitted. I don’t disagree – I’m just more interested in data at present.]

Now, Peter, I counsel patience! You will immediately reply: “But my robots cannot crunch Closed Access texts: I need to intervene manually!” True, but that problem will only be temporary, and you must not forget the far larger problem that precedes it, which is that 85% of papers are not yet being deposited at all, either as Open Access or Closed Access. That is the inertial practice that needs to be changed, globally, once and for all.

PMR: Here we differ. In many fields there has been little movement and no Green journals. We could wait another five years for no effect. But my main concern is the balance between Green access and copyrighted data. The longer we fail to address the copyrighting of data the worse the situation will become. Publishers are not stupid – they have revenue-oriented business people working out how to make money out of our data – Wiley told me so. Imagine, for example, that a publisher says “I will make all our journals green as long as we retain copyright. And we’ll extend the paper to cover the whole of the scientific record”. That would be wonderful for Stevan and a complete disaster for paper-crunchers. We can’t afford to wait for that to happen.

Just as I have urged that Gold OA (publishing) advocates should not over-reach (“Gold Fever”), by pushing directly for the conversion of all publishers and authors to Gold OA, and criticizing and even opposing Green OA and Green OA mandates as “not enough”, I urge the advocates of automatized robotic data-mining to be patient and help rather than hinder Green OA and Green OA (and ID/OA) mandates.

PMR: I am not – I hope – hindering Green access. I am not personally agitating for Green or Gold – my energies go into arguing that the experimental process must not be copyrighted by the publisher or anyone else, and that institutional repositories should start to be much, much more proactive and actively support the digital research process.

One Response to "open access" – some central questions

  1. On Patience, and Letting (Human) Nature Take Its Course

    Peter Murray-Rust (P-MR) writes: “I don’t disagree… [with] Stevan’s analysis of how we should deposit papers… I’m just more interested in data at present…
    “Imagine, for example, that a publisher says ‘I will make all our journals green as long as we retain copyright. And we’ll extend the paper to cover the whole of the scientific record’. That would be wonderful for Stevan and a complete disaster for paper-crunchers.”

    Make no mistake about it: Peter Murray-Rust (and Peter Suber) and I are all in total agreement about the goals, and in near-total agreement about the means.
    PMR is especially concerned about research data harvesting and mining, which is not, strictly speaking, an OA matter, for two reasons:
    (1) OA’s primary target is research article texts. (That doesn’t matter: free online access to data is extremely important too, and is part of OA’s wider target.)
    (2) More important, access to article texts is actually — or, as I suspect, perceptually — constrained by publishers’ copyright-based restrictions. That is not true of data.
    So, to a first approximation, authors are perfectly free to make their data OA today if they wish; all they need do is adopt the right Creative Commons License for it and then self-archive it. If they don’t make their data OA, it’s their own fault, not the fault of publisher restrictions, actual or perceived.
    PMR is worried that authors, instead of self-archiving their data, will transfer copyright for their data to their publishers, in exchange for their publishers adopting a Green policy. But I think PMR is misunderstanding a Green publisher policy here! Green publishers don’t make their published matter OA; they merely bless the author’s making it OA, if he wishes, by self-archiving it. The only publishers that make their own published matter OA are Gold OA publishers.
    So what is the motivation for the copyright scenario PMR is worried about? Authors, who today cannot be bothered to self-archive their own data at all, and cannot be bothered to self-archive their articles either (and/or are too bothered by actual or perceived publisher’s restrictions to do so) will henceforth, according to this scenario, adopt the brand-new practice of transferring copyright for their data (along with their articles) — in exchange for their publishers going Green!
    But why on earth would authors do that? What is the motivation? They can’t be bothered self-archiving their data today, when they don’t need their publisher’s blessing (or greenery) to do it, just as most of them can’t be bothered to self-archive their articles, even when they have their Green publishers’ (62%) blessing to do so. Yet, for some unknown reason, these passive authors are to be imagined (in PMR’s scenario) as being ready to transfer copyright for their undeposited data to their publishers, in exchange for their publishers’ agreeing to give them the green light to self-archive their data (and articles)!
    I think this fantasized scenario misses the point completely, and that point is precisely the one that PMR confesses he is less interested in, namely, that what is needed to get these passive authors to do the right thing — in their own interests, but also in the interests of their institutions, their funders, the public that funds their funders and in whose interests the research is done, and in the interests of research progress and productivity itself — is a Green OA self-archiving mandate, adopted by their institutions and funders! A mandate that requires them to self-archive, as a condition of employment and funding.
    I would be quite happy if that self-archiving mandate applied to their data as well as to their articles. But first things first. A mandate first needs to be successfully adopted. And authors are already publishing their articles, but not yet publishing their data. Some may not wish to publish their data (preferring to keep it under wraps so that they, and not their competitors, can mine it); I make no judgment about this, except that co-bundling an article-archiving mandate with a data-archiving mandate would put the successful adoption of any mandate at all at risk, because of these potential exceptions and oppositions. (It is for similar reasons that a mandate to self-archive the refereed, accepted, published postprint is unproblematic, whereas a mandate to also self-archive the unrefereed preprint would be problematic: not all authors are willing to make their preprints public, nor should they be required to be. But all authors publish their postprints, by definition.)
    So the prospects for the successful adoption of a postprint mandate are far better than the prospects for the successful adoption of either a postprint+preprint mandate or a postprint+data mandate. The Immediate-Deposit/Optional-Access (ID/OA) mandate in particular, as repeatedly noted, is the one with the best chance of successful adoption: It moots publisher restrictions, because it only requires deposit, not immediate OA-setting; yet it has the “Fair Use Button” to tide over usage needs during any embargo period. And ID/OA is not weighed down by requiring either preprint-deposit or data-deposit (or copyright-retention): It merely recommends them, just as it merely recommends setting access to the deposit as OA rather than Closed Access.
    But — if we agree that the only thing standing between us and 100% OA (not only for articles, but for data too) is those deposit keystrokes that sluggish, passive authors simply are not doing, unmandated — then it should also be apparent why ID/OA is exactly what is needed now to get those keystrokes done. ID/OA does not go the whole way: It does not require the Nth (OA) keystroke. But unless we are all deeply deluded about the benefits of OA, OA’s own rewards will see to it that those Nth keys get stroked, once the ID/OA mandate has propagated across all of research space, and human nature takes its course. The OA usage/impact advantage, which today can only be demonstrated by painstaking, post-hoc analyses (invariably discounted by the publishing lobby’s “Dream Team,” committed to arguing that there is no real advantage to OA!), will instead be obvious from the download and citation statistics for Open Access versus Closed Access articles in every Institutional Repository (IR); and the difference will be reinforced by the deluge of email eprint requests generated by the IR software’s “Fair Use Button.”
    But once those Nth keystrokes fall, the token will (by the same token!) also fall for those same authors (i.e., all authors!), realizing the potential benefits of depositing their data too. OA will naturally propagate from postprints to (many) preprints and (most) underlying data too.
    That is why I urge patience, and making common cause with Green OA mandates, for those whose goal is OA data-archiving: that too will come with the territory.
    And there is no way in the world that authors will instead opt, for no reason at all, to transfer copyright to their publishers for their data too, along with copyright for their texts, in exchange for their publishers giving them the green light to do the self-archiving that they are not bothering to do anyway, with or without a green light!
    They might agree to transfer data rights to a Gold OA publisher. But that would be no problem, because Gold OA publishers really do make their articles (and hence also their data) accessible online in every way, including for robot harvesting and data-mining. With ID/OA mandates, the next step after 100% postprint deposits (62% OA and 38% Closed Access + semi-automatic Fair-Use eprints) will be the transition to 100% Green OA for all postprints (the Nth keystroke), and then to the depositing of the accompanying data, with rights specified by the CC license the author adopts.
    That’s the natural scenario, and all it needs right now is worldwide propagation of the ID/OA mandate. To achieve that, we must not chafe, for the time being, at the absence of a guarantee of robotic harvesting and mining (for either text or data), because insisting on that now can only blunt the motivation and slow the momentum for the universal adoption of the ID/OA mandate.
    Let us be patient, get the mandates adopted, and let them do their inexorable work; then the era of 100% OA — for both text and data — will not be far behind. You can (data-)bank on that!
    Stevan Harnad
    American Scientist Open Access Forum

