Stevan Harnad on "open access"

Stevan Harnad – a tireless evangelist of OA – has replied to my points. He has been consistent in arguing the logic below and I agree with the logic. The problem is that few people believe that this allows us to act as he suggests.
Stevan argues that current Green Open Access allows us to do all we wish with the exposed material without permission. However, when I spoke to several repository managers at the JISC meeting, all were clear that I could not have permission to do this with their current content. I asked “can my robots download and mine the content in your current open access repository of theses?” – No. “Can you let me have some chemistry theses from your open access collection so I can data-mine them?” – No – you will have to ask the permission of each author individually. So Stevan’s views on what I can do seem not to be – unfortunately – widely held.

  1. Stevan Harnad Says:
    June 12th, 2007 at 3:37 am
    Open Access: What Comes With the Territory
    Peter Murray-Rust’s worries about OA are groundless. Peter worries he can’t be sure that:

    “I can save my own copy (the MIT [site] suggests you cannot print it and may not be allowed to save it)”

    Pay no attention. Download, print, save and crunch (just as you could have done if you had keyed in the text from reading the pages of a paper book)! [Free Access vs. Open Access (Dec 2003)]

    “that it will be available next week”

    It will. The University OA IRs all see to that. That’s why they’re making it OA. [Proposed update of BOAI definition of OA: Immediate and Permanent (Mar 2005)]

    “that it will be unaltered in the future or that versions will be tracked”

    Versions are tracked by the IR software, and updated versions are tagged as such. Versions can even be DIFFed.

    “that I can create derivative works”

    You may not create derivative works. We are talking about someone’s own writing, not an audio for remix. And that is as it should be. The contents (meaning) are yours to data-mine and reuse, with attribution. The words, however, are the author’s (apart from attributed fair-use quotes). Link to them if you need to re-use them verbatim (or ask for permission).

    “that I can use machines to text- or data-mine it”

    Yes, you can. Download and crunch away.
    This is all common sense, and all comes with the OA territory when the author makes his full-text freely accessible for all, online. The rest seems to be based on some conflation between (1) the text of research articles and (2a) the raw research data on which the text is based, and with (2b) software, and with (2c) multimedia (all the wrong stuff, and irrelevant to OA).
    Stevan Harnad
    American Scientist Open Access Forum

Specific issues:
My concern was not just with material in repositories but also with material elsewhere. Some publishers allow green open access posting on web sites but debar it from repositories. So the concerns remain.
The MIT repository deliberately adds technical restrictions that prevent printing of their theses, and this also technically prevents data and text mining. There are some hacks possible to get round this, but they come close to dishonesty and illegality.
“Derivative works” is a phrase that doesn’t work well in the data-rich subjects and we need something better. But it’s what the licenses use at present.
In data-rich subjects, linking to repositories is often of little use. I need thousands of texts on specialist machines accessed with high frequency and bandwidth.
My problem is not with Stevan’s views but that few others give positive support to them, particularly not the repository managers. Maybe I’m too cautious…

This entry was posted in etd2007, open issues.

One Response to Stevan Harnad on "open access"

  1. Get the Institutional Repository Managers Out of the Decision Loop
    The trouble with many Institutional Repositories (IRs) (besides the fact that they don’t have a deposit mandate) is that they are not run by researchers but by “permissions professionals,” accustomed to being mired in institutional author IP protection issues and institutional library 3rd-party usage rights rather than institutional author research give-aways.
    The solution is to adopt a sensible institutional (or departmental) deposit mandate and then to automatize the deposit procedure so as to take Repository Managers out of the decision loop, completely. That is what we have done in the Southampton ECS Departmental Repository, and the result is an IR that researchers fill daily, as they complete their papers, without any mediation or meddling by permissions professionals. The author (or the author’s designee) does the deposit and sets the access (Open Access or Closed Access) and the EPrints software takes care of the rest.
    Institutions that have no deposit mandate have simply ceded the whole procedure to IP people who are not qualified even to understand the research access/impact problem, let alone solve it. All they are accustomed to thinking about is restrictions on incoming content, whereas the purpose of an OA IR is to allow researchers to make their own findings — outgoing content — accessible to other researchers webwide.
    The optimal deposit mandate is of course to require Open Access deposit of the refereed final draft, immediately upon acceptance for publication. But there is a compromise for the faint-hearted, and that is the Immediate-Deposit/Optional-Access (ID/OA) Mandate:
    This is the policy that will remove IP-obsessives from the loop: The full-text and metadata of all articles must be deposited immediately, but access to the full-text is set as Open Access if the publisher is Green (i.e., endorses postprint self-archiving: 62%) and to Closed Access if the publisher is not Green (38%).
    For the articles published in the non-Green journals, the IR has the semi-automatic “Email Eprint Request” Button (or “Fair Use Button”), which allows any user who has been led by the metadata to a Closed Access article to cut/paste his email address in a box and click to send an automatic email to the author to request a single eprint for research use; the author then need merely click on a URL to authorize the semi-automatic emailing of the eprint.
    Now, Peter, I counsel patience! You will immediately reply: “But my robots cannot crunch Closed Access texts: I need to intervene manually!” True, but that problem will only be temporary, and you must not forget the far larger problem that precedes it, which is that 85% of papers are not yet being deposited at all, either as Open Access or Closed Access. That is the inertial practice that needs to be changed, globally, once and for all.
    The only thing standing between us and 100% OA is keystrokes. It is in order to get those keystrokes done, at long last, that we need OA mandates, and ID/OA is a viable interim compromise: It gets all N keystrokes done for 62% of current research, and N-1 of the keystrokes done for the remaining 38%. For that 38%, the “Fair Use Button” will take care of all immediate researcher usage needs for the time being. The robots will have their day once 100% deposit mandates prevail and the research community tastes what it is like to have a 62% OA and 38% almost-OA world, at long last. For then those Nth keys will inevitably get stroked, setting everything to Open Access, as it should (and could) have been all along.
    It is in that keystroke endgame that all publisher resistance will disintegrate (and they know it, which is why they are lobbying so aggressively against keystroke mandates!). But right now, publishers have unwitting accomplices in institutional IP specialists, reflexively locking in the status quo, blithely ignorant or insouciant about what OA is actually about, and for. That is why ID/OA must be allowed to take them out of the loop.
    Just as I have urged that Gold OA (publishing) advocates should not over-reach (“Gold Fever”) by pushing directly for the conversion of all publishers and authors to Gold OA, and criticizing and even opposing Green OA and Green OA mandates as “not enough,” I urge the advocates of automatized robotic data-mining to be patient and help rather than hinder Green OA and Green OA (and ID/OA) mandates.
    In both cases, it is Green OA that is the most powerful and promising means to the end they seek: 100% ID/OA will eventually drive a transition to 100% Green OA and 100% Green OA will eventually drive a transition to Gold OA. Short-sightedly opposing the Green OA measures now in the name of holding out for “greater functionality” is tantamount to joining forces with IP specialists who have no sense of researchers’ daily access needs and impact losses, and are simply holding out for what they think is the perfect formal solution, which is all authors successfully negotiating a copyright agreement that retains their right to make their article OA.
    First things first. We are HERE now (85% deprived of research content even for non-robotic use). In order to get THERE (100% of research content OA to researchers and robots alike) we first have to get those keystrokes done. Please help, rather than just hope!

    PM-R: “Some publishers allow posting on green open access on web sites but debar it from repositories.”

    This is the sort of abject and arbitrary nonsense that takes one’s breath away! Can these publishers define the difference between a website and a repository? They are just ways that disk sectors are labelled. To block such incoherent stipulations Southampton ECS has formally baptized its researchers’ repository disk sector as their “personal website.” (This is also why I object so vigorously to SHERPA-Romeo’s slavish and solemn canonizing of every announced publisher “condition” on deposit, no matter how absurd. I stand ready to hear that there is a new SHERPA-Romeo permissions category, colour-coded “chestnut” for those publishers who do not allow deposit of articles by authors who have maternal uncles with chestnut-coloured irises… Here too we detect the familiar mark of the IP gurus…)
    Stevan Harnad
    American Scientist Open Access Forum
