petermr's blog

A Scientist and the Web

 

Open Access – why we need Open Bibliography

Stevan Harnad has commented on the discussion on publishing Open Access:

  1. December 20, 2010 at 11:15 am

    Why not just publish in your preferred journal and self-archive the peer-reviewed final draft (“Green OA”)?

For those who don’t know, Stevan is one of the pioneers of OA and has been tireless in taking the struggle forward. We agree on many things – the need for Openness of scholarly information and free (a carefully chosen word) access to it. We disagree on the details and the strategy for achieving those aims.

The Green Road to Open Access should now – I hope – be labeled as “gratis” – “free as in beer”. It’s useful, but I don’t think it’s useful enough in science and I’ll explain why.

But first I’ll commend the Open Access movement on finally coming round to using the terms “gratis” and “libre” (“free as in speech”). For many years the OA movement did not describe how Open Access documents could be used. Obviously if a document is visible on the web a human can read it – while it is mounted – but there is no guarantee of re-use. For example, I may violate copyright restrictions if I want to use a diagram from a gratis OA document. This is true whether it’s in a repository or on a personal web page. Moreover, repositories are extremely bad (lazy?) at adding formal notices of rights to their contents, and the default is simple: “you cannot re-use anything in this repository for any purpose unless explicitly allowed to do so”. That permission can only be given by adding a formal licence to the documents, such as CC-BY, CC0 or PDDL. The Green Road philosophy, which maintains that anything publicly visible on the web can be text-mined, reused, copied, etc., is counter to legal practice and is no defence against being pursued in the courts by the real or presumed copyright owner. We cannot build semantic certainty on legal quicksands. So, unless the author labels the self-archived copy as Libre, I cannot afford to re-use it.

Even if the self-archived documents are libre, they are of little use to data-driven science, which needs a systematic way of discovering them. Randomly archived documents are not systematically searchable, especially when the percentage of self-archiving is very low. Sometimes this is dictated by publishers who forbid self-archiving (guess which I’m talking about), but the very low level of compliance is the real problem. The vast majority of scientific articles in closed-access journals are not self-archived. Stevan’s argument is that if we all make the right effort we’ll solve the problem – I simply don’t believe this will happen. Some institutions, such as QUT and Soton, mandate this – and are well rewarded for doing so – but most universities are incapable of the political effort (I’ll deal with this in later posts).

But let’s assume that everyone DID self-archive their publications. How do we discover them? The journals provide services for searching their own pages, but not surprisingly do not index the self-archived copies. Google, etc. may or may not do a comprehensive job in scraping the academic web but even so you can only use a few results of their search – Google does not provide useful APIs to everyone for free.

The solution is relatively simple to state and to build technically. If we create an Open Bibliography for scientific articles, then any self-archiving author can add their URLs to this bibliography with almost zero effort. Any responsible repository could automatically index its deposits in the Open Bibliography. By searching the Open Bibliography you then discover all self-archived articles. If we are paying repository managers to support self-archiving, then they should be providing an index to the deposited material. Everyone benefits – including a forward-looking publisher.
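
To make the idea concrete, here is a minimal Python sketch of what such an index might look like. It is an illustration only, not a description of any existing service: the class names, the fields, the DOIs and the repository URLs are all invented, and a real Open Bibliography would be a shared, persistent service rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Deposit:
    """One self-archived copy of an article (all fields are illustrative)."""
    title: str
    doi: str
    url: str      # where the self-archived copy lives
    licence: str  # e.g. "CC-BY", or "unspecified" if no licence is stated


@dataclass
class OpenBibliography:
    """Toy in-memory index keyed by DOI (hypothetical design)."""
    _by_doi: dict = field(default_factory=dict)

    def register(self, deposit: Deposit) -> None:
        """What a repository would call each time an author deposits."""
        self._by_doi.setdefault(deposit.doi.lower(), []).append(deposit)

    def find_by_doi(self, doi: str) -> list:
        """Return every known self-archived copy of one article."""
        return list(self._by_doi.get(doi.lower(), []))

    def search_titles(self, words: str) -> list:
        """Return deposits whose title contains all the given words."""
        wanted = words.lower().split()
        return [
            d
            for deposits in self._by_doi.values()
            for d in deposits
            if all(w in d.title.lower() for w in wanted)
        ]


# Made-up records standing in for two repositories' automatic notifications.
bib = OpenBibliography()
bib.register(Deposit(
    title="Crystal structures of small molecules",
    doi="10.9999/example.1",
    url="https://repository.example.org/id/1",
    licence="CC-BY",
))
bib.register(Deposit(
    title="Text-mining the chemical literature",
    doi="10.9999/example.2",
    url="https://eprints.example.ac.uk/2",
    licence="unspecified",
))

# Discover the copies, then keep only those carrying an explicit licence.
hits = bib.search_titles("chemical literature") + bib.search_titles("crystal")
reusable = [d for d in hits if d.licence != "unspecified"]
for d in reusable:
    print(d.doi, d.url, d.licence)
```

The point of the sketch is that the repository's side of the work is a single `register` call at deposit time, while the reader's side is a lookup against one index instead of a crawl of every repository. Recording the licence alongside the URL also lets a reader separate libre copies from merely gratis ones.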

So we have to create an Open Bibliography.

We have the technology.

YOU have to provide the political will.

8 Responses to “Open Access – why we need Open Bibliography”

  1. WHY WE NEED GREEN GRATIS OPEN ACCESS — FIRST

    Peter Murray-Rust and I are — and have always been — on the same team. Our disagreements have not been about the ultimate goals, but about the immediate means of reaching those goals. Gratis OA is free online access; Libre OA is free online access plus various re-use rights. So Gratis OA is a necessary condition for Libre OA: Libre OA is more than Gratis OA, but you cannot have Libre OA without Gratis OA.

    But we do not yet have Gratis OA! Less than 20% of yearly refereed research output is Gratis OA. So the strategic difference between Murray and me is very easy to state and to understand: It is easier to ask researchers, institutions and funders for less than it is to ask them for more, especially when most are not yet providing the less, let alone the more.

    How can researchers be induced to provide at least Gratis OA? Their institutions and funders can mandate that they self-archive their refereed final drafts in their institutional repositories immediately upon acceptance for publication. That is Green, Gratis OA. Making journals OA (Gold OA) is in the hands of the publishing community, not the researcher community, hence Gold OA — whether Gratis or Libre — cannot be mandated; only Green OA can be mandated. Moreover, Green, Gratis OA mandates are in far less conflict with either the policies of most publishers or the desires of most authors.

    Hence (by my lights) the overwhelming priority today for those who seek OA worldwide should be getting Green, Gratis OA mandates adopted by institutions and funders worldwide.

    The rest — Libre OA and Gold OA — will eventually come, once we have universal Green, Gratis OA. But not even Green, Gratis OA will come if we needlessly over-reach now, and insist on more, when we do not even have less.

    As we approach universal Green, Gratis OA mandates worldwide, search and harvesting will become incomparably more powerful than they are now. (They are already very powerful, with Google Scholar, CiteSeerX and other new search engines, despite the sparseness of the OA content base (<20%). As OA content becomes less sparse, harvesting and search will become all the more sophisticated and powerful, and the global bibliography Murray recommends will assemble of its own accord, as part of the repository deposit procedure and tagging. The problem today is content, not search.)

    As to the re-use of figures: it is already possible (and easy) to write a java or perl script today that will call up a figure embedded in a Gratis OA document. Instead of literally reproducing it in another work, as in the Gutenberg era, the online era allows us to embed a pointer URL that has virtually the same effect.

    For other Libre OA uses (e.g., data-mining): those researchers who (for some reason I find rather difficult to understand) feel they have to have more formal and explicit statements of their "harvesting rights" in advance (when everyone else is happily crawling and harvesting the entirety of web gratis content with impunity) will have to wait patiently until we have Gratis OA; once we have mandated it, Gratis OA's own benefits and potential will induce more and more researchers to seek and provide Libre OA, formally. Over-reaching now, when researchers are not even being mandated to provide Gratis OA, will not induce them to provide Libre OA.

    In closing, I would like to remind everyone that we are just beginning to think of freeing research from the constraints of the Gutenberg era of Closed Access (80%), which was (and still is) neither Gratis nor Green. In print days, you could not even access a paper if your institution did not have a subscription to the journal in which it was published (and if it did, all you could do was read it and use the information, not re-use, re-mix, or re-publish the text or figures). The online era made it possible for researchers to make their papers accessible to all potential users, not just those whose institutions subscribed to the journal in which they were published. The further idea of various re-use rights (and note that there are a number of different levels or degrees of potential re-use rights, all the way to making it public-domain) was not even thinkable prior to the online era, when we did not even have Gratis OA, because of the genuine economic constraints of the Gutenberg medium.

    So if Libre OA feels urgent now, it is only because the online era has made it possible. But before we try to realize this further possibility, surely we should first secure the benefits of the nearer possibility that the online era opened up for us (free online access), a proximal goal we have a tried, tested and effective practical means of reaching (Green Gratis OA mandates for institutions and funders), rather than continuing to ask for more without any tried, tested and effective means of getting it.

    Murray could perhaps cite the possibility of adopting stronger Green OA mandates — copyright reservation mandates like Harvard's (about which I am sceptical, because of their opt-out clauses) — but Murray is sceptical about mandates in general (whereas I am only sceptical about mandates that needlessly raise the goal-posts while mandates themselves are still sparse and hard to get).

    The practical question to be asked of anyone who is desirous of immediate Libre OA rather than Gratis OA is: How do you propose to persuade researchers to provide it?

  2. In addition to the pragmatic reasons, related to the well-functioning of science, on which I completely agree with Peter, there are also moral ones.

    I sign the draconian copyright transfer agreements and mega-restrictive licenses that Wiley, Elsevier, etc. put in front of me (I expect not to do so anymore in the future) only because:

    1. Committees for tenure, fellowships or grants put pressure on me to have publications with a certain IF.
    2. They have a quasi-monopoly of the publication landscape.
    3. I don’t understand what I am signing, I don’t care, etc.

    The answer to 3 is: Be an adult!
    The answer to 2 is: The key word is “quasi”!
    The answer to 1 is: The committees are us.

    A final point: not only is OA important, but I would also like the publisher to be non-profit… I don’t like the idea of private, profit-oriented firms being the ones who decide who gets a position in academia.

  3. Maybe what I am trying to say is:

    - The “producers” of papers are us, the researchers… without us there is no publishing industry.
    - Many of us are in a position that allows us to “do what we want”, within certain limits.

    The conclusion is simple: Why wait for a change of model? Why not force one? Why not simply do this:

    - Make a list of all OA and CC-BY journals.
    - Submit papers only to this list.

  4. PE asks: "Why wait for a change of model? Why not force one? Why not simply do this: make a list of all OA and CC-BY journals, and submit papers only to this list."

    (1) Because there aren’t enough OA/CC-BY journals
    (2) Because the money to pay for it is tied up in subscriptions
    (3) Because authors don’t want to give up their journal of choice for an OA/CC-BY journal just in order to make their article OA
    (4) Because authors don’t need to give up their journal of choice for an OA/CC-BY journal just in order to make their article OA
    (5) Because authors can publish in their journal of choice and also self-archive in order to make their article OA
    (6) Because authors don’t want OA badly enough to self-archive their articles, let alone give up their journals of choice

    And this — yet again — is why what we need first and most is Green Gratis OA self-archiving mandates by researchers’ institutions and funders.

  5. Apologies: in my first response I meant "Peter" of course, not "Murray."
