In previous posts (see links in Why weakOA and strongOA are so important) I have welcomed the Suber-Harnad approach to OA, labelling objects either as "strong OA" or "weak OA". In this post I want to explore what strong OA is. I believe this is possible and relatively simple. I hope that all OA advocates will be able to agree on an operational procedure that will simply and absolutely determine whether something is strong OA.
A useful starting point is the Wikipedia "definition". I have copied this verbatim and added two suggested clarifications:
Open access (OA) is free, immediate, permanent, full-text, online access, for any user, web-wide, to digital scientific and scholarly material, primarily research articles published in peer-reviewed journals [PMR: and academic theses]. OA means that any user, anywhere, who has access to the Internet, may link, read, download, store, print-off, use, and data-mine the digital content of that article [PMR: without needing to consult authors, publishers, or hosting sites]. An OA article usually has limited copyright and licensing restrictions.
PMR: The Budapest declaration includes the definition:
By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.
For background let's assume some axioms:
- we must have an operational procedure for determining the strongOAness of an object. Without such a procedure we have argued endlessly; with one, those arguments need no longer take up our energies. The only people who will still argue are those who wish to muddy the OA waters, including those who wish to rent weakOA objects for the sale price of strongOA.
- we can explore the discussion in the arena of scholarly and research publications. There will be an overlap with certain digital objects (data sets, computer code) but we'll omit discussion here. The most important artefacts are research articles (normally peer-reviewed) and theses (of all types, undergraduate, masters, doctoral) published through a University or scholarly organisation.
- there are overriding statements of intent which also contain definitions. These include the BBB declarations [above] and the Open Knowledge Definition (which is part of the basis of Open Data and Science Commons). I believe these all describe strongOA (and it would be difficult to dumb them down without breaking my idea of strongOA). What I address here is how to translate these definitions into practice.
I see the following challenges for strongOA.
- The logical consequences of strongOA are extensive. I believe it is possible for anyone to download a complete journal, repackage it and resell it without the publisher's or author's permission. They must, of course, preserve the provenance (authorship) but that's all that is formally required. Just as people re-use and resell my computer code (as in Bioclipse) they can do the same with my articles and - theoretically - the whole OA content of, say, PLoS or BMC. In practice I think that would be slightly questionable and that's where community norms come in - it's useful to say "you may do this but we'd rather you didn't". I generally enforce this by adding the bit-rot-curse to my code. So there will be a culture change as publishers adopt strong OA - there will be mistakes - and we should help them adjust.
- There may be a tendency to blur the boundary. "This article is strongOA as long as it is for non-commercial use". No. Either it is strongOA, which requires that commercial use be permitted, or it is not strongOA. We have to agree on this.
- We have to police strongOA. Policing has been one of the plus points of Open Source and (up to now) one of the weak points of OA. If you say something is strongOA and it isn't, someone should take you to task (gently if it's a mistake). If we don't do this then the bright shiny present that the Suber-Harnad terminology has created will tarnish. Fuzzy practice begets fuzzy thinking.
- We have to be able to know (not just guess) the strongOA status of an object at all times. This is critical. I shall continue to stress this. It's not good enough to say "I am emailing this document from repository X which is classified as an Open Access repository, so you can do anything you like with it". The document/artefact has to announce that it's strongOA. Nothing else will do because provenance by association gets lost. The only way that I know of doing this is by embedding a licence or reference to a licence in the document. Typical licences include CC-BY, the GNU Free Documentation License (GFDL), and Science Commons/Open Knowledge (meta)licences such as PDDL or CC0. The licence can be asserted either by embedding RDF in the XML/HTML or by adding an approved icon from the organisations above.
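To make the last point concrete, here is a minimal sketch of what an operational check might look like for an HTML document. It assumes the licence is announced with the common rel="license" link convention (one of several ways of embedding a licence reference), and the allow-list of licence URLs is my illustrative assumption, not an agreed community list. The names LicenceFinder, embedded_licences and is_strong_oa are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# ASSUMPTION: an illustrative allow-list of licence URL prefixes that
# permit commercial re-use. A real list would have to be agreed by the
# community; this one is only for the sketch.
STRONG_OA_PREFIXES = (
    "creativecommons.org/licenses/by/",        # CC-BY
    "creativecommons.org/publicdomain/zero/",  # CC0
    "opendatacommons.org/licenses/pddl/",      # PDDL
)


class LicenceFinder(HTMLParser):
    """Collects the href of every <a> or <link> carrying rel="license"."""

    def __init__(self):
        super().__init__()
        self.licences = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "license" in rel_tokens and href:
            self.licences.append(href)


def embedded_licences(html_text):
    """Return the licence URLs that the document itself announces."""
    finder = LicenceFinder()
    finder.feed(html_text)
    return finder.licences


def _normalise(url):
    """Strip the scheme and a leading 'www.' so prefixes compare cleanly."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parsed.path.lower()


def is_strong_oa(html_text):
    """True only if the document announces at least one licence and every
    announced licence is on the allow-list. No licence means 'not known',
    which this sketch treats as not strongOA."""
    licences = embedded_licences(html_text)
    if not licences:
        return False
    return all(_normalise(u).startswith(STRONG_OA_PREFIXES) for u in licences)


if __name__ == "__main__":
    doc = ('<p>Licensed under <a rel="license" '
           'href="https://creativecommons.org/licenses/by/4.0/">CC-BY</a></p>')
    print(is_strong_oa(doc))
```

Note the deliberate choices: a document with no embedded licence fails, however respectable the repository it came from, and a non-commercial licence such as CC-BY-NC fails because its URL is not on the allow-list. That mirrors the argument above; the particular list is up for discussion.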
To summarise at this point:
strongOA requires a clear borderline defined by a licence (or licence reference) embedded in the document and policed by the scholarly community.
This discussion has been about what strongOA is, not whether it's a good thing or how to achieve it. It ought to be something that responsible publishers have a view on as well as authors, funders, repositarians, human readers and machine users.