In my last post I had the presumption to lecture my readership on what “green” and “gold” access mean. Hubris strikes – I got it wrong. I comment on the comments and then continue with why I think “green” is not enough:
PMR: I agree with this and will use it in the future
- CR: In both cases (green AND gold) the permissions set the terms of what you can do. OA journals do not necessarily have licences that allow data mining.
PMR: Also agreed. In many cases OA journals have no explicit permissions at all. In these cases, and where I have the time, I engage with the editors to help them clarify the position. Sometimes they realise that they do actually wish to announce permissionFree re-use.
- I’m also not certain that a widely distributed set of repositories (the green road) is particularly resistant to data mining. OAI access should tell you which repositories have data of interest, and your robots can go there.
PMR: But they will not know whether they are allowed to mine the data. OAI does not mean Open Access. It means Open Archives Initiative and the Open says nothing about permissions. It is extremely rare (in my experience) that material in OAI repositories carries an explicit statement about re-use. It’s possible to extract Green material from an OAI repository, re-use it, and be sued by a publisher.
- Perhaps the real problem is that (a) licences offered are not those you need for the task (whether green or gold), and (b) those licences are rarely expressed in machine-readable form, even though Creative Commons have encodings to allow this. If licences were so expressed, then you could let your robots wander at will, and mine what they are allowed to!
PMR: I agree with this sentiment but in practice it is unlikely that there will be universal machine-readable licences in OAI repositories any time soon. So in practice roaming the OAI repositories is no use if I wish to re-use and redistribute the material.
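To make the commenter's suggestion concrete, here is a minimal sketch of what a robot could do if repository records did carry a machine-readable licence. The record shape, the `rights` field name and the example identifiers are hypothetical, and the choice of which licence codes count as re-usable is an assumption for illustration. The only thing taken as given is the usual form of a Creative Commons licence URL (`creativecommons.org/licenses/<code>/<version>/`). A record with no licence statement is treated as not minable, which is the situation described above.

```python
from urllib.parse import urlparse

# Assumption for this sketch: only these CC licence codes are treated as
# allowing both re-use and redistribution.
REUSABLE_CODES = {"by", "by-sa"}


def cc_licence_code(rights_url):
    """Return the CC licence code ('by', 'by-nc', ...) from a licence URL, or None."""
    if not rights_url:
        return None
    parsed = urlparse(rights_url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "creativecommons.org":
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "licenses":
        return parts[1].lower()
    return None


def may_mine(record):
    """Be conservative: no recognised licence statement means no mining."""
    return cc_licence_code(record.get("rights")) in REUSABLE_CODES


# Hypothetical records, as a robot might assemble them from repository metadata.
records = [
    {"id": "rec-1", "rights": "http://creativecommons.org/licenses/by/3.0/"},
    {"id": "rec-2", "rights": "http://creativecommons.org/licenses/by-nc/3.0/"},
    {"id": "rec-3"},  # no licence statement at all
]

minable = [r["id"] for r in records if may_mine(r)]
print(minable)
```

With the hypothetical records above, only the CC-BY record passes the check. The unlabelled record is rejected just like the CC-NC one, because the robot cannot tell what it is allowed to do with it.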
- Klaus Graf Says:
April 10th, 2008 at 1:49 pm
I found it was not a good idea by Harnad to choose the same colors as in the “road” metaphor. The last comment shows it is indeed confusing.
* Green road: Self-Archiving in Repositories
* Golden road: OA Journals
* Green OA: cost-free Access (PMR in an earlier post: FREE access)
* Gold OA: Access without Permission Barriers (preferably CC-BY) – (PMR: OPEN access)
These are independent aspects. Most golden road journals (in DOAJ) are access-green, and CC-BY contents in green road IRs are access-golden.
PMR: Klaus seems to use the terms Green OA and Gold OA in the way I did and also seems to differentiate Colour-road (how something got there) from Colour-OA (what you can do with it). This seems to conflict with ChrisR and PeterS.
- Peter Suber Says:
April 10th, 2008 at 3:46 pm
Hi Peter: Chris is right. There are two distinctions here and we shouldn’t mix them up. One distinction is between green and gold OA, or between OA through repositories and OA through journals. The other is between removing price barriers alone and removing both price and permission barriers. I think you meant to say that removing price barriers is not enough – and I agree with that 100%. But green OA *can* be enough.
Some green OA removes both price and permission barriers, and some gold OA does as well. But also note the converse. Just as some (perhaps most) green OA doesn’t remove permission barriers, some (perhaps most) gold OA doesn’t either. When we work for the removal of permission barriers, we are working to improve both green and gold OA.
PMR: I accept this definition as coming from the fountain of Open truth. Now for the implications (and see if I have learnt OA-101):
- “some green OA removes both price and permission barriers”. This means that an author publishes in a subscription journal (i.e. you can only read it if you pay) BUT the journal allows the author to self-archive the article and release it under a licence where anyone can read it for free and anyone can redistribute it without permission. I think it happens when authors shout loud enough or for special issues, and it also happens in disciplines like computer science where everyone republishes their articles with or without permission. But in general it isn’t common and it is of very little practical use (if only because of the difficulty of discovery). It’s of no use for data-mining unless (highly unlikely) the author actually attaches CC-BY or similar.
- “some gold OA does as well”. In my experience – which is limited as I am a chemist and there are essentially no examples in chemistry – all major Gold OA removes permission barriers. I’m thinking of BMC and PLoS and OUP. They all have CC-BY. There are some journals that use CC-NC and I have argued the case with some, but in general this is a minor concern. So which major Gold OA journals forbid re-use? (We should exclude the awful hybrid journals which take money off authors for less than permissionFree). If an author has paid money for OA, which journals forbid their readers to re-use the article?
- “…perhaps most) green OA doesn’t remove permission barriers”. I agree with this.
- “…most) gold OA doesn’t either”. I’m disappointed if this is the case.
My conclusion is that the terms Green and Gold seem to me to be highly confusing and operationally almost useless for a reader. The reader doesn’t care how the material got there – they need to know what they can do with it. For that there has to be a simple set of labels and CC-* provides that.
Finally a word about why it is essential that the NIH continues to mandate deposition in PubMed Central. (Stevan Harnad has argued that it would be better for authors to self-archive in their institutional repositories). Note that many authors – e.g. from industry – don’t have IRs anyway. But the main point is that with self-archiving it is completely impossible to discover and systematically mine this information. Let’s assume there are ca. 60,000 articles deposited in PMC this year, and that there are ca. 10,000 institutions involved. (Even if it’s only 1000 my argument holds). If I want these I have to set up my own list of 10,000 repositories and trawl the lot – every day – for new content. (And I want it daily). And every other text-miner has to do the same. How do I know when a new institution publishes? I have to go to PubMed anyway, so I might as well read the material there. And the compliance will be awful. The NIH cannot check 10,000 sites on a regular basis. In contrast, if the stuff is in PMC (or UKPMC) then I can get a single RSS feed daily which will alert me to the material that comes in. The robots have no trouble trawling this. PMC will presumably alert me to what is minable and what – thanks to the publishers – is not. So I am afraid that self-archiving is a complete non-starter.
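A back-of-envelope sketch of the arithmetic above, using only the illustrative figures in this post (ca. 60,000 articles a year, ca. 10,000 institutions, or 1000 in the smaller case, and one check per source per day). The function and variable names are mine, and the numbers are my rough guesses rather than measured data.

```python
ARTICLES_PER_YEAR = 60_000   # rough guess from the text
DAYS_PER_YEAR = 365


def polls_per_year(sources, days=DAYS_PER_YEAR):
    """Requests one text-miner makes in a year when checking every source daily."""
    return sources * days


def polls_per_article(sources):
    """Average number of daily checks made for each article actually found."""
    return polls_per_year(sources) / ARTICLES_PER_YEAR


distributed_10k = polls_per_year(10_000)   # one check per institutional repository
distributed_1k = polls_per_year(1_000)     # the "even if it's only 1000" case
central = polls_per_year(1)                # a single daily feed

print(distributed_10k, distributed_1k, central)
print(round(polls_per_article(10_000), 1), round(polls_per_article(1_000), 1))
```

Under these assumptions each text-miner makes 3,650,000 repository checks a year against 365 feed requests, and an average repository yields only about six articles a year, so nearly all of the daily checks find nothing new. Even in the 1000-institution case there are roughly six checks for every article found.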
April 10th, 2008 at 10:45 am
I don’t think [PMR’s definition] is right at all. Wikipedia says:
“In OA self-archiving (also known as the “green” road to OA [6] [7]), authors publish in a subscription journal, but in addition make their articles freely accessible online, usually by depositing them in either an institutional repository[8] (such as the Okayama University Digital Information Repository[9]) or in a central repository[10] (such as PubMed Central)…
“…In OA publishing (also known as the “gold” road to OA [14]) authors publish in open access journals that make their articles freely accessible online immediately upon publication. Examples of OA publishers[15] are BioMed Central and the Public Library of Science.”