Green OA and Open Data – more

Peter Suber has responded very quickly to my clarification of the connection – or lack of it – between Green OA and Open Data. He has provided some very useful additional information, and I think we are in more or less complete agreement. I reproduce his response and then comment further. (Note that this has nothing to do with the strong/weak OA discussion of 2-3 weeks ago.) I’ll start by saying that the word “irrelevant” was probably a poor choice and I’ll try to choose another one below.

Green OA and open data
[PMR response snipped…]
PS Comments

  • First, I generally agree with PMR’s opening characterization of green OA. I’d only add that we should distinguish green OA itself from the strategy proposal (which I do not endorse) to slow down on the pursuit of open data until we succeed with open texts. As usual, I think we should proceed on all fronts at once. I generally agree as well with PMR’s understanding of the state of open data in OA repositories. But in describing this state, I’d put the accent in a different place.
  • It’s true that most OA repositories today are optimized for texts and not optimized for data. It’s also true that few institutions (universities, funders, publishers) encourage or require the deposit of data files in repositories. Finally, it’s true that most OA repositories will accept data files, even if few researchers are depositing data files. With this background, my response reduces to two quick points:
    1. First, it doesn’t follow that green OA is “irrelevant” for open data, merely that we are under-using the opportunities it provides for open data. We shouldn’t confuse researcher practices or institutional policies with repository capacities or green OA. If under-using an opportunity made it irrelevant, then conservation would be irrelevant to climate change and green OA would be irrelevant even to text files.
    2. Second, we have a long way to go to make most repositories as useful for data files as they are for text files. But it doesn’t follow that green OA is irrelevant or harmful for open data, merely that its capacity to help users do useful work with OA data files must continue evolving.

There are many projects trying, in many different ways, to make green OA even more relevant and useful for data than it is now, e.g. by increasing data deposits in repositories and allowing fuller use of data already on deposit. For example, see ASSDA (from ANU), CESSDA (from NSD), Commons of Geographic Data (from the U of Maine), DANS (from the Royal Netherlands Academy of Arts and Sciences), LEAP (from AHDS), LinkingOpenData (from W3C), Pangaea (from a coalition of German research institutions), and StORe (from JISC).

PMR: I agree with all this and thank PS for the list of institutions encouraging data. I’ll try to rephrase:

  • CC-BY and BBB-compliant OA necessarily support Open Data. This is a logical coupling. If you publish an article in this way you automatically give the world permission to re-use it (and, by implication, its associated files).
  • “strong OA” *may* go some way to supporting Open Data. It is possible that some licences will allow re-use. Note that Non-commercial use is incompatible with Open Data and Open Knowledge as currently defined, and it is possible that some “strong OA” sites offer full removal of permission barriers. Even without full removal, the removal of *some* barriers at least points in the right direction and alerts the community (author/reader/repositarian) to the fact that there are barriers and that some people want to remove them. It’s not logically coupled, but there may be some empathy with Open Data. It’s also possible (though I have no evidence) that strong OA offerings are more conscious of the non-copyrightability of factual data and the value of adding supplemental files to web sites.
  • gold OA is often paid for. Gold OA may remove none, some or all permission barriers. Only the last of these leads to Open Data. Since Gold OA requires the author or the funder to pay money, both of them should think hard about what they are getting. PeterS has often noted that certain agreements give authors and readers very little more than what Green OA offers, even though the author has paid a lot of money. An example is that some publishers expose fulltexts after an initial period (say 2 years), after which Gold OA and Green OA lose any advantage over the freely accessible text.
  • hybrid offerings. I have a special concern about hybrid publications (where a journal can contain both “OA” and “not-OA” articles). There was a rash of these last year and they did not impress me – poor value for money, poor clarity of presentation, little additional exposure for authors, poor quality of access to data, etc. It’s not surprising that – at least in chemistry – there has been almost no take-up. I reviewed a random selection of these offerings in this blog and felt that funders were paying a lot for relatively little. Admittedly I was taking BBB as a baseline and downgrading anything weaker. In principle a hybrid offering could offer almost nothing other than visibility and charge a lot. The main value that hybrids have over green OA is that they are discoverable on the publisher’s site, which may be important.
  • green OA. Although logically there is no reason that data cannot be deposited, green OA offers no logical and little social encouragement to do so. If, tomorrow, the whole world had adopted green OA but re-use was strictly forbidden, then it would be of little use to data-rich sciences. That is a caricature, but there is no doubt that BBB, with its insistence on re-use, is of much greater practical value in many subjects.

So, to rephrase:
Open Access and Open Data are logically coupled if and only if the particular flavour and expression of OA requires the removal of all permission barriers. Many people and institutions may indeed jointly promote OA and OD where there is no logical connection; however, anything less than BBB may only encourage, not demand, Open Data. Apart from BBB-compliance, Open Access has been largely decoupled from the requirement to honour factual data as free from copyright and other permissions.
Open Data cannot and should not rely on progress in Open Access to promote its cause. And, indeed, there may initially be cases where we can insist on Open Data for supplemental data, and for data embedded in fulltext, without being able to achieve any remission of permission barriers on the full text itself.
