A number of people have commented on my concern about the re-use of Open Data and suggested that I have put unreasonable restrictions on it. I show two comments and then refer to Klaus Graf who has, I think, put the position very clearly.
Two comments:
PMR: CrystalEye is a highly complex system, not initially designed for redistribution. It contains probably 3 million files and many hundreds of gigabytes. If each file is spidered courteously (i.e. pausing after each download and using only a single thread) it could take 10 million seconds, which is nearly 4 months. During that time the database will have grown by 10-15%, so that percentage of links will ipso facto be broken. So any redistribution will involve distributing a broken system. Conversely, if the whole DB is zipped into a 100GB file, downloading that is likely to break the server and the connection. So we have to create a sensitive and manageable process.
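To make the arithmetic above concrete, here is a minimal sketch in Python. The 3 million files and the 10 million seconds come from the estimate above; the per-file time of about 3.3 seconds is simply what those two figures imply, and the function names (`crawl_days`, `polite_crawl`) are hypothetical, not part of CrystalEye or any existing tool.

```python
import time
from typing import Callable, Iterable, List

SECONDS_PER_DAY = 86_400


def crawl_days(file_count: int, seconds_per_file: float) -> float:
    """Days needed to fetch every file sequentially, one at a time."""
    return file_count * seconds_per_file / SECONDS_PER_DAY


def polite_crawl(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    pause_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bytes]:
    """Fetch URLs one by one in a single thread, pausing after each download."""
    results = []
    for url in urls:
        results.append(fetch(url))
        sleep(pause_seconds)  # rest before the next request
    return results


# 3 million files (figure from the post) at an assumed ~3.3 s per file,
# i.e. download time plus pause, which is what 10 million seconds implies.
estimate = crawl_days(3_000_000, 10_000_000 / 3_000_000)
print(f"{estimate:.0f} days")
```

Under these assumptions the crawl takes roughly 116 days, which is why any copy made this way would already lag the live database by the time it finished. The `fetch` and `sleep` arguments are passed in so the loop can be tried out without touching a real server.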
The data are Open and you can legally do almost anything other than claim you were the progenitor. That’s what Open means. But some of the things you can legally do are antisocial and we are asking you not to do them. Failing to respect the “integrity of the work” may not be illegal but it can be regarded as antisocial. The licences do not manage this.
Klaus Graf:
http://www.earlham.edu/~peters/fos/2007/11/whether-or-not-to-allow-derivative.html
I disagree with Peter Suber and agree with PLoS and its position:
The Creative Commons web site explains the meaning of “no derivative works” as follows: “You may not alter, transform, or build upon this work”. This is not open access.
It’s a clear misinterpretation of Budapest when Suber cites the definition as an argument that derivative use isn’t allowed:
The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.
To control the integrity is a moral right and has nothing to do with a license formula. It’s the same as the “responsible use of the published work” in the Berlin declaration, which explicitly allows derivative works.
Harnad is denying the need for re-use. Suber has often argued for the reduction of PERMISSION BARRIERS and his personal preference for CC-BY is honest, but his opinion that CC-ND is compatible with BBB and also OA is absolutely disappointing. And it’s false too.
PMR: I agree with Klaus. I believe that PERMISSION BARRIERS must be removed. Whatever the moral arguments about permission barriers, I think there are also utilitarian ones. Open Access and Open Data are sufficiently complex already that differential barriers are counterproductive: they confuse people. There is also enough evidence that many publishers pay lip service to OA by producing overpriced, substandard hybrid products. If CC-ND is seen as OA then it is easy for the publishers to claim that any visible document is OA. There must be clear lines and I think CC-BY is where they are.
(And yes, I have asked that my licence on this blog be changed to CC-BY.)
November 2nd, 2007 at 2:31 pm: […] In this case, for CrystalEye, you have people asking you for the data. They are OpenData, but now your concern over forking appears to be the problem with sharing the data. I wish you luck resolving this so that we can access the data. Otherwise we will initiate our scraping as you suggested and it will fork anyway.
November 2nd, 2007 at 7:28 pm: It boils down to the question of how truly “OPEN” are those open data, Peter, when you start expressing concerns about sharing those data, i.e. the discussion about forking.