Shared data? Open data?

Heather (Research Remix) asks a key question:

15:49 18/07/2007, Heather Piwowar,

Quick wondering. My research is on data re-use. I struggle with what to call the source datasets. I’d like to call them “open data” but they aren’t, necessarily. Sometimes not free, and usually not open in a licensing sense. I’ve been calling them “shared data” which seems ok, but isn’t mainstream and so doesn’t help link the work in to others who are perhaps interested in the same ideas. Publicly-available data? Even more unwieldy.
I’m on the lookout for a better phrase. Let me know if you have any suggestions?

It’s very clear from recent explorations into mainstream publishers (see many posts on this blog) that the English language is broken for accurate description of right-to-access and right-to-use. “open” “access” “free” “read” are all essentially Humpty Dumpty words.

“When I use a word”, Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean — neither more nor less.”

WP describes this as “he discusses semantics and pragmatics with Alice.”
We cannot and must not continue to use common English (or any other natural language) to describe what we mean in access-to and reuse-of data. Bill Hooker writes:

“Open Access” is not a marketing phrase and you are not free to use it as you see fit.

I remember going to a presentation by a closed source software manufacturer who described their system as “Open architecture”. When I asked if I could see the documentation they said no and I was (correctly) informed that he and I meant different things by “open”. He meant that if a customer bought a product they now got documentation telling them how they could access the functionality – i.e. it was no longer a “black box”. Obviously I overload “open” in a software context to mean “Open Source”.
The problem has been largely (but not completely) solved in software. If I am told something is “Open Source” I immediately ask “what licence?”. I can then go to the Open Source Initiative and find out the terms and conditions of the licence. This is our only way forward.
We therefore have to put precise labels on our research output – initially papers and data. I had (naively) thought that the relative lack of progress from publishers was inertia and ignorance and that when it became clear this was necessary they would accept the challenge of describing their output more clearly. It is now clear this will not happen and that the publishers (apart from the aggressive Open Access publishers) are part of the problem, not the solution. Publishers copyright data that does not belong to them. Publishers cut off subscribers who try to download data. Publishers blaze around “free” “choice”, etc. which confuse rather than inform. For a publisher “open” and “free” are to be used like “low fat” “energy food” “healthy” as a way of legitimising current practice.
Heather, the solution lies with Science Commons (a project of Creative Commons) and the real Open Access publishers. (Classic Creative Commons is the right philosophy but the licences were created for creative works, not scientific data, and the licences don’t fit very well. However I would far rather see a CC-BY on a table of melting points than Copyright Wiley.)
To all authors out there who wish their data to be re-used and not owned and resold by a publisher, just add CC-BY to your data. It’s not perfect but it works.
I shall return to the actual implicit and explicit licences of publishers in a post in the near future.

Posted in data, open issues | 1 Comment

Moderatorial

Occasionally I write a “moderatorial” – a commentary on any list or blog I am running.
When I started this blog I had no idea where it would go – programming, puzzles, diversions. Over the last two-three months it seems to have concentrated on problems in doing data-driven science – the ChemZoo episodes, the hybrid access publishers, etc. I am clear the world is on the brink of enormous things with data-driven science and various forces are dragging it back into the twentieth century. Which is why I feel strongly and use robust language.
Occasionally this upsets people deeply (not all mail is public) and this concerns me – I do not want to give the impression that I do not listen or may be on a soap-box. But I have been very upset by the failure of the closed access publishers to address hybrid access positively (and I shall blog on this later). I have many friends and collaborators in the publishing industry and we are working closely with several of them – including Nature –  but at the same time I am opposed to many of their industry’s practices – I shall summarise in a later post.
Sometimes language which is not in itself upsetting can be misread (I have a wonderful story which I shall relate later of a misunderstanding in the Moo in Diversity University – and there is a slight element of this here). So although I regret upsetting people – I said what I felt needed to be said. I thank Bill Hooker for commenting – which says what I feel:
BILL:

Peter Murray-Rust recently pointed to Paul Wicks’ (Nature Networks) blog article, “Is Publisher-Lead “open access” a swindle?“, which refers to PMR’s recent blog series on publisher licensing and permissions barriers in hybrid OA models. In comments on Paul’s entry, Jennifer Rohn pointed out

The two dedicated open-access publishers (BioMed Central and Public Library of Science) don’t have these problems. People who want to ensure their articles are truly going to be open access, published by companies who have put real thought into the publishing as well as business model, might want to look there.

PMR quoted that comment, to which Maxine Clarke replied (in a comment on PMR’s entry) with what looks for all the world like classic publisher anti-OA FUD:

Hello, I declare conflict of interest as I am an editor at Nature, not in itself open access but our publisher has many open access projects and products.
In response to Jennifer’s point: I agree that BMC has got an OA publishing/business model and indeed business, but the PLOS model is dependent on a large grant from a charitable foundation, so the jury is still out (in my opinion). As an editor I am concerned about the archiving and the preservation of the scientific record, for example.

I note the commendable upfront COI declaration and state for the record that I do not think Maxine was consciously engaging in FUD. It is nonetheless standard operating procedure for OA opponents to link PLoS to “charity” and cast vague aspersions on the ability of OA publishers to maintain the scientific record. PLoS was intended as a flagship-cum-icebreaker for OA; breaking even financially was always a secondary objective. Nay-sayers about the viability of OA in business are invited to explain the success of (at least) BioMed Central, Hindawi and Medknow. Persons who wish to claim that OA puts the record at risk are invited to explain how a proprietary archive in the hands of a for-profit publisher is safer than PubMed Central or the wide network of repositories linked by OAI-PMH. (Again, I don’t think Maxine was making such anti-OA claims, but it bears pointing out that what she did say contains clear echoes of standard FUD.)
Peter MR’s response to Maxine’s comment was this entry, in which Peter sets out to find the “many open access projects and products” and gets no further than did Jonathan Eisen, who praised the establishment of Molecular and Systems Biology (NPG’s only OA journal) only to find that in fact the MSB license is the same as CC-BY-NC-ND, which is far too restrictive to call itself OA. As Chris Surridge (of PLoS) puts it in comments on Jonathan’s entry,

‘Free Advertising’ isn’t ‘Open Access’ in my book.

Maxine had this to say:

Nature Precedings, several database publications, Nature Reports publications (3), Nature Network, Scintilla, online daily news service, gateways, blogs, many individual articles and collections of articles are freely available (“projects and products” as I mentioned in my comment to your earlier post. MSB is to my knowledge NPG’s only formal open access journal.)

Peter responded with another post, giving the necessary background and pointing out that, excepting MSB,

…the rest of [Maxine’s] list completely muddies the “open access” debate. If Nature believe that “open access” applies to any freely visible information on their site, most not peer-reviewed, many without licences and many with the publisher’s copyright, then they are making my life much harder.

This is clear and unexceptionable in the context of Peter’s ongoing quest for clarity in publisher OA-related policies. That context, or at least its existence and importance to the entry in question, was made clear by the entry itself, and I take ordinary netiquette to involve being familiar with an ongoing conversation before taking part. Nonetheless, Maxine again:

frankly I was not responding to anything you have written in the past few weeks, I was responding to your request to give examples of NPG’s “open access” or “free” material.

This is weak at best. Peter asked for “pointers to [Nature’s] open access products and the licences which they carry”; see also netiquette, ongoing conversations and. Claims of a limited response made in ignorance of context are either disingenuous or, if made in good faith, still no excuse.
Maxine continues:

It is your perogative to define terms however you like, but not your perogative to enforce other people to use the same definitions – I know what I mean by “open” or “free” content and I don’t need to be told off by you for having a different definition to whatever your definition is

I don’t know and I don’t care what Maxine means by “open” or “free”. I care what the BBB Declarations mean. Peter is not defining terms however he likes; he is working with published, widely accepted definitions. He is well within his rights to expect that other people will indeed use the same definitions: that is, after all, the point of having developed and published them. Nature does NOT have “many open access projects and products”, it has one (barely) OA journal and the excellent Precedings, together with a number of commendable free-to-read initiatives (blogs, Nature Network, the various free-to-read web special collections, etc). “Open Access” is not a fuzzy buzzword that Maxine is free to define as she sees fit, and if she is going to start abusing it as marketing for Nature then she most certainly does need telling off. Peter has apologized for being “over-brusque”, which is a handsome gesture but in my opinion no such apology was called for.

PMR: …

Posted in open issues, Uncategorized | Leave a comment

Apology to Maxine Clarke and Nature

I have been over-brusque and apologize to Maxine Clarke who has pointed out:

I understand from your post above that you feel my response listing open publications and products is too fuzzy and does not match with what you have been writing in the past few weeks, but frankly I was not responding to anything you have written in the past few weeks, I was responding to your request to give examples of NPG’s “open access” or “free” material. I think you are blowing my response out of proportion because it did not happen to fit into however you have been defining the terms of the discussion. It is your perogative to define terms however you like, but not your perogative to enforce other people to use the same definitions – I know what I mean by “open” or “free” content and I don’t need to be told off by you for having a different definition to whatever your definition is — similarly, you are welcome to your own views and I shall not castigate you for them in a blog posting ;-)

I read her comment as replying to the large amount of material I have posted here illustrating the problems of deciding what is meant by “open access”. (This matters because it is something that many publishers charge authors for.) The uncritical use of “open access” to define a wide range of information products in the industry is very unhelpful. Maxine was unaware of my previous and continuing concern about the need to define this precisely – I was unaware that she was not continuing the debate.
Occasionally this issue raises strong feelings 🙂

Posted in Uncategorized | 5 Comments

Travels of the Blue Obelisk Greasemonkey

While researching Open Access I visited the TOC for Nature’s MSB. (In passing, none of the articles are flagged in the TOC as Open Access, though they all actually carry a CC-licence and the journal masthead announces that this is an OA journal.) But that’s not the point of this story. Here is what I actually saw:
[Screenshot: grease.PNG]
What are the Pg and Cb all over the TOC? When you bring up the page they aren’t there! What’s happened? Well, the chemical blogosphere has posted about several articles here and mentioned their DOIs. The Blue Obelisk has developed a Greasemonkey script (Greasemonkey is a Firefox plugin) which reads the TOC and sees if any DOIs have been mentioned in the Chemical Blogosphere. And, in this case, three articles have been. If you mouse over them you will see the first few lines of the blog post (blue box). And clicking on the links takes you to the blog post. So if you want to see what the world thinks of your paper (and that is so much more immediate than the citation system) install the Greasemonkey script – it’s easy. (It’s at your own risk – Greasemonkey used to have a security hole, but that has been fixed.)
Cb is Chemical Blogspace. Pg is Postgenomic.
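The lookup described above can be sketched roughly as follows. This is not the Blue Obelisk's actual script (which runs in the browser); it is a minimal Python sketch of the idea, and the DOI pattern, the `blog_index` mapping and all sample DOIs and URLs are hypothetical, made up for illustration.

```python
import re

# Loose pattern for DOIs such as 10.1000/example.123. This is an assumption:
# real DOIs can contain other characters, so a real script would need more care.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_dois(toc_text):
    """Return the distinct DOIs found in a table-of-contents page, in order."""
    seen = []
    for match in DOI_PATTERN.finditer(toc_text):
        # Drop punctuation that commonly trails a DOI in running text.
        doi = match.group(0).rstrip(".,;)")
        if doi not in seen:
            seen.append(doi)
    return seen


def annotate(toc_text, blog_index):
    """Map each DOI in the TOC to the blog posts that mention it."""
    return {doi: blog_index[doi] for doi in find_dois(toc_text) if doi in blog_index}


# Hypothetical data: a fragment of a TOC and an index built from blog aggregators.
toc = """
Article one. doi:10.1000/example.001
Article two. doi:10.1000/example.002
"""
blog_index = {
    "10.1000/example.002": [("Cb", "http://blog.example.org/post-about-article-two")],
}

for doi, posts in annotate(toc, blog_index).items():
    for source, url in posts:
        print(f"{doi} mentioned on {source}: {url}")
```

Only articles whose DOI appears in the index get an icon; everything else on the page is left alone, which is why the TOC looks unchanged for articles nobody has blogged about.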
And whenever I see these icons I get a sense of the Blue Obelisk community.
This is yet another example of how the blogosphere is generating new forms of scientific reporting, criticism and review. The greasemonkey will help to change the way we report science.

Posted in blueobelisk | 9 Comments

"open access products" at Nature obscures the debate

In a recent post Why Open Access metrics are necessary – July 16th, 2007 I quoted Paul Wicks (Is Publisher-lead “open access” a swindle?) where he detailed how the obscurity of language and procedures in closed access publishing could lead to authors paying high charges for services that added little to their current rights (i.e. to post preprints or postprints without charge). [Note that not all publishers allow these rights.] He puts it very clearly:

Anyway. The current state of affairs seems to be this: publishers are worried about OA and have cobbled together business models that support generating revenue in other ways that the typical subscriber model. However, they don’t appear to have put much thought in to the publishing model.

In reply Jennifer Rohn said:

The two dedicated open-access publishers (BioMed Central and Public Library of Science) don’t have these problems. People who want to ensure their articles are truly going to be open access, published by companies who have put real thought into the publishing as well as business model, might want to look there.

to which Maxine Clarke (Publishing Executive Editor of Nature) replied:

Hello, I declare conflict of interest as I am an editor at Nature, not in itself open access but our publisher has many open access projects and products.
In response to Jennifer’s point: I agree that BMC has got an OA publishing/business model and indeed business, but the PLOS model is dependent on a large grant from a charitable foundation, so the jury is still out (in my opinion). As an editor I am concerned about the archiving and the preservation of the scientific record, for example.

and to which I replied (Open Access publishing at Nature) asking for details of these “open access products” and Maxine has replied:

Nature Precedings, several database publications, Nature Reports publications (3), Nature Network, Scintilla, online daily news service, gateways, blogs, many individual articles and collections of articles are freely available (“projects and products” as I mentioned in my comment to your earlier post. MSB is to my knowledge NPG’s only formal open access journal.)

Before commenting I recap that my posts on this blog over the last two weeks have been to find out what publishers mean by “open access”, whether this is clearly defined in their public pages, and whether it is (in my opinion) consistent with “Open Access” as defined by BBB. Beyond that I made no comment on publishers’ business models, the rightness or wrongness of closed access. Simply whether the position was clear, i.e. whether there was a clear publishing model. My concern with almost all the closed access publishers in chemistry was that their publishing models are awful. It is almost impossible to know what is going on. By contrast some (but not all) Open Access publishers have very clear publishing models and Jennifer quite rightly lauds BMC and PLoS for their publishing models.
Nature has, of course, an excellent reputation as a publisher of science – i.e. the scholarly process of review and publication. It also has a good reputation in science journalism – i.e. reporting on science, but not as part of the peer-reviewed process. For example I was grateful to Emma Marris at Nature for reporting my concern on the ACS’s attempt to have PubChem (sic) restricted from publishing chemical structures.
I am desperately trying to get clarity in assessing the current practice of “open access”. I had also hoped to move on to Open Data. I had assumed that I would find it easy to get reliable information from the publishers, from their sites, their practices and their comments. It has been awful. I omitted Nature from my studies (which I and the Blue Obelisk some day hope to publish in a peer-reviewed journal) because they publish no “open access” chemistry, even through a hybrid scheme.
Maxine’s comments are fuzzy and do Nature’s reputation no credit. She criticizes PLoS’s business model which is irrelevant to their publishing model. Her remarks could be read as implying that PLoS are incompetent in preserving the scientific record. I do not understand this – I assume that all PLoS papers are archivable by institutions, individuals or abstracters such as PubMed. But in any case this argument is motivated not by scholarship or journalism, but by marketing.
She then goes on to list “open access products” above. (I omit MSB from my studies as it is not chemistry, but I am prepared to accept the assertion of Jonathan Eisen that it is CC-BY-NC-ND. At least there is a licence which is clear and we can debate whether this is “Open Access”, “open access” or “free access”.)
But the rest of the list completely muddies the “open access” debate. If Nature believe that “open access” applies to any freely visible information on their site, most not peer-reviewed, many without licences and many with the publisher’s copyright, then they are making my life much harder.
I had hoped for objectivity and possibly even help from a major publisher which has, in the past, commented responsibly on “Open Access”. Now I get what in the UK is called “spin”.

Posted in open issues | 2 Comments

Can Open Data be manipulated?

Chris Rusbridge – who runs the Digital Curation Centre – has raised the question of whether making data Open increases the risk of fraudulent manipulation of content:

Open Data… Open Season?

Peter Murray Rust is an enthusiastic advocate of Open Data (the discussion runs right through his blog, this link is just to one of his articles that is close to the subject). I understand him to want to make science data openly accessible for scientific access and re-use.
PMR: Correct!
It sounds a pretty good thing! Are there significant downsides?
Mags McGinley recently posted in the DCC Blawg about the report “Building the Infrastructure for Data Access and Reuse in Collaborative Research” from the Australian OAK Law project. This report includes a substantial section (Chapter 4) on Current Practices and Attitudes to Data Sharing, which includes 31 examples, many from the genomics and related areas. Peter MR wants a very strong definition of Open Access (defined by Peter Suber as BBB, for Budapest, Bethesda and Berlin, which effectively requires no restrictions on reuse, even commercially). Although licences were often not clear, what could be inferred in these 31 cases generally would probably not fit the BBB definition.
PMR: Although BBB is the most straightforward philosophy for data re-use it is not the only approach. I am promoting it at present because I feel that a large number of scientists create data which they would like to be made available under – say – a CC-BY licence. But I fully accept that in some disciplines re-use of data may have to be governed by additional principles, especially where it has to support regulatory processes or involves human data.
However, buried in the middle of the report is a cautionary tale. Towards the end of chapter 4, there is a section on risks of open data in relation to patents, following on from experiences in the Human Genome and related projects.

“Claire Driscoll of the NIH describes the dilemma as follows:
It would be theoretically possible for an unscrupulous company or entity to add on a trivial amount of information to the published…data and then attempt to secure ‘parasitic’ patent claims such that all others would be prohibited from using the original public data.”

(The reference given is Claire T Driscoll, ‘NIH data and resource sharing, data release and intellectual property policies for genomics community resource projects’ Expert Opin. Ther. Patents (2005) 15(1), 4)
The report goes on:

“Consequently, subsequent research projects relied on licensing methods in an attempt to restrict the development of intellectual property in downstream discoveries based on the disclosed data, rather than simply releasing the data into the public domain.”

They then discuss the HapMap (International Haplotype) project, which attempted to make data available while restricting the possibilities for parasitic patenting.

“Individual genotypes were made available on the HapMap website, but anyone seeking to use the research data was first required to register via the website and enter into a click-wrap licence for the use of the data. The licence entered into, the International HapMap Project Public Access Licence, was explicitly modeled on the General Public Licence (GPL) used by open source software developers. A central term of the licence related to patents. It allowed users of the HapMap data to file patent applications on associations they uncovered between particular SNP data and disease or disease susceptibility, but the patent had to allow further use of the HapMap data. The licence specifically prohibited licensees from combining the HapMap data with their own in order to seek product patents…”

Checking HapMap, the Project’s Data Release Policy describes the process, but the link to the Click-Wrap agreement says that the data is now open. (See also the NIH press release.) There were obvious problems, in that the data could not be incorporated into more open databases. The turning point for them seems to be:

“…advances led the consortium to conclude that the patterns of human genetic variation can readily be determined clearly enough from the primary genotype data to constitute prior art. Thus, in the view of the consortium, derivation of haplotypes and ‘haplotype tag SNPs’ from HapMap data should be considered obvious and thus not patentable. Therefore, the original reasons for imposing the licensing requirement no longer exist and the requirement can be dropped.”

So, they don’t say the threat does not exist from all such open data releases, but that it was mitigated in this case.
Are there other examples of these kinds of restrictions being imposed? Or of problems ensuing because they have not been imposed, and the data left open? (Note, I’m not at all advocating closed access!)

PMR: Digital curation is hard – it is one of the hard challenges of this century, and it is critical that organisations such as the DCC exist. I don’t have answers in all cases. The following may meet many requirements:

  • the author creates a definitive, signed, version of the data and reposits it in an institutional or (possibly better) domain repository. XML has a mechanism for canonicalization and signatures.
  • software is used which is able to compare any version of a document with the definitive version. XML provides this.
  • there are domain-specific repositories. This is the hard part – it costs money. However it is well served in bioscience at present.
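The first two points can be sketched in a few lines: canonicalize the XML so that insignificant differences (attribute order, quoting, indentation) disappear, then compare digests. This is a minimal Python sketch using the standard library's C14N 2.0 support (`xml.etree.ElementTree.canonicalize`, Python 3.8 or later). It only computes and compares a SHA-256 digest; a real deployment would sign that digest with something like W3C XML Signature, which is not shown here. The function names and the sample documents are invented for illustration.

```python
import hashlib
from xml.etree.ElementTree import canonicalize


def canonical_digest(xml_text):
    """SHA-256 of the canonical (C14N 2.0) form of an XML document."""
    # strip_text=True drops leading/trailing whitespace in text content,
    # so pretty-printing alone does not change the digest.
    canonical = canonicalize(xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_definitive(candidate_xml, definitive_digest):
    """True if the candidate is equivalent to the definitive version."""
    return canonical_digest(candidate_xml) == definitive_digest


# Hypothetical definitive record deposited by the author.
definitive = '<molecule id="m1" title="water"><mp units="K">273.15</mp></molecule>'
# Same data, but with different attribute order, quoting and whitespace.
reformatted = "<molecule title='water' id='m1'>\n  <mp units='K'>273.15</mp>\n</molecule>"
# Same layout as the definitive copy, but the value has been altered.
tampered = '<molecule id="m1" title="water"><mp units="K">274.15</mp></molecule>'

digest = canonical_digest(definitive)
print(matches_definitive(reformatted, digest))
print(matches_definitive(tampered, digest))
```

The harmless reformatting should compare equal to the definitive version while the altered value should not, which is the property needed to detect manipulation of an openly available copy.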

The question is similar to plagiarism. If the data are available it is easier to manipulate them. But it is also easier to detect any mistakes or fraud.
There is a challenge as to whether it is possible to create a licence which restricts the use of the data but not its dissemination. If, for example, I get the latest version of the fundamental constants (e.g. the speed of light) from NIST it is not unreasonable that I cannot change these without permission – certainly not if I want to maintain they came from NIST. So there is a role for certified immutable reference documents under a BBB philosophy. I think it should be limited.
I faced this problem when I released some of my code under a GPL licence. I normally use a non-viral one, but this was a derivative work of a GPL program. I was worried that people might make derivative works of CMLSchema, which defines CML and thereby corrupt the practice of CML. So I said that anyone can change the code, which they can, but required that they make an announcement that the result could not be considered CML. The GNU software auditors approached me and said that I could not impose this restriction under the GNU licence. I changed “require” to “request”.
So, in conclusion, Chris – I am not worried beyond the fact that I think digital curation is extremely hard, that we must support Open Data as much as feasibly possible, and that we cannot look to the commercial (publishing) sector which wishes to aggregate and possess. The inability to get closed data, however well “curated” (and I don’t believe it normally is), is far more damaging.

Posted in data, open issues, Uncategorized | 1 Comment

Open Access at Copernicus

In further travels I have come across Copernicus GmbH, “A Spin-Off of the Max-Planck-Institut für Aeronomie und Sonnensystemforschung”. This is listed in DOAJ and publishes Atmospheric Chemistry and Physics (which is of interest to me as I have a colleague who is translating legacy data into CMLReact – you can see why I want to get access to this data). The licence is clear:

Personalized Copyright under the Creative Commons Attribution, NonCommercial and ShareAlike Licence

That’s it. 15 seconds on the website is all I need. Whereas I have spent HOURS trawling through some of the other publishers trying to figure what is going on.
Since this post is so short, I can add some cost information – and I mean cost. The journal offers 6 different pricing schemes. They differ only in the technologies used to author the documents. Submit a hamburger and it costs more than to submit a cow. It represents the COST of processing the article. The worse the manuscript, the longer it takes, the greater the anger of the technical editor. It costs more. And look closely, there is a factor of 250% between the best and the worst. That means that most of the COST is in processing difficult manuscripts. As it should be.

Service Charges
Atmospheric Chemistry and Physics is committed to the Open Access model of publishing. This ensures free web access to the results of research and the maximum visibility for published papers. Copernicus Publications is also committed to fast publication with each paper published on the web as soon as possible at each stage of the review process. The Open Access model is supported by many international initiatives for scientific publishing. However it requires the author to pay the costs of the review process, typesetting, web publication and long term archiving. Atmospheric Chemistry and Physics is using automated software to keep down costs as much as possible. Author(s) can also choose how to submit text and figures in order to keep total costs to a minimum. The costs for the production, handling and distribution of printed and/or CD-ROM issues, however, are covered separately by subscriptions.
For journals with a public discussion part, such as for ACP and ACPD, the publications first in the discussion part and, second, in the main part are regarded as one publication in the “classical” sense where the evaluation is not public. Therefore, the above service charges will be levied for the publication in the discussion part, while the publication in the main part will be free of charge, if the revised article is submitted along the same category (or lower) as the original manuscript; otherwise the difference in the service charges will be levied. Please consider that a discussion paper has up to three times more pages than the corresponding paper in the “classical” two column style due to the interactive screen format for improved online reading (landscape, bigger font).
The payment of service charges includes:

  • Online manuscript registration and submission
  • Rigorous but fair peer-review, incl. public comments for Discussion Journals
  • Professional processing of figures and movies
  • Typesetting, editing and formatting in pdf LaTeX
  • LaTeX macros and WORD templates for free use
  • Immediate Open Access publication of each article
  • Article alert service
  • Inclusion in the Copernicus Online Library (incl. mirror-servers and backup facilities) as well as in international scientific databases, index and reference machines
  • Long-term e-archiving via Portico as well as printed archiving via copyright libraries worldwide.

PMR: Unfortunately the website didn’t paste into the post well and you will have to visit their site:
Essentially a manuscript prepared in their preferred TeX format costs 23 EUR per page and one sent on printed pages through the mail costs 68 EUR per page. And various flavours of Word are intermediate.

Sample Calculation
The original manuscript shows an amount of 10 pages, the author follows the guidelines according to Category 1. The Discussion Paper will then have ca. 30 pages calculated with 23,- EUR net per page = 690,- EUR + German VAT. The final revised paper in the main journal will then be free of charge.
Copy-editing Charges
For copy-editing (requested by the Editor) an additional fee will be added of:

  1. 10,- EUR net per full page in the Main Journal (peer-reviewed), and
  2. 3,- EUR net per full page in the Discussion Journal
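The arithmetic in the sample calculation can be reproduced in a few lines. This is only a sketch in Python: the function name and the fixed threefold page expansion are assumptions made here (the text says “up to three times” and “ca. 30 pages”), VAT and copy-editing fees are left out, and only the two per-page rates quoted in this post are used.

```python
def discussion_paper_charge(manuscript_pages, rate_per_page, expansion=3):
    """Net service charge in EUR for the Discussion Paper, before VAT.

    expansion is the assumed ratio of discussion-format pages to
    original manuscript pages (3 in the sample calculation).
    """
    discussion_pages = manuscript_pages * expansion
    return discussion_pages * rate_per_page


TEX_RATE = 23      # EUR per page, preferred TeX format (Category 1)
PRINTED_RATE = 68  # EUR per page, printed pages sent through the mail

print(discussion_paper_charge(10, TEX_RATE))      # 690
print(discussion_paper_charge(10, PRINTED_RATE))  # 2040
print(round(PRINTED_RATE / TEX_RATE, 2))          # 2.96
```

On these two figures the same ten-page manuscript costs roughly three times as much when sent as printed pages as when submitted in the preferred TeX format.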

As my current posts are concerned with licence clarity and not cost I shan’t comment further. But I commend Copernicus for clarity of licence (though I disagree with their choice and gently suggest CC-BY – what is there to lose?) and for a self-explanatory and compelling cost model.
And, who knows, when Hannah has developed CMLReact for atmospheric reactions, there’s an obvious first choice of journal.
And, maybe after that, there will be a category 0:
“The text file is compiled in accordance to the Technical Instructions for CML-XML as .xml and transformed into …” 15 EUR/page.
🙂

Posted in chemistry, data, open issues | Leave a comment

Open Access publishing at Nature

In these posts I am trying to be as objective as possible in that I am investigating the provision of Open Access, “open access” and the consistency of a publisher. I am not being systematic as I have been sticking to chemistry and have been through most of the major “closed” publishers who publish single subject journals. I have not looked at Nature as they publish relatively little single-subject chemistry. However Maxine Clarke has posted a comment on this blog:

Name: Maxine | URI: http://blogs.nature.com/nautilus | IP: 194.129.50.189 | Date: July 16, 2007
Hello, I declare conflict of interest as I am an editor at Nature, not in itself open access but our publisher has many open access projects and products.
In response to Jennifer’s point: I agree that BMC has got an OA publishing/business model and indeed business, but the PLOS model is dependent on a large grant from a charitable foundation, so the jury is still out (in my opinion). As an editor I am concerned about the archiving and the preservation of the scientific record, for example.

I do not moderate comments (other than the 300 spam per day) and am happy for Maxine to comment on Jennifer’s post. Jennifer can reply if she wishes.
My current task will now be to find the “… many open access projects and products.” There is no obvious masthead on Nature pointing to Open Access and I am not aware of much activity about Nature Open Access on Peter Suber’s blog (or indeed from Nature). The best I have found is Jonathan Eisen’s blog:

Wednesday, May 23, 2007

Is Nature going Open Access?

Nature and EMBO are together publishing “Molecular Systems Biology” and all basic research in this journal is Open Access. I am wondering why this has not gotten more press as it seems Nature is experimenting with OA models here. Nature has done some experimentation previously by making certain types of papers available freely (e.g., many genomics papers). But this is definitely one step beyond and they deserve massive kudos for it.
So if you are looking for a new OA journal to submit some systems biology related papers, you should try here. And maybe with a little effort, we can convince Nature it is worth doing for more of their journals.

PMR: and some comments:

Pedro Beltrão said…

I think because Nature did not yet realize that it would be good press for them to openly state it as an experiment into the open access model. Apart from the web publishing group (Timo Hannay’s group) things at Nature seem to move a bit slow.
5/27/2007 7:22 AM  
Jonathan Eisen said…
I am sure you/I would like them to move faster. But this is a good start. Better than many other publishers.
5/27/2007 7:38 AM  
Jonathan Badger said…
And to think the most absurd idea in your April Fools article was the proposed PLoN…
5/27/2007 8:30 AM  
Chris said…
This is also way more ‘open’ than many conventional publishers ‘open’ options as they are publishing under a Creative Commons licence:
“This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits distribution, and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation or the creation of derivative works without specific permission.”
Strictly speaking that is the Attribution Non-commercial licence. This is almost (though not quite) the most useful/least restrictive of the CC licences. PLoS’s licence (the full Attribution licence) says:
“This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.”
Hopefully that distinction will never become important.
5/29/2007 3:35 AM  
Jonathan Eisen said…
It seems somewhat unnecessary for them to have used that particular CC license instead of the more open one, but I am sure they have their reasons. Hey, they are much closer than most other places …
5/29/2007 6:46 AM  
Chris said…
At the risk of being overly obsessed by this I checked over at Creative Commons and the MSB licence is actually the “Attribution Non-commercial No Derivatives” licence which CC describe as
“the most restrictive of our six main licenses, allowing redistribution. This license is often called the “free advertising” license because it allows others to download your works and share them with others as long as they mention you and link back to you, but they can’t change them in any way or use them commercially.”
So only slightly better than ‘free-to-read’ and not available for repurposing such as translation to other languages, including parts in educational material, reusing figures from the papers, etc, unless you ask permission or place your faith in a ‘fair use’ defense.
‘Free Advertising’ isn’t ‘Open Access’ in my book.
5/29/2007 8:13 AM  
Jonathan Eisen said…
aargh … oh well, it was only a dream
and in reference to J. Badger’s post above … I have no idea what you are referring to when you say “my” April Fools article. I just posted it. I would never have written anything so scandalous.
5/29/2007 8:22 AM

PMR: So here we are with apparently less-than-total-freedom. I am campaigning for CC-BY (== Attribution) as the mainstream scientific license and am still trying to find out how many of the “open access” chemistry journals are CC-NC or worse. Be quite clear, CC-NC restricts science. CC-ND is worse. It destroys the re-use of scientific data.
Maxine, I have only transcribed Jonathan’s blog. I don’t know offhand which other journals NPG (or is it Macmillan?) actually publish. It’s quite difficult to find out. I appreciate the important and valuable experiments done by Timo and colleagues, and their use of Open Source (sic) code, but in these current posts I’m more interested in journals. So I’d be grateful for pointers to your open access products and the licences which they carry.

Posted in open issues | 4 Comments

Corrigendum to earlier post on access to Blackwells/IUCr

Peter Strickland from the International Union of Crystallography (with whom we work closely) has explained the problem with the confusion between “BUY” and “open access” labels on IUCr content. [Post in full:]

Name: Peter Strickland | URI: http://journals.iucr.org | IP: 192.70.242.71 | Date: July 16, 2007
Peter [MR] – the problem with the issue you tested was a technical
one with some articles being labelled “open access” that were
not intended to be (and that the access control system did handle
accordingly). Your blog was the first indication we had that this
problem existed, and we have subsequently fixed it. Note that the
journal (Section E) will become fully open access in 2008. Note
also that for the IUCr journals, it is the IUCr alone that sets
an open-access policy; you should not make any generalisations about
Blackwell from our journals alone. Thank you for highlighting the incorrect labelling.

Thanks very much, PeterS, and I apologize to you, IUCr and – on this issue – Blackwells.
I do what I can to be accurate and know that it is dangerous to draw too many conclusions from a small number of observations.

Posted in open issues

Why US citizens need to lobby the House

I am not a US citizen so cannot influence any representative about the NIH bill (see my post US citizens: please lobby for House vote on OA mandate next Tuesday). But in case you think this doesn’t matter, here’s the sort of thing publishers were saying to the UK government 3 years ago (see my last post). The extracts are from the select committee report. [Mr Robert Campbell == President, Blackwell Publishing; Dr John Jarvis == Senior Vice President, Europe, Managing Director, Wiley Europe Limited]
PMR: The chairman is Ian Gibson, MP. The first extract shows Gibson’s tenacity and the publishers’ flannelling.

(Q 1-19)
Mr Campbell: Yes, and we have put that model to several societies for whom we publish; and their publications committees are considering it. At this stage none of them have decided to take it any further. We submitted an application for the JISC programme where they have a three-year programme funding some open access experiments. We were unsuccessful with that. We are certainly looking at the model, and we have several proposals out with societies, trying to cost the impact of—
Q6  Chairman: Does this model have an impact on publishers and institutions, in your view?
Mr Campbell: We could talk about that for the next hour.
Q7  Chairman: No, you have not got an hour; you have got one minute.
Mr Campbell: As we said in our submission, we think it will have an impact. We think there is a danger that an author-paid model could lead to lower standards. It is also not popular amongst authors, less well-funded institutes or from other countries, where even a ten-dollar charge to an author would seem excessive.
Q8  Chairman: What effects would open access models have in costing terms, compared to existing publishing models?
Mr Charkin: There are many answers because there are many journals for many disciplines, and the impact will be different depending upon which discipline or which journal you are talking about. In our letter to you, speaking on behalf of Nature Publishing Group, in the case of Nature itself, the British international journal, in order to replace our revenues you would have to charge the author somewhere between £10,000 and £30,000 because the costs of editorial design and support are so high. The reason for the big disparity is how much advertising—
Q9  Chairman: Are you saying it is per article?
Mr Charkin: Per article; it is a huge price and would, I believe, be completely unsustainable because I think people would not pay that. In that particular model it is a very serious and different answer to the one that one would get for a more specialised journal.

PMR: I am becoming increasingly concerned about publishers’ statements that “Societies do not want OA”, “OA author pays does not work”. It is much truer to say that publishers do not want societies to have OA and that they have failed to try to make it work.
Later…

Dr Jarvis: One of the things that intrigues me is that there is evidence that some of the support for open access is coming from outside the research community. There are some reports of members of the public wanting to read this kind of information. Without being pejorative or elitist, I think that is an issue that we should think about very, very carefully, because there are very few members of the public, and very few people in this room, who would want to read this type of scientific information, and in fact draw wrong conclusions from it. As publishers of this very high-level, sometimes esoteric, information, when we have information that is of use to a broader audience, we make sure we use all the channels by contacting the press to make that happen. Having said that, I think the mechanisms are in place for anybody in this room to go into their public library, through inter-library loan, get access to any article they want. They can go to a machine now and press a button and see it on their screen. I don’t believe that a section of our society is excluded from seeing this information. I will say again; let us be careful because this rather enticing statement that everybody should be able to see everything could lead to chaos. Speak to people in the medical profession, and they will say the last thing they want are people who may have illnesses reading this information, marching into surgeries and asking things. We need to be careful with this very, very high-level information.
(continued Q20-)
Q20  Geraldine Smith: That is not what Dr Virginia Barbour is saying, the molecular medicine editor at the Lancet. She feels that patients should be able to access papers about their medical conditions. What are you doing to ensure that patients who are not scientists have access to quality medical journals that could help them have a better understanding of their own illnesses?
Dr Jarvis: As I say, I think the mechanisms really are in place. Members of the general public get access to an article at no cost. They might not get it immediately on their desktop screen at home, which sounds like a good idea, but there is a lot of available information which most of us need to be interpreted. You could get yourself in trouble if you wrongly interpret this kind of information, much of which is arcane.

I had not realised that one of the roles of a publisher was to stop people reading the scientific literature. Of course I see it now. Much of the published medical literature is arcane (“mysterious, secret or obscure; understood only by a few; difficult to understand.” Chambers). Well, who publishes this arcane material? And might they not feel a duty to make it less arcane?
Now that was three years ago, and isn’t it unfair to quote that? Surely they won’t say things like that to the US House, will they? No, they have hired a pit-bull PR firm to do it for them. Hopefully the US representatives are as on the ball as Ian Gibson and colleagues.
I hope now you are agitated or even angry enough to write to your representative. For details see Peter Suber’s Blog – there may be last-minute changes. And don’t feel that your voice doesn’t count. In Europe we lobbied against software patents successfully. Politicians are keenly aware that information, the Internet, etc. matter. The smart ones see that Openness is a business opportunity.
And when we win this battle – the NIH – it will have so much impact that it will be difficult to resist the change towards OA.

Posted in open issues, Uncategorized