Closed access damages peer-review

I talked today with a scientist (R) whom I meet frequently and who works in a leading bioscientific research establishment (not a University, but with Nobel laureate and FRS on the staff). In addition to their day job, R acts as a peer-reviewer for a leading bioscience journal (J). R gets about 1 paper a week from J and has to return a review within 21 days. R does not get paid for their reviewing, which can certainly run into hours per paper. Like many other scientists, R does it because it contributes to science. R and J take review seriously and J has given R training in how to review. The reviewing can be seen as a credit on R’s CV.
Some of you may not be familiar with how peer-review in scholarly journals works, so here’s a rough overview. An author (A) sends a manuscript (M) to an editor (E) at J, who decides which reviewers (R, R1, R2) should review M. R, R1… are normally asked to decide on:

  • whether M is in scope for J
  • whether it reports a significant scientific advance (i.e. not repeating work already known)
  • whether the science – as reported in M – is sound and potentially capable of being reproduced
  • whether A shows that they are aware of published work that impinges on M
  • and in many journals to give some idea of the “score” or “importance” of M. This could be “very important”, “minor advance, but useful”, etc.

R is normally dependent on:

  • the material in M.
  • the references (citations) in M to other work, C1, C2…
  • R’s knowledge of the field from meetings, conversations, reading, etc.

R must keep M confidential but is normally allowed to consult close colleagues in confidence on small matters of fact.
R relies heavily on the material in the references (C1, C2…). These can contain background material, precise recipes, data, closely argued positions, etc. Without C1, C2 it is normally impossible to review the paper.
R tells me that on average there are at least 3 references (C1,C2,C3) per manuscript (M) which are closed and to which R’s institution does not have access. R cannot review the paper responsibly without C1, C2, C3. So what should R do?

  • send M back and say they cannot review it
  • pay the cost (3 * USD30 = 90 USD) for access to these references
  • ask R’s institution to pay for access (why should they?)
  • get an interlibrary loan (takes days and anyway costs money)
  • ask a friend (me) for a copy of C1, C2, because Cambridge subscribes to these closed journals. I have to say no, because that would be a breach of copyright and whatever byzantine conditions the publishers have agreed with our library.
  • do a bad review
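
To put a rough number on the "pay the cost" option, here is a minimal sketch of the arithmetic. The 3 references and USD 30 per article come from the figures above; the 52 papers a year is my assumption, taken from "about 1 paper a week", and the variable names are purely illustrative.

```python
# Back-of-envelope cost of buying access to closed references.
# Assumption: "about 1 paper a week" is read as 52 papers a year.
PAPERS_PER_YEAR = 52
CLOSED_REFS_PER_PAPER = 3    # "at least 3 references" per manuscript
PRICE_PER_ARTICLE_USD = 30   # per-article price used in the list above

# Cost for one review, then for a year of reviewing at that rate.
cost_per_review = CLOSED_REFS_PER_PAPER * PRICE_PER_ARTICLE_USD
cost_per_year = cost_per_review * PAPERS_PER_YEAR

print(f"per review: USD {cost_per_review}")
print(f"per year:   USD {cost_per_year}")
```

Since "at least 3" is a lower bound, the yearly figure is a floor for one unpaid reviewer, not an estimate of the true cost.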

THE LACK OF ACCESS TO PUBLICATIONS HARMS PEER-REVIEW
This is a sample of one, but by mathematical induction it applies to all peer-review. Therefore the hidden cost to science of closed access is enormous. Either the reviewers pay 90 USD per paper (which I strongly doubt), or they violate copyright (unthinkable), or they do bad reviews (certainly not) or …
So, closed access publishers, by preventing reviewers (and there are tens of thousands) from reading your publications you are damaging peer-review. In a paper age this was accepted – you couldn’t read everything – or it took ages. But in the electronic age it isn’t necessary. We ought to be getting better, quicker peer-review because of e-paper.
Are we? And do you care enough to do something about it? If not Open Access (the obvious answer), WHAT?

Posted in open issues

THANK YOU LIBERTAS ACADEMICA

Avid and continual readers of this blog will remember that some of us in the Blue Obelisk have set out to monitor the posted policy and licenses of “open access” publishers or publishers which offer some “open access” products. We are going systematically, though more slowly than we would have liked (and we would be happy to have committed volunteers), through the public pages of these publishers.
Some are easy – they state simply that they offer CC-BY licenses. Others have more complex pages, which sometimes are inconsistent. We are blogging such instances – hopefully in an objective fashion – and giving the publishers the opportunity to clarify policies.
I commented factually on a journal published by Libertas Academica (“Open Access” at libertas academica). Now we have great news – they (Tom Hill) understand the issue and are making simple and positive changes to their site. I reproduce the mail in full.

Subject: Open Access at Libertas Academica
From: “Tom Hill”

To:

Dear Dr Rust,
Earlier today I read your blog entry on OA at Libertas Academica (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=415) with great interest. I regret not having done so earlier, although I came upon it as a result of an uncharacteristic bout of Saturday morning corporate navel-gazing by way of a Google search on the company’s name. I have now added your blog’s RSS feed to my bookmarks and I’m sure I’ll become a regular reader.
I very much appreciate your critique on our OA policy. It appears that in our haste to develop our journals we have neglected to make our policy, particularly with respect to copyright, as transparent as it should be.
I have therefore made the following changes:
1. We now clearly apply the CC-BY licence.
2. I have asked our web developer to remove the obsolete copyright statement from the bottom of all web pages. Given that this is Saturday, this change will probably not take place until Monday NZ time.
3. I’ve also asked said developer to fix the link between www.la-press.com/copyright.htm and www.la-press.com/authors.php?content_id=40.
I wonder if you would be willing to communicate the gist of these changes to your readers in some way? Irrespective of this, thanks for your feedback.
Regards,
Tom Hill
____________________________________________
Tom Hill
“Analytical Chemistry Insights” “Biomarker Insights” “Bioinformatics and Biology Insights” “Cancer Informatics” “Clinical Medicine: Arthritis and Musculoskeletal Disorders” “Clinical Medicine: Cardiology” “Clinical Medicine: Circulatory, Respiratory and Pulmonary Medicine” “Clinical Medicine: Oncology” “Drug Target Insights” “Evolutionary Bioinformatics” “Gene Regulation and Systems Biology” “Integrative Medicine Insights” “Perspectives in Medicinal Chemistry” “Translational OncoGenomics”
LIBERTAS ACADEMICA

This is wonderful – and I hope that many of the problems we have will turn out to be simple lack of clarity on web pages.
On our side we hope – in time – to be able to summarise the access and re-use rights of all “open access” chemistry publishers. We started with “Analytical Chemistry Insights” because it was the first in the alphabet – so if you are a publisher of chemistry listed on the DOAJ list and your journal is later in the alphabet than “A” and you wish to clarify your website before we get to you … please drop us a note.
In summary this shows dramatically the value of labels.

Posted in open issues

"open access" to data – let's be precise

In the last post (Reply from softCon on Spectra and “open access”) I report how ICSU (CODATA) use the phrase “open access”:

Here are some quotations from the ICSU report:
“…Full and open access” to data implies equitable,
non-discriminatory access to all data that are of
value for science. It does not necessarily equate to
immediate access or ‘free of cost’ at the point of
delivery, although this is certainly the ideal in many
situations, particularly with regard to publicly
funded data. Data should be made available with
minimal delay but a short ‘privileged access’ period
for original data producers may be justified in some
situations. Excessive charging for data that is by
definition discriminatory against some scientists is
clearly contrary to the principle of full and open
access but some cost-recovery is not necessarily
excluded…”
“…There are several economic models for providing
scientists with access to data for research and education.
They include, among others, (1) free and open access to
research data by scientists, with financial support for data
dissemination and preservation assumed by others,
including government science agencies and private
foundations; (2) open access to scientific data for research
and education for the cost of reproduction (that is,
recovering the operational costs of data dissemination);
(3) free and open access to metadata, and cost-recovery
pricing for data (or data licenses) in order to support the
full data infrastructure. When this last approach is
employed by a commercial company, the financial charges
for data must be sufficient to recover all investment costs
and to make a profit for investors. An important variation
on this includes licensing for scientists to use specific
bodies of data at reduced cost…”

PMR: As I said earlier, I have considerable respect for CODATA and if this is their position I know they have laboured hard over preparing it – the content, the intent and the phrasing. So they have chosen “open access” as a descriptive phrase and used it several times. However they make it quite clear that this does not necessarily mean “toll-free”.
In another branch of ICSU, ICSTI, however, it is very clear that “open access” is used in the sense of BOAI.
Oh dear.
We have a major and committed organisation using phrases in a completely confusing way. So it is not, perhaps, surprising that we do not always make ourselves clear to each other. And occasionally world views collide and raise the heat of debate.
What should we do? I think we have to be more precise about what we are talking about. We need to devise labels that we understand.  And that will be the theme of later posts here. But I wonder if CODATA/CSPR might not consider removing the phrase “full and open access” from its sponsored pay-to-view databases.
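
As one possible illustration of what a more precise label might look like, here is a minimal sketch. The two flags (toll-free and permission-free) are the ones I use elsewhere when describing Open Data; the class name `AccessLabel`, the function `is_boai_open` and the two example records are hypothetical and are not taken from CODATA, ICSU or BOAI documents. `None` means "not stated".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessLabel:
    """Hypothetical label for a resource; None means 'not stated'."""
    toll_free: Optional[bool]        # no payment needed to access
    permission_free: Optional[bool]  # no need to ask before re-use


def is_boai_open(label: AccessLabel) -> bool:
    """True only if both properties are explicitly stated as True."""
    return label.toll_free is True and label.permission_free is True


# Illustrative records: a subscription spectra service and a
# free-of-charge metadata service whose re-use terms are not stated.
spectra_service = AccessLabel(toll_free=False, permission_free=None)
literature_service = AccessLabel(toll_free=True, permission_free=None)

print(is_boai_open(spectra_service))
print(is_boai_open(literature_service))
```

The design choice here is that an unstated permission counts as "not open": a label is only useful if the reader does not have to guess.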

Posted in data, open issues

Reply from softCon on Spectra and "open access"

In recent posts (Request for CODATA definition of Open Access and http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=445) I was concerned about the use of “open access” to describe a pay-to-access database. I had a very useful and constructive reply from softCon about the “open access” database of spectra which I append below, together with my reply. I will comment in a later post…
PMR: Many thanks for your prompt, full and constructive reply. I hope I can make a similarly constructive response. I will embed comments in your text, which is otherwise verbatim.

Dear Dr. Murray-Rust,
thank you for your e-mail concerning the “UV/Vis+ Spectra Data Base”. Please
let me begin my answer with some additional information.
We started the database in August 2000 and in the beginning all data
(spectra and datasheets) are completely free accessible for everyone. We
thought that this would be helpful to convince the users of the database to
help us in maintaining the database and to convince commercial users
(unilever, bayer, basf, pfizer etc) which benefits from the database or
governmental organizations to provide us with financial support, but that
was naive from our side. During the first six months when the database was
on-line we’ve several thousands of users (commercial and non-commercial) but
we have only 2 (TWO!) users which were willing to help us in maintaining the
database by the provision of data and we’ve got no financial support. Due to
this experience, we’ve decided to change our database policy. The database
was subdivided into a complete free-of-charge “Literarure-Service”
(meta-data) and a “Spectra-Service” (spectral data) for which a subscription
is required. You can access the “Spectra-Service” either by supporting us in
maintaining the database (provision of spectra data) or by paying a moderate
annual fee. Currently almost three fourths of the “spectra-service” users
have complete free-of-charge” access. We make no profit with the database.

PMR: I understand and appreciate the business model. I don’t have any concern about charging for data per se.

To maintain a fast growing database is not only a really hard and never
ending work but also cost-intensively. However, to operate and maintain such
a database financial support is required. Both database services are
operated in accordance to the “Open Access” definitions and regulations of
the CSPR Assessment Panel on Scientific Data and Information (International
Council for Science, 2004, ICSU Report of the CSPR Assessment Panel on Data
and Information; ISBN 0-930357-60-4). We’ve added a link to the original
document on our web-site.
Here are some quotations from the ICSU report:
“…Full and open access” to data implies equitable,
non-discriminatory access to all data that are of
value for science. It does not necessarily equate to
immediate access or ‘free of cost’ at the point of
delivery, although this is certainly the ideal in many
situations, particularly with regard to publicly
funded data. Data should be made available with
minimal delay but a short ‘privileged access’ period
for original data producers may be justified in some
situations. Excessive charging for data that is by
definition discriminatory against some scientists is
clearly contrary to the principle of full and open
access but some cost-recovery is not necessarily
excluded…”
“…There are several economic models for providing
scientists with access to data for research and education.
They include, among others, (1) free and open access to
research data by scientists, with financial support for data
dissemination and preservation assumed by others,
including government science agencies and private
foundations; (2) open access to scientific data for research
and education for the cost of reproduction (that is,
recovering the operational costs of data dissemination);
(3) free and open access to metadata, and cost-recovery
pricing for data (or data licenses) in order to support the
full data infrastructure. When this last approach is
employed by a commercial company, the financial charges
for data must be sufficient to recover all investment costs
and to make a profit for investors. An important variation
on this includes licensing for scientists to use specific
bodies of data at reduced cost…”

PMR: Thank you very much for this. I may comment later in a blog that it is unfortunate for the “open access” publishing community that ICSU has chosen the phrase “open access” to mean “an affordable charge structure”. I accept that by some standards 100EUR is non-discriminatory.

Indeed we are one step ahead to the ICSU recommendations since we provide
free-of-charge access to the meta data/related data without any
cost-recovery and in addition the database user can decide if he is willing
to help us in maintaining the database or to pay a moderate utilization fee
which ensures that the database will be operated, developed and maintained
in the future.
Again, currently all meta-data (datasheets) are free accessible as well as
other related data (e.g. software, satellite-data etc t.b.d.).
Finally, as mentioned in the ICSU report “WHO PAYS – Data production and
management are costly”. We’ve currently no idea how to finance this database
except by charging some of its users with a moderate fee. Do you have any
ideas?

PMR: I agree that data are costly, though technology brings some costs down. With Open Access in the publishing sense there is a strong movement towards author-pays supported-by-funder. The major charities (Wellcome, HHMI) are making allowances for authors to pay for publication as Open Access (toll-free access and hopefully re-use).
I suggest you have a look at what the NIST group (Michael Frenkel and colleagues) have done with ThermoML. Here the publishers have a model where if thermochemistry is published (there are 4 or 5 journals) it has to be in ThermoML and has to go into an Open Access database. This seems to work to everyone’s benefit. I’ll write more later … but this might be a useful model for you. It won’t pay YOU directly but may create a data stream at near zero cost.

PMR: more comments in later post…

Peter, I hope that these information will give you an idea about our
intensions.
Best regards,
Andreas
————————–
Dr. Andreas Noelle
science-softCon
Auf der Burg 4
63477 Maintal
Germany
Phone: +49 6181 498414
Fax: +49 6181 498415
e-mail: andreas.noelle@science-softcon.de
www.s-sc.de


Posted in data, open issues

THANK YOU ELSEVIER!

I have had a simple, positive response from Elsevier on my request to access their data robotically. This is really exciting. THANK YOU ELSEVIER. It deserves capitals.

Dear Peter Murray-Rust
Thanks for your email.  Data is not copyrighted.  If you are reusing the
entire presentation of the data, then you have to seek permission,
otherwise, you can use the data without seeking our permission.
Yours sincerely
Jennifer Jones
Rights Assistant
Global Rights Department
Elsevier Ltd
PO Box 800
Oxford OX5 1GB
UK
Tel: + 44 (1) 865 843830
Fax: +44 (1) 865 853333
email: j.jones@elsevier.com
Elsevier is pleased to announce our partnership with Copyright Clearance
Center’s Rightslink service. As from 6 July, Rightslink will handle
Elsevier’s journal permission requests.  With Rightslink (r) it’s faster
and easier than ever before to obtain permission to use and republish
material from Elsevier. Using Rightslink is as simple as:
Simply visit: http://www.sciencedirect.com/ and locate your desired
content.
Click on Permissions within the table of contents or in the tool-box to
the right of the online article to open the following page:
1. Select the way you would like to reuse the content
2. Create an account if you haven’t already
3. Accept the terms and conditions and you’re done
Please contact Rightslink Customer Care with any questions or comments
concerning this service: Copyright Clearance Center Rightslink Customer
Care Tel (toll free): 877/622-5543 Tel: 978/777-9929
E-mail: customercare@copyright.com
Elsevier Limited, a company registered in England and Wales with company
number 1982084, whose registered office is The Boulevard, Langford Lane,
Kidlington, Oxford, OX5 1GB, United Kingdom.
—–Original Message—–
From: peter murray-rust [mailto:pm286@cam.ac.uk]
Sent: 22 July 2007 11:19
To: Rights and Permissions (ELS)
Subject: Permission to extract crystallographic data robotically from
Elsevier publications
Dear Claire Truter,
I and colleagues have built a repository of crystallographic information
published in scientific journals. This data is factual, and not
copyrighted by the original authors. Major publishers such as the
International Union of Crystallography and the Royal Society of
Chemistry encourage (and often demand) the publication of such data as
part of the scientific record and mount it on their sites as “supporting
information” or “supplemental data”. It is of extremely high quality and
over the last 30 years the crystallographic and chemical community have
shown that it is an essential resource for data-driven science – a
concept which the NSF and JISC among others see as a large part of future
science.
We have built robots which have analysed over 50,000 papers on
publishers’ sites and extracted the crystallography. Note that the major
publishers I have referred to do NOT require a subscription to access
this information. We have agreed protocols whereby our robots run at
times and frequencies that do not cause denial of service
(DOS) – i.e. we try to be responsible.
Elsevier journals do not expose this as public supplemental information
but I believe it is available to toll-access subscribers. I would like
permission to extract crystallographic data from any Elsevier journals
using robotic techniques and to make the TRANSFORMED extracted data
public under a CC-BY licence (Creative
Commons) or an OpenData license from the Open Knowledge Foundation.
All data so extracted would be referenced through the DOI of the article
thus allowing any user (human or robot) to give full citation and
therefore credit to the authors and the journal.
To help the discussion we note that facts, per se, are not copyrightable
and that the authors do not claim copyright. The data are almost always
direct output from an instrument. We need not store the actual documents
(normally retrieved as IUCr CIF files) as our derived work is a
value-added document in XML-CML which retains none of the creative work
of formatting and pagination in the original.
I am sure you will agree that this is a reasonable request and that
Elsevier as a major scientific publisher would wish to do whatever it
could to foster the birth of a new science.
I am guessing that Elsevier journals (e.g. Tetrahedron, Polyhedron,
etc.) contain a total of ca 20,000 relevant papers – until we are able
to examine them robotically I can’t be more precise. Obviously I cannot
write for permission for each paper individually so I am asking for
general permission to carry out  robotic extraction of crystallographic
data from all Elsevier journals to which I have access through my
institution. And I would obviously agree to devising a robotic protocol
that was friendly to your web server.
If you and colleagues wish to be convinced of the value and quality of
this cyberscience please have a look at
http://wwmm.ch.cam.ac.uk/crystaleye where you can see the aggregated
material from the other publishers. Although we haven’t published the
results formally yet, two graduate students have carried out thousands
of days’ work of theoretical calculations on the data which we believe
have led to new insights into crystal and molecular structure.
I hope that Elsevier will be excited by the new vision and that we can
move rapidly towards extracting this data. Note that the robots operate
on a daily basis and provide news feeds to the community about new
exciting derived data.
Note that this is a public request – I have explained the reasons on my
blog, where this letter is contained. Since this is a matter of considerable current
public interest I request permission to post your replies – if there is
material that you wish to remain confidential please send a separate
mail to me indicating confidentiality which I will honour.
Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics University of
Cambridge, Lensfield Road,  Cambridge CB2 1EW, UK
+44-1223-763069
This is clear, simple, and in line with what I and others currently believe. If we find crystal structures in Elsevier journals as supplemental data, our robots will extract them to http://wwmm.ch.cam.ac.uk/crystaleye
I am very pleased to be able to post a constructive response from a publisher. This blog tries to be fair and only gets upset at restrictive practices – from whatever type of organization.
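
For readers wondering what a server-friendly robot might look like, here is a minimal sketch of the throttling idea only. It is not the CrystalEye code; the function name `polite_crawl`, the 10-second default and the injected `fetch`, `sleep` and `clock` parameters are all my assumptions for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple


def polite_crawl(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    min_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, str]]:
    """Fetch URLs one at a time, keeping at least min_delay_s seconds
    between the starts of consecutive requests (hypothetical throttle)."""
    results: List[Tuple[str, str]] = []
    last_start = None
    for url in urls:
        if last_start is not None:
            # Wait out whatever remains of the minimum gap.
            wait = min_delay_s - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append((url, fetch(url)))
    return results
```

Passing `fetch`, `sleep` and `clock` in as parameters means the throttle can be checked with fakes, without touching any publisher's server. A real robot would also run at agreed times and respect whatever protocol the publisher asks for.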
Posted in chemistry, data

Request for CODATA definition of Open Access

Followup to http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=445

[Open letter, copied to http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=446]
Dear Drs. Noelle and Hartmann,
science-softCon
Dr. Andreas Noelle
Auf der Burg 4
D-63477 Maintal
Germany
Phone: +49(0)6181 498414
Fax: +49(0)6181 498415
E-Mail: andreas.noelle@science-softCon.de
Internet: www.science-softCon.de
UV/Vis Spectra Data Base SAG
Dr. Gerd K. Hartmann (Speaker)
Auf der Burg 4
D-63477 Maintal
Germany
Phone: +49(0)6181 498414
Fax: +49(0)6181 498415
E-Mail: gerd.hartmann@science-softCon.de
Internet: www.science-softCon.de

I am a chemist interested in open semantic data, especially in chemistry, and have developed an XML approach to the management of molecules, spectra, crystallography, etc. (CML, Chemical Markup Language). I am promoting the concept of “Open Data” (http://en.wikipedia.org/wiki/Open_data) where access to and re-use of data is toll-free (i.e. does not cost money) and is permission-free (there is no need to request permission before re-using the data for any legal purpose). I was therefore very interested in your spectral data at http://www.uv-spectra.de/ which, in principle, I would be interested in converting to CML in the same way as we have done robotically for crystallography: (http://wwmm.ch.cam.ac.uk/crystaleye).
You describe this as “open access” under the definition of CODATA (CSPR) but when I visit the site I find that I am required to pay 100 EUR per year to access the data [1]. What concerns me is the use of the term “open access”, especially with the full authority of CODATA for whom I have enormous respect. The almost universally accepted use of the term “open access” is from one of the Budapest, Bethesda or Berlin declarations (see http://en.wikipedia.org/wiki/Open_access) and from http://www.soros.org/openaccess/read.shtml:

By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

Taken logically this declaration (which is essentially universally accepted) requires that an “open access” resource be free-to-access for everyone and that the data in part or full can be completely re-used without further permission.
I cannot see how a database which requires users to pay for the data [2] can be described as “open access” but I cannot access the CODATA definition to which you refer. I would be extremely grateful if you could supply me with (an electronic copy of) the CODATA definition of “open access” and show how the spectral database is compliant with it.
I have been concerned that the term “open access” is becoming too vague to act as a precise description of access and re-use permissions and there has been a lively discussion on the blog http://wwmm.ch.cam.ac.uk/blogs/blogs/murrayrust/. I would like permission to quote your reply, especially any definitions, and ask that you make it clear what sections of your reply cannot be quoted publicly.
With best wishes – I hope that this database turns out to be “open access” in the sense I have quoted, in which case I would be interested in collaboration.
Peter
[1] (Some metadata “data sheets” appear to be visible and free)
[2] There is an information-barter mechanism whereby users can pay in spectra rather than cash – and I approve of such schemes – but they are not “open access”
Peter Murray-Rust
Unilever Centre for Molecular Sciences Informatics
University of Cambridge,
Lensfield Road,  Cambridge CB2 1EW, UK
+44-1223-763069

Posted in data, open issues

Pay-to-view "open access" at CODATA?

Here’s a puzzle – maybe someone can help. A member of the Blue Obelisk has discovered the following site

science-softCon UV/Vis+ Spectra Data Base

Spectral information (gas, liquid and solid phase from EUV-VUV-UV-Vis-NIR) and related data (e.g.  information concerning publications on quantum yield studies or photolysis studies) from published papers.

The science-softCon “UV/Vis+ Spectra Data Base” on-line service was established in August 2000 and is subdivided into a Literature Service (this site) and a Spectra Service (see spectra-info). Both database services are operated in accordance to the “Open Access” definitions and regulations of the CSPR Assessment Panel on Scientific Data and Information (International Council for Science, 2004, ICSU Report of the CSPR Assessment Panel on Data and Information; ISBN 0-930357-60-4). In October 2004 an international “Scientific Advisory Group” (SAG) for the “UV/Vis+ Spectra Data Base” was established. 
On UV/Vis News you will find information concerning the “UV/Vis+ Spectra Data Base”.

PMR: For non-readers of this blog, spectra are data points relating to the identification or properties of chemical compounds. We’ve had some fun arguments about whether Open Data resources such as NMRShiftDB can be of high quality. You know my views. So a new OPEN ACCESS database of spectra! Great!
Off we go to Aromatic Compounds (benzene, naphthalene and friends). We find

science-softCon UV/Vis+ Spectra Data Base

Literature Service: Aromatic Compounds

Substances: Benzaldehyde, Benzene, Bisphenol A, Cresol, Dimethylbenzaldehyde, Dimethylphenol, Estradiol, Ethinyl Estradiol, Ethylbenzene, Naphthalene, Nitrosobenzene, Phenol, Phenyl radical, Styrene, Tolualdehyde, Toluene, Trimethylbenzol, Trimethylphenol, Xylene
The relevant Spectra Data are available (for registered users) at: Spectra Service

PMR: “Registered users”. Not exactly a cuddly term in the Open Access (or Open Data) world. Robots don’t like having to register. They don’t know how to press buttons. But maybe we can scrape some data after we have got in. If it’s Open Access that should be fine. And we really really need these spectra. So off to registered users to find:

Select the Type of Usage:
Academic Working Group License, 100 EURO/per Year*
Academic Campus License, 200 EURO/per Year*

Commercial Working Group License, 200 EURO/per Year
Commercial Company License, 500 EURO/per Year

A EURO is a monetary unit adopted by many countries in Europe and is worth about sqrt(2.0) dollars. So just to make it clear, this is a pay-to-use “open access” database.
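
For anyone who wants the dollar figures, here is a minimal sketch using the tongue-in-cheek rate above. The sqrt(2.0) rate is a rough approximation from this post, not a quoted exchange rate, and the names `EUR_TO_USD`, `LICENCES_EUR` and `to_usd` are illustrative.

```python
import math

# Rough rate from the text above: one euro is about sqrt(2.0) dollars.
EUR_TO_USD = math.sqrt(2.0)

# Annual licence prices in EUR, as listed on the registration page.
LICENCES_EUR = {
    "Academic Working Group": 100,
    "Academic Campus": 200,
    "Commercial Working Group": 200,
    "Commercial Company": 500,
}


def to_usd(eur: float) -> float:
    """Convert euros to dollars at the rough rate, rounded to cents."""
    return round(eur * EUR_TO_USD, 2)


prices_usd = {name: to_usd(eur) for name, eur in LICENCES_EUR.items()}
for name, usd in prices_usd.items():
    print(f"{name}: about USD {usd:.2f} per year")
```

On that rate the cheapest academic licence comes to a little over 140 dollars a year.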
So who is running this? Perhaps they have never heard of Open Access. Here they are:

This database is supported by the CODATA Working Group “UV/Vis+ Spectra Data Base” and operated in accordance to the “Open Access” definitions and regulations of the CSPR Assessment Panel on Scientific Data and Information of the ICS

PMR: I know CODATA well. They are a high-level organisation of the International Council of Scientific Unions. I (and Henry) have published in their Open Access Journal, Data Science. They have done me the honour of inviting me several times to talk on XML, CML, Open Data and so on. They care passionately about data – they see access to data as being a key factor in saving the planet from various things it needs to be saved from. Here is an example:

14.2 Global Information Commons for Science
Ensuring and improving access to scientific data and information has been a long-term interest of ICSU and will continue to be a high priority for the future. However, as described in the PAA report, the internet and www have revolutionised the practice of science and provide new opportunities, as well as threats, to data and information access. In particular, there are conflicting tensions between commercialization of scientific data and moves towards open on-line access for both data and information.
‘Open access’ for science was a particularly hot topic at the World Summit on the Information Society (Geneva, 2003 and Tunis, 2005) in which ICSU was actively involved. As part of the summit process, the ICSU Committee on Data for Science and Technology (CODATA) took the lead in developing a Global Information Commons for Science Initiative (GICSI). This initiative directly responds to the summit action plan and also addresses many of the recommendation in the PAA report, albeit with a limited focus on access issues.
CSPR considered the merits of this proposal in the context of ICSU’s strategic aims in relation to data and information. It was noted that it is unusual, but not unprecedented, for ICSU to co-sponsor programmes with its own Members or Interdisciplinary Bodies, but a strong case for direct ICSU involvement would need to be made.
Committee members expressed support for the proposal, which clearly related to ICSU’s aims with regards to increasing access to data and information. However, it was not clear how this work fitted with CODATA’s future strategy. Potentially, CODATA was a key instrument for delivering on ICSU’s overall goal for scientific data and information but, as noted in the PAA report, it needed to make strategic choices as to its future priorities and direction. It was suggested that the best approach is for ICSU to endorse this GICSI initiative but not co-sponsor it, and for the SCID (see 14.1) to liaise with CODATA on the development of its strategic plan.

PMR: So here we have the clear commitment to access to data and the concept of open access.
(I am still trying to find out what “CSPR” is after 10 web pages. In all the documents it seems to be assumed that the whole human race understands “CSPR”). Here’s the relevant page:

Working Group: UV/Vis+ Spectra Data Base

As approved by the CODATA Executive Committee Meeting, Paris, April 2005
Renewal approved by the CODATA Executive Committee, Paris, April 2007

View up-to-date information at http://www.uv-spectra.de/

In April 2005 the CODATA Working Group “UV/Vis+ Spectra Data Base” was established by the CODATA Executive Committee (Committee on Data for Science and Technology of the International Council for Science, ICSU) under the chairmanship of the CODATA President Prof. Dr. Shuichi Iwata.

The work of the CODATA working group is based on the “UV/Vis Spectra Data Base” (www.uv-spectra.de) which is on-line since August 2000. The “UV/Vis Spectra Data Base” is operated as an Open-Access database in accordance with the definitions and the regulations of the CSPR Assessment Panel on Scientific Data and Information [1].

The “+” in the name of the new database stands for the extension of the wavelength range (EUV-NIR), the provision of software and the integration of additional spectral data and information (e.g. SUMER Spectral Atlas of Solar Coronal Features, Dr. Curdt, MPI für Sonnensystemforschung, Katlenburg-Lindau; Daily Solar Spectral Irradiances, 1975-2004, Dr. Lean, Naval Research Laboratory, Washington DC, USA; etc.)

Another important task of the working group is the data and information rescue. Many types of data, including extant “historical” data, are not being used for research because they are not available in digital formats and they are in danger of being lost.

There are both resource and technical limitations to data access in many parts of the world that not only make it difficult to conduct research, but also interfere with the collection of new data. This problem, which is called “digital divide”, is most evident in low- and medium-income (industrialized) countries. The task of the CODATA working group is to reduce these limitations.

[…]

 [1] International Council for Science, 2004, ICSU Report of the CSPR Assessment Panel on Data and Information; ISBN 0-930357-60-4

PMR: So I am trying to find the definition of “open access” from CODATA. Unfortunately it seems to be buried in a paper report – I can’t find a web page easily. Nor can I find out what CSPR is. I shall have to adhere to what it says, as they work very hard on their declarations. If it turns out that a database that requires payment is “open access” and that this terminology has been signed off by the Scientific Unions, then maybe it’s time for a new term.
Like Open Data?
[I have finally found that CSPR = Committee on Scientific Planning and Review – obvious when you know]

Posted in data, open issues | 3 Comments

Sparkies

SPARC (the Scholarly Publishing and Academic Resources Coalition) is a major force in liberating information. Here is one of its ways – a prize for video-aware students.

FOR IMMEDIATE RELEASE
July 25, 2007
Contact:
Jennifer McLennan
(202) 296-2296 x 121
jennifer [at] arl [dot] org

CALL FOR ENTRIES
SPARC Announces Mind Mashup:
A Video Contest to Showcase Student Views on Information Sharing
Wikipedia Founder Jimmy Wales and Documentary Filmmaker Peter Wintonick Among Judges Selecting $1,000 Prize Winner
Washington, DC – July 25, 2007 – SPARC (the Scholarly Publishing and Academic Resources Coalition) today announced the launch of the first annual SPARC Discovery Awards, a contest to promote the open exchange of information. Mind Mashup, the theme of the 2007 contest, calls on entrants to illustrate in a short video the importance of sharing ideas and information of all kinds. Mashup is an expression referring to a song, video, Web site or software application that combines content from more than one source.
Consistent with SPARC’s mission as an international alliance of academic and research libraries promoting the benefits of information sharing, the contest encourages new voices to join the public discussion of information policy in the Internet age. Designed for adoption as a college or high school class assignment, the SPARC Discovery Awards are open to anyone over the age of 15.
Contestants are asked to submit videos of two minutes or less that imaginatively show the benefits of bringing down barriers to the open exchange of information. Submissions will be judged by a panel that includes:
* Aaron Delwiche, Assistant Professor in the Department of Communication at Trinity University in San Antonio, Texas
* José-Marie Griffiths, Professor & Dean at the School of Information and Library Science, University of North Carolina at Chapel Hill
* Rick Johnson, communications consultant and founding director of SPARC
* Heather Joseph, Executive Director of SPARC
* Karen Rustad, president of Free Culture 5C and a senior at Scripps College majoring in media studies
* Jimmy Wales, founder of Wikipedia
* Peter Wintonick, award-winning documentary filmmaker and principal of Necessary Illusions Productions Inc.
“I’m very proud to be judging this contest,” said Karen Rustad. “When it comes to debates over Internet information policy, students are usually subjects for study or an object for concern. I can’t wait to see what my contemporaries have to say about mashup culture and open access to information once they’re given the mike — or, rather, the camera.”
The contest takes as its inspiration a quote from George Bernard Shaw: “If you have an apple and I have an apple and we exchange these apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas.”
Submissions must be received by December 2, 2007. Winners – including a first-place winner and two runners up – will be announced in January 2008. The winner will receive $1,000 and a “Sparky Award.” The runners up will each receive $500. Winning entries will be publicly screened at the American Library Association Midwinter Conference in January 2008 in Philadelphia and will be prominently featured in SPARC’s international advocacy and campus education activities.
For further details, please see the contest Web site at

________________________

SPARC (Scholarly Publishing and Academic Resources Coalition), with SPARC Europe and SPARC Japan, is an international alliance of more than 800 academic and research libraries working to create a more open system of scholarly communication. SPARC is a founding member of the Alliance for Taxpayer Access, a coalition of patient, academic, research, and publishing organizations that supports open public access to the results of federally funded research – including research funded by the National Institutes of Health. SPARC is on the Web at http://www.arl.org/sparc/.
________________
Alison Buckholtz
SPARC Consultant
phone: 202 251 7845

So any budding directors – here is your chance. Even if you don’t win, making videos is fun and gets you well known. (The videos we have put out from the Centre – or had made by others – have been widely accessed.) Sounds like a great opportunity for an undergraduate chemistry community.

Posted in Uncategorized

Funders – please make sure what you are paying for

The saddest thing about the HHMI-Elsevier deal is that HHMI didn’t appear to know what they were paying for or didn’t care. By doing less than due diligence they have moved the goalposts[1] in the wrong direction and made it that bit harder to get a proper deal for “OA”. From Peter Suber’s blog (copied in full):


Tom Cech and four co-authors, A Reply from HHMI, Journal of Cell Biology, July 24, 2007. A letter to the editor. This is a response to the editorial by Mike Rossner and Ira Mellman in the June 11, 2007, issue of JCB, How the rich get richer.
From the HHMI letter:

The Howard Hughes Medical Institute (HHMI) recently announced a policy on Public Access to Publications for its investigators and Janelia Farm Research Campus scientists. This policy requires our scientists to publish in only those journals that make original research articles and supplemental materials freely accessible through a public database within six months of publication.
The policy seeks to balance the goal of public access and the equally important value of scholarly freedom—the goal of our scientists to allow their graduate students and postdoctoral fellows to publish their work in the journal of their choice. To bring more journals into compliance with our policy, we have concluded agreements with Elsevier and Cell Press, as well as other publishers, including the American Society of Hematology. Such conversations will continue with both for-profit and non-profit publishers.
Rossner and Mellman have criticized HHMI for not using its influence to coerce Elsevier into making their content public after a short delay without compensation. It should be noted that the $1,000 we are paying for each Cell Press article and $1,500 for other Elsevier publications is not profit to the publisher, but a reimbursement for their lost revenue in providing accelerated free access and their time and effort in uploading HHMI manuscripts to PubMed Central. Furthermore, HHMI already makes payments at a similar level to a wide array of non-profit and for-profit publishers for immediate or accelerated access to publications, as does the Wellcome Trust….

From the response by Mike Rossner (Executive Director, The Rockefeller University Press) and Ira Mellman (Editor in Chief, Journal of Cell Biology):

It seems clear from the HHMI response that they missed the point of our Editorial. They note that they are providing public access to HHMI-funded research with their outlay of cash to publishers (both commercial and non-commercial). This fact was not in dispute.
They do not, however, address the effect of their actions on the public access movement—that is, the effort to get publishers (especially commercial publishers, who have refused to release the bulk of their content to the public) to provide public access to their holdings after a short delay. If the Rockefeller University Press does not need reimbursement to provide free access after 6 months, neither should other publishers. Elsevier already makes vast sums of money publishing publicly funded research, and they should feel an obligation to give something back to the public. Paying publishers to provide spotty access to just a few of the papers they publish (e.g., those authored by HHMI investigators) does not address the issue of public access to all of the scientific literature. HHMI had an opportunity to exert some pressure on publishers to achieve that goal, and they chose not to do so. Although they claim they were trying to find a balance between public access and “scholarly freedom,” they did not succeed. Instead, the public access movement has suffered because HHMI gave in to the selfish desire of some of their investigators to continue publishing in Cell. This serves neither the public, nor science.

Comment. Rossner and Mellman are right. In my evaluation of the HHMI-Elsevier deal in the April issue of SOAN, I responded to some of the HHMI points that Rossner and Mellman did not address:

…[D]eposit in a repository is a clerical task whose cost is negligible….The job is not worth thousands of dollars per paper, or hundreds, or even tens. If the physical job of depositing papers is really what HHMI wanted, it should have put the job up for bidding. It could have gotten a much better deal….
Under the HHMI deal, Cell Press will reduce its permissible embargo on OA archiving from 12 months to six. That’s a real concession and gave Elsevier a bargaining chip in the negotiation. But Elsevier journals outside Cell Press already permitted immediate self-archiving and the HHMI deal will lengthen the embargo to six months (for HHMI-funded authors), moving the bargaining chip back to HHMI….
[The Wellcome Trust] and Elsevier struck a deal last September…but [WT] got more for its money. WT got immediate OA, while HHMI is getting embargoed OA. WT got OA to the published edition, while HHMI is getting OA to an unedited edition. WT got a Creative Commons license or equivalent; while HHMI could use CC licenses on deposited, unedited manuscripts, the published editions will remain under Elsevier’s copyright with no significant reuse rights….

PMR: The last paragraph from PeterS is the key one – WT has been blazing the trail by requiring value for its money. They are quite prepared to pay for OA – and to try to work out what is reasonable – they don’t want the publishers to go bankrupt after all this hard work. But they have stipulated full OA – not just eyeballs but IOStreams. We pay – we read – we re-use. That is the gold standard. For advocates of data-re-use it was exactly what we wanted – a clear distinction between what we could do and what we couldn’t.
It’s impossible for most of us non-experts to work out what HHMI have actually paid for. Can I send my robots over papers in Cell that HHMI have paid for? Elsevier is not (yet) one of the publishers who have closed down Cambridge’s access for my (perfectly legal) activities. But I’m not sure my colleagues want me to try to experiment. And will the web pages on Elsevier’s web site tell me what I can and cannot do with HHMI papers? num! But the worst aspect is that not only have the goal posts been moved unfavourably, the signal has been given that if publishers move them again, then they can get away with it.
[1] apparently this phrase is quite recent (http://www.phrases.org.uk/meanings/251400.html)
Posted in open issues

Key Perspectives on Data

Alma Swan is a well-known and respected consultant and investigator in the area of “the scholarly publication industry” and runs a blog (Key Perspectives) where she reports:

The increasing importance of data
NEW STUDY on the publication and quality assurance of research data outputs
The volume of data output from scholarly research is growing rapidly. This brings to the fore a whole range of issues about how data is created, used, assessed and maintained. A new study, funded jointly by RIN, NERC and JISC will investigate the following areas: (i) the role that data outputs currently play alongside or as an alternative to conventional publications in the research communication process; (ii) the nature and range of arrangements for making research data as widely available as possible; (iii) current practice for ensuring the quality of data. The study will be guided by some of the foremost scholarly data experts in the UK and will be completed by the end of 2007. Key Perspectives is delighted to have been selected to work on such a timely and important project.

I don’t want to steal Alma’s thunder – that’s why she has a blog – but just to report that she has done me the honour of asking me to be a consultant on this project. On Monday she travelled to Cambridge despite the west country floods and we spent 2 hours discussing the major issues and the people who would be best placed to give informed opinion. We are going to continue talking – which is always a pleasure – for the next few months.
This study is really important and Alma is very well placed to make the case for data. She has selected a number of disciplines – including crystallography – and will be looking at many aspects of policy and practice. She is fully aware (and even more so after Monday) of the importance that I attach to getting scientific data out freely to the scientific community.
I’ll let her tell the rest of the story as it unfolds.

Posted in data, open issues