SCOAP3 Open Access publishing venture. And the community will need somewhere to find the publications – so that is where repositories come in. There is no question that high-energy physics (HEP) needs its own domain repository: the coherence, the specialist metadata, the specialist data for re-use. HEP physicists will not go to institutional repositories – they have their own metadata (SPIRES) and they will want to see the community providing the next generation. And we found a lot of complementarity between our approaches to repositories – as a matter of necessity we have had to develop tools for data-indexing, full-text mining, automatic metadata, etc. But where do sciences such as chemistry, materials, nanotech, condensed matter, cell biology, biochemistry, neuroscience, etc. fit? They aren’t “big science”. They often have no coherent communal voice. The publications are often closed. There is a shortage of data. But there are a LOT of them. I don’t know how many chemists there are in the world who read the literature but it’s vastly more than the 22,000 HEP scientists. How do we give a name to this activity? “Small science” is not complimentary; “lab science” describes much of it but is too fixed to buildings. Jim Downing came up with the idea of “Long Tail Science”. The Long Tail is the observation that in the modern web the tail of the distribution is often more important than the few large players. Large numbers of small units is an important concept. And it’s complimentary and complementary. So we are exploring how big science and long-tail science work together to communicate their knowledge. Long-tail science needs its domain repositories – I am not sanguine that IRs can provide the metalayers (search, metadata, domain-specific knowledge, domain data) that are needed for effective discovery and re-use. We need our own domain champions. In bioscience it is provided by PubMed. I think we will see the emergence of similar repositories in other domains.
I am on the road a lot so the frequency (and possibly intensity) of posts may decrease somewhat…
Archive for the ‘open issues’ Category
From Peter Suber: More on the NIH OA mandate.
Many points but I pick one:
Jocelyn Kaiser, Uncle Sam’s Biomedical Archive Wants Your Papers, Science Magazine, January 18, 2008 (accessible only to subscribers). Excerpt: If you have a grant from the U.S. National Institutes of Health (NIH), you will soon be required to take some steps to make the results public. Last week, NIH informed its grantees that, to comply with a new law, they must begin sending copies of their accepted, peer-reviewed manuscripts to NIH for posting in a free online archive. Failure to do so could delay a grant or jeopardize current research funding, NIH warns…. [...] Scientists who have been sending their papers to PMC say the process is relatively easy, but keeping track of each journal’s copyright policy is not….
PMR: Exactly. It should be trivial to find out what a journal’s policy is. As easy as reading an Open Source licence. An enormous amount of human effort is wasted by authors and repositarians in repeatedly trying (and often failing) to get this conceptually simple information.
I’ve been doing articles and interviews on OA and Open Data recently and one thing that becomes ever clearer is that we need licences or other tools. Labeling with “open access” doesn’t work.
PMR: It’s a reasonably balanced article, touching many of the efforts mentioned in this blog. It’s under no illusions that this will be easy. I’ve just finished doing an interview where at the end I was asked what we would be like in 5 years’ time and I was rather pessimistic, expecting that the current metrics-based dystopia would persist and even get worse (the UK has increased its efforts on metrics-based assessment, in which case almost any innovation, almost by definition, is discouraged). But on the other hand I think the vitality of “2.0” in so many areas may provide unstoppable disruption.
I’m way behind on this, but anyway: a while back, writer Mitch Waldrop interviewed me and a whole bunch of other people interested in (what I usually call) Open Science, for an upcoming article in Scientific American. A draft of the article is now available for reading, but even better – in a wholly subject matter appropriate twist, it’s also available for input from readers. Quoth Mitch: Welcome to a Scientific American experiment in “networked journalism,” in which readers – you – get to collaborate with the author to give a story its final form. The article, below, is a particularly apt candidate for such an experiment: it’s my feature story on “Science 2.0,” which describes how researchers are beginning to harness wikis, blogs and other Web 2.0 technologies as a potentially transformative way of doing science. The draft article appears here, several months in advance of its print publication, and we are inviting you to comment on it. Your inputs will influence the article’s content, reporting, perhaps even its point of view.
- I’ve done a long interview on Open Data which should be public fairly soon
- I converted the Serials Review article into Word and that has now been submitted. I have also submitted it to Nature Precedings and that should be available in a day or so.
- I have finalised the proofs for the Nature “Horizons” article (whose preview is on Nature Precedings). The house style seems to be to remove all names from the text and further reading and I am not allowed to acknowledge people by name. This makes the article read in a very egocentric style which does not reflect the communal nature of the exercise. It appears in early Feb.
Although I am mainly concerned with campaigning for data associated with scholarly publishing to be Open, the term Open Data has also been used in conjunction with personal data “given” or “lent” to third parties (see Open Data – Wikipedia, which contains Jon Bosak’s quote “I want my data back”). Here is a good example of the problems of getting one’s personal data (and possibly other people’s) back from Paul Miller of Talis: Scoble, Facebook, Plaxo, open data; time for change?. Excerpts (read the whole post for the details):
I am of course talking, like so many others, about Robert Scoble being barred from Facebook for using an as-yet unlaunched capability of Plaxo that clearly and unambiguously breached Facebook’s Terms and Conditions. It all began with a ‘tweet’ from Robert Scoble, about the time that post-holiday blues kicked in for those returning to work this (UK) morning; “Oh, oh, Facebook blocked my account because I was hitting it with a script. Naughty, naughty Scoble!” Twitter exploded, closely followed by large chunks of the blogosphere. … Minutiae aside, the whole affair raises a couple of points pertinent to one of the biggest issues for 2008; ownership, portability and openness of data.
- I want to be able to take my data from a service such as Facebook, and use it somewhere else. That’s what Marc Canter has been arguing forever, along with the AttentionTrust, OpenSocial (to a degree), DataPortability.org and many more. That’s part of the rationale behind all the work we’ve been doing on the Open Data Commons, too. However, whether I want to or not, doing it the way Scoble did is a breach of the terms and conditions of Facebook; terms and conditions to which I – and he – signed up when we chose to use the site. If you don’t like the terms, don’t use the service. It’s as simple as that;
- Even were I allowed to export ‘my’ data, there’s a fuzzy line between that which is mine and that which isn’t. The fact that I am a Facebook friend with Nova Spivack certainly should be mine to take wherever I choose. The contact details Nova chooses to surface to me as part of that relationship, however? Are they mine to take with me, or his to control where I can surface them? There’s clearly work to do there, although it’s interesting that ‘even’ people such as Tara Hunt are reacting (also on Twitter, of course) with; “I’m appalled that someone can take my info 2 other networks w/o my permission. Rights belong 2 friends, too.”
PMR: I have no additional comments on this other than to say it’s going to take hard work, forethought to anticipate problems of this sort, and probably a lot of legal work. Kudos to Paul and Talis and their collaborators for helping in these general areas.
In science it’s easy. Our data are ours. They don’t belong to Wiley, ACS, Elsevier, Springer. I’ve just finished a paper on this which you should all see shortly.
We want our data back.
And in future we want to make sure we don’t give away our rights to them. Is that a simple message for 2008?
PMR: This is highly commendable, especially from someone early in their career. Some comments:
I don’t usually do New Year’s resolutions. But in the spirit of the several posts from people looking back and looking forwards I thought I would offer a few. This being an open process there will be people to hold me to these so there will be a bit of encouragement there. This promises to be a year in which Open issues move much further up the agenda. These things are little ways that we can take this forward and help to build the momentum.
- I will adopt the NIH Open Access Mandate as a minimum standard for papers submitted in 2008. Where possible we will submit to fully Open Access journals but where there is not an appropriate journal in terms of subject area or status we will only submit to journals that allow us to submit a complete version of the paper to PubMed Central within 12 months.
- I will get more of our existing (non-ONS) data online and freely available.
- Going forward all members of my group will be committed to an Open Notebook Science approach unless this is prohibited or made impractical by the research funders. Where this is the case these projects will be publicly flagged as non-ONS and I will apply the principle of the NIH OA Mandate (12 months maximum embargo) wherever possible.
- I will do more to publicise Open Notebook Science. Specifically I will give ONS a mention in every scientific talk and presentation I give.
- Regardless of the outcome of the funding application I will attempt to get funding to support an international meeting focussed on developing Open Approaches in Research.
- In some subjects it’s hard to find Open Access journals whose scope covers the work. That’s very true of chemistry, and there is some sacrifice required. However, there is a high-risk investment here – publish in an OA journal and you are likely to get higher publicity than from a non-OA journal of similar standing. Senior faculty (like me) must promote the idea that it’s what you publish rather than where you publish that matters. All journals start small, but many grow, including OA ones.
- ONS. This is technically hard in many areas. At this stage the effort is as important as the achievement – get as much online as you can afford. But complex internal workflows do not lend themselves to ONS easily and we certainly need a new generation of tools.
- I don’t know of any funders who explicitly forbid ONS (other than for confidentiality, etc.). Funders should not be concerned about where the work is published, only that it is reviewed and reasonably visible. Funders certainly shouldn’t dictate the proposed journal and that’s the only obvious mechanism for forbidding ONS.
- Obviously I hope the application succeeds and we shall be there.
PMR: Whatever the rights and wrongs of this approach – I accept PeterS’s analysis of most situations – it represents one of my fears – the increasing complexity of per-publisher offerings. Springer now has at least 3 models – Closed, OpenChoice and FreeOnlineAccess. Even for the expert it will be non-trivial to decide what can and cannot be done, what should and should not be done. If all the major closed publishers do this, each with a slightly different model where the licence matters, we have chaos. This type of licence proliferation makes it harder to work towards common agreements for access to data (it seems clear that the present one is a step away from Open Data). I used to think instrument manufacturers were bad, bringing out a different data format with every new machine. I still do. Now they have been joined by publishers.
Neuroethics is a new peer-reviewed journal from Springer. Instead of using Springer’s Open Choice hybrid model, it will offer free online access to all its articles, at least for 2008 and 2009. The page on instructions for authors says nothing about publication fees. It does, however, require authors to transfer copyright to Springer, which it justifies by saying, “This will ensure the widest possible dissemination of information under copyright laws.” For the moment I’m less interested in the incorrectness of this statement than in the fact that Springer’s hybrid journals use an equivalent of the CC-BY license. It looks like Springer is experimenting with a new access model: free online access for all articles in a journal (hence, not hybrid); no publication fees; but no reuse rights beyond fair use. The copyright transfer agreement permits self-archiving of the published version of the text but not the published PDF. Also see my post last week on Springer’s new Evolution: Education and Outreach, with a similar access policy but a few confusing wrinkles of its own.
… a coalition of patient, academic, research, and publishing organizations that supports open public access to the results of federally funded research. The Alliance was formed in 2004 to urge that peer-reviewed articles stemming from taxpayer-funded research become fully accessible and available online at no extra cost to the American public. Details on the ATA may be found at http://www.taxpayeraccess.org. … for its campaigning for the NIH bill. From the ATA site:
The provision directs the NIH to change its existing Public Access Policy, implemented as a voluntary measure in 2005, so that participation is required for agency-funded investigators. Researchers will now be required to deposit electronic copies of their peer-reviewed manuscripts into the National Library of Medicine’s online archive, PubMed Central. Full texts of the articles will be publicly available and searchable online in PubMed Central no later than 12 months after publication in a journal. “Facilitated access to new knowledge is key to the rapid advancement of science,” said Harold Varmus, president of the Memorial Sloan-Kettering Cancer Center and Nobel Prize Winner. “The tremendous benefits of broad, unfettered access to information are already clear from the Human Genome Project, which has made its DNA sequences immediately and freely available to all via the Internet. Providing widespread access, even with a one-year delay, to the full text of research articles supported by funds from all institutes at the NIH will increase those benefits dramatically.”
PMR: Heather Joseph – one of the main architects of the struggle – comments:
“Congress has just unlocked the taxpayers’ $29 billion investment in NIH,” said Heather Joseph, Executive Director of SPARC (the Scholarly Publishing and Academic Resources Coalition, a founding member of the ATA). “This policy will directly improve the sharing of scientific findings, the pace of medical advances, and the rate of return on benefits to the taxpayer.”
PMR: Within the rejoicing we must be very careful not to overlook the need to publish research data in full. So, as HaroldV says, “the Human Genome Project [...] made its DNA sequences immediately and freely available to all via the Internet”. This was the essential component. If only the fulltext of the papers had been available, the sequences could not have been used – we’d still be trying to hack PDFs for sequences. So what is the $29 billion? I suspect that it’s the cost of the research, not the market value of the fulltext PDFs (which is probably much less than $29B). If the full data of this research were available I suspect its value would be much more than $29B. So I have lots of questions and hope that PubMed, Heather and others can answer them:
- what does $29B represent?
- will PubMed require the deposition of data (e.g. crystal structures, spectra, gels, etc.)?
- if not, will PubMed encourage deposition?
- if not, will PubMed support deposition?
- if not, what are we going to do about it?