Monthly Archives: August 2007

PRISM: should I worry?

The last week has seen a spate of immediate reaction to the newly formed PRISM - the (American?) publishers' lobby to destroy non-commercial open access. There is so much (germane) comment that there is no need for me to duplicate it, so I'll try to add a new aspect here: should I care?

Here are Peter Suber's (almost daily) collections of rebuttals of PRISM's position, "facts" and "logic". If you are starting from scratch read these from PeterS:

PMR: So far there does not appear to be any response from PRISM. I am assuming that PeterS would immediately post any information whether firsthand or from another blog. So I'll assume there hasn't been any - and I don't expect any.
PMR: Here are the two posts which I found closest to interpreting the motivation and strategy of PRISM, which are critical if we are to work out how to react.
[1] From John Dupuis at Confessions of a Science Librarian:
  • ...I would like to talk a little about the makeup of The Executive Council of the Professional & Scholarly Publishing Division [which launched PRISM].

  • Who are the members of this Committee? Sure, the usual suspects, representatives of the major commercial publishers such as a bunch from Elsevier, John Wiley & Sons, McGraw Hill, Wolters Kluwer Health, Springer Science + Business Media, SAGE Publications, ISI Thomson Scientific....Given that they are for-profit companies, however, it's not surprising that they would act to protect their profits....Thank god, you're thinking, that the list above does not include any representatives from scholarly or professional societies. Surely they must understand the importance of free and open access to information, something which can surely only benefit their members, scholarship and society as a whole. Sadly, the Exec Committee also includes members from the IEEE (2, including the chair of the journals committee), American Chemical Society (2, including the chair), American Society of Clinical Oncology, New England Journal of Medicine, Columbia University Press, MIT Press, American Academy of Pediatrics, American Institute of Physics and University of Chicago Press. Unfortunately, scholarly societies see OA as a threat to the income from their publishing programs, which is used to finance all the other membership programs that they have like conferences and continuing education. It's really unfortunate that they can't see past these concerns to what the true interest of their members is: for their research to have as high an impact as possible and, as a byproduct of that impact, to benefit scholarship in their discipline and, hopefully, society as a whole as much as possible....

[2] John Blossom, PRISM Promotes the Interests of Scientific Publishers: Is it Better to Lobby or to Change? ContentBlogger, August 29, 2007. Excerpt:

Wired Science has the most in-your-face coverage of the formation of PRISM, an advocacy group formed by scholarly publishers to stem the legislative movement towards free access to government-funded scholarly research. This in and of itself is not a surprise, but Wired claims that the site is an example of astroturf advocacy, meaning an organization that tries to position itself as a grass-roots movement when in fact it is created by others wanting to appear to have grass roots support. PRISM is the creation of the Association of American Publishers, so one assumes that the roots of this organization are more likely to grow in the yards of scholarly publishers than the scientists providing the research....

The primary problem with PRISM is that it seems to be advocating on a range of issues which, while valid in their own right, are more about fear, uncertainty and doubt - those familiar sales tools - than the real issues at hand....

[The claim that OA will undermine peer review] seems to be somewhat disingenuous, in that there may be alternative methods for supporting effective peer review that have not been explored by scientific publishers. Certainly a government-mandated publishing of research for free that doesn't take into account how that research is produced has the potential to be an unfunded mandate that could place an undue burden on scientific publishers. This is a real issue, but the answers to the issue may not lie with the government itself - they may lie with addressing how the peer review process is funded in general....

Surely politics should stay out of science, but there's no indication at this time that the government would have the ability to influence the peer review process politically through these proposed [OA] mandates any more than it does today....

If the purpose of PRISM is to convince legislators that there is an advocacy group that supports the publishers' goals then my sense is that they are going to fail. The site is not very convincing and lacks information about its supporters or any input from them that would influence people into thinking that there is a broad base of support for PRISM's views. PRISM does raise some important issues that need to be addressed in the rush to make access to government-funded research public, especially in how to support the peer review process realistically in an era in which public access to research is becoming a given. But the broader outlines of the solutions to many of these problems would seem to lie in how the scholarly publishing community has resisted changes in publishing technologies that disrupt their traditional business models.

With some added focus and some sponsorship of honest debate between government research sponsors, scientists and publishers PRISM may yet serve a positive and constructive purpose as an advocacy group. But if PRISM remains little more than an "astroturf" organization that defends the commercial interests of publishers then it's not likely to gain the needed respect from any of the parties that it needs to influence in this debate. Publishers in general are reluctant to engage their markets in a more conversational manner, but if scholarly publishers can position PRISM as a tool to build an honest conversation about the future of commercial and non-commercial scholarly publishing then they may be able to make some headway. At the moment I wouldn't bet on that happening, but you never know.

PMR: The first thing to understand, especially for non-Americans such as me, is that this is in large part an American activity. Agreed that there are several large multinational publishers who are not strictly American, but in general it's highly US-centric.

This type of activity is not new; those of us who tackled the issues with PubChem will have seen rhetoric such as this from Rudy Baum: C&EN [Amer. Chem. Soc.]: Editor's Page - Socialized Science [2004]

National Institutes of Health director Elias A. Zerhouni seems hell-bent on imposing an "open access" model of publishing on researchers receiving NIH grants. His action will inflict long-term damage on the communication of scientific results and on maintenance of the archive of scientific knowledge.

More important, Zerhouni's action is the opening salvo in the open-access movement's unstated, but clearly evident, goal of placing responsibility for the entire scientific enterprise in the federal government's hands. Open access, in fact, equates with socialized science.

Late on Friday, Sept. 3, NIH posted its proposed new policy on its website, setting in motion a 60-day public comment period (C&EN, Sept. 13, page 7). Under the policy, once manuscripts describing research supported by NIH have been peer reviewed and accepted for publication, they would have to be submitted to PubMed Central, NIH's free archive of biomedical research. The manuscripts would be posted on the site six months after journal publication.

Many observers believe that, if the NIH policy takes effect, other funding agencies will quickly follow suit. In short order, all research supported by the federal government would be posted on government websites six months after publication. This is unlikely to satisfy open-access advocates, who will continue to push for immediate posting of the research.

I find it incredible that a Republican Administration would institute a policy that will have the long-term effect of shifting responsibility for communicating scientific research and maintaining the archive of science, technology, and medical (STM) literature from the private sector to the federal government. It's especially hard to understand because access to the STM literature is more open today than it ever has been: Anyone can do a search of the literature and obtain papers that interest them, so long as they are willing to pay a reasonable fee for access to the material.

What is important to realize is that a subscription to an STM journal is no longer what people used to think of as a subscription; in fact, it is an access fee to a database maintained by the publisher. Sure, many libraries still receive weekly or monthly copies of journals printed on paper and bound as part of their subscription. Those paper copies of journals are becoming artifacts of a publishing world that is fast receding into the past. What matters is the database of articles in electronic form.

[...]

Which is, I suspect, the outcome desired by open-access advocates. Their unspoken crusade is to socialize all aspects of science, putting the federal government in charge of funding science, communicating science, and maintaining the archive of scientific knowledge. If that sounds like a good idea to you, then NIH's open-access policy should suit you just fine.

I have not posted this in full as it is copyrighted, but I have given the weblink and I am sure its author would wish as many people as possible to read it. I suspect it reflects the thoughts and motivations of the other PRISMites. It is significant that the terminology used here - "private sector", "socialized science", "putting the federal government in charge" - closely echoes the PRISM language.
So what is PRISM's purpose? I suspect it is primarily to lobby the political process in the US to put pressure on the NIH to withdraw or moderate its support for Open Access. (I cannot envisage that they are going to convince the Wellcome Trust to stop funding "junk science" by engaging in Socratic debate. Indeed I don't think PRISM cares about the scientific community except as a source of revenue.) What they intend to do is use their junk facts and arguments to convince congressmen and governors in the US to support their cause.

Should I care? Yes, because we cannot afford to lose any battles. This will be hard fought and probably dirty, and it may not be easy to see when and where the lobbying is. See this news report about a Governor overstepping the line. If, as we managed in the PubChem struggle for Open Data, we are able to convince the US politicians that Open Access is for the benefit of us all, then we make the next step easier. And if we lose it gets harder. OA is a series of skirmishes.
So by all means demolish their arguments and provide our own. But also keep a very close watch.

In conclusion I see no need for any non-American publishers to take the slightest involvement in PRISM. It is so clearly a US political lobbying organisation without other substance that a mid-rank, especially society, publisher would have nothing to gain and considerable reputation to lose. I very much hope that the people I know in the publishing industry will persuade their organizations to steer clear.
Unfortunately I now have the feeling that battle lines have been drawn. I had hoped there might be a gentle evolution (if far too slow) towards more modern approaches. Now I think we see a fracture line.

Berlin 5: Open Access to Research Data: surmountable challenges

This is the abstract I have submitted for the Berlin 5 meeting: “Berlin 5 Open Access: From Practice to Impact: Consequences of Knowledge Dissemination”
Open Access to Research Data: surmountable challenges

Many scientists and organisations have recognised the power and importance of "Data-driven Science", where existing data is a primary resource in scientific research. In some communities (astronomy, particle physics, and some biosciences) this type of work flourishes and the primary challenges are technical - size, complexity, metadata, automation, etc. In many fields, however, and in almost all multidisciplinary endeavours, the major obstacle is finding scattered, heterogeneous data. Many of the data first occur in scholarly publications and, while they can be interpreted and understood in low volume by humans, are poorly presented for re-use by machines. As an example, over 1 million new chemical compounds are published yearly, but they are scattered through hundreds or thousands of journals.

In principle this could be solved by robotic indexing and the use of search engines. In chemistry, for example, we have developed text-mining techniques which can recognise over 80% of the chemical terms in mainstream publications as chemicals, and can identify a similar percentage. Our tools could rapidly index the scientific chemical web and add significant semantic value.
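The flavour of such a shallow indexing pass can be sketched as follows. This is a toy illustration only, not our actual text-mining tools: the suffix list, the tiny gazetteer and the function name are all invented for the example.

```python
import re

# Toy sketch of chemical-name recognition for robotic indexing.
# Real tools use much richer grammars and dictionaries; here we combine
# a small gazetteer with crude suffix heuristics (both invented for
# illustration) to flag tokens that look like chemical names.
CHEMICAL_SUFFIXES = ("ane", "ene", "yne", "ol", "one", "ide", "ate", "ine", "amide")
GAZETTEER = {"benzene", "toluene", "ethanol", "aspirin", "glucose"}

def find_chemical_terms(text):
    """Return tokens from `text` that look like chemical names."""
    hits = []
    for token in re.findall(r"[A-Za-z][A-Za-z0-9\-]+", text):
        t = token.lower()
        if t in GAZETTEER or (len(t) > 4 and t.endswith(CHEMICAL_SUFFIXES)):
            hits.append(token)
    return hits

print(find_chemical_terms("The mixture of toluene and ethanol was heated."))
# → ['toluene', 'ethanol']
```

Run over full text at scale, even a pass this crude would let a search engine attach chemical semantics to articles; the real difficulty, as the next paragraph argues, is not technical but legal.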

The biggest problem, however, is that many publishers forbid or obstruct this activity. Most chemistry journals are closed and thereby immediately inaccessible to many. Even for subscribers there are usually lengthy licences which are fuzzy and difficult even for experts to interpret. There is an embedded fear of breaching publishers' conditions, either by breaking copyright (even unintentionally) or by being cut off by the publisher's machinery (anecdotally very common). Many publishers specifically forbid robotic indexing.

The problem is solved for any "Open Access" publisher that adopts the spirit of the BBB declarations. Taken logically BBB requires that all content can be indexed and downloaded without permission. Unfortunately many publishers use "Open Access" but decorate their web site with additional licence conditions which are logically and ethically incompatible.

The label "Open Access" is a weak tool when describing access to, and re-use of, data. I and others have promoted the term "Open Data" (http://en.wikipedia.org/wiki/Open_data and references therein) to describe the need to consider data as a critical resource which needs political and legal activity. The use of Creative/Science Commons licences is extremely valuable but will need refinement as the principles of Open Access and Open Source do not translate automatically to data.

I shall give demonstrations of Open Data resources and outline some of the issues that the scholarly community must address rapidly if we are not to be impoverished by the "land grabbers" in the digital dataverse. We need a radical rethink of conventional information protection and need to be braver and more outspoken.

[Note: This was written pre-PRISM. I am concerned that if PRISM has any traction it will impact on Open Data as well as Open Access and will blog this later.]

Emotion and logic and PRISM

I've taken a week off blogging to write code and woken up to find I have nearly missed PRISM. PRISM is a publishers' alliance which appears to be solely devoted to protecting twentieth century business methods by whatever process is expedient. I've come to the sad position that, unless I breathe deeply, I take the default position that publishers are a problem to be overcome, not part of the way forward.
So I breathe deeply. I work with some wonderful people in the publishing industry. The list isn't exhaustive and I hope they aren't embarrassed:
  • Timo Hannay from Nature. An early sponsor of our work and champion of innovation
  • David James, Richard Kidd and Colin Batchelor from RSC (and Alan McNaught), who have supported our work for several years. Colin was here on Tuesday, helping to develop methods for semantic chemistry
  • David Martinsen from ACS, who has consistently supported new ideas and run the spring ACS meeting on new ideas in publishing
  • Brian McMahon and Peter Strickland from IUCr who have also supported our work and built superb scientific semantics.
PMR: So it's sad to see the other side - the industry reacting viscerally to threats. Here is Bill Hooker, reporting Peter Suber and adding comments:

From Peter Suber:

The AAP/PSP has launched PRISM (Partnership for Research Integrity in Science & Medicine). I'm quoting today's press release in its entirety so that I can respond to it at length:

A new initiative was announced today to bring together like minded scholarly societies, publishers, researchers and other professionals in an effort to safeguard the scientific and medical peer-review process and educate the public about the risks of proposed government interference with the scholarly communication process.[much egregious lying]

Anyone who wishes to sign on to the PRISM Principles may do so on the site.

Bill: Fortunately for us all, Peter has already responded; I won't excerpt his point-by-point rebuttal here, you should go read it all. This is disgusting. This runs counter to everything that science, academia, scholarship (and scholarly publishing!) stand for.

There are no names on the PRISM site yet -- but I'm going to find as many as I can and publish them here. Sunlight is the best disinfectant, and I want to know just who is taking part in this revolting effort to steal from the commons and turn public goods into private profit.

(We can start with the AAP: their members page is essentially one long list of companies and organizations with whom I will assiduously avoid doing business until and unless they dissociate themselves from PRISM, and preferably from the AAP altogether.)

More later. Oh yes indeedy.

PMR: The arguments from the PRISM community are not new - primarily that OA destroys peer review and therefore science/scholarship. This is, of course, completely fallacious. If you wish to see a clinical dismissal of the publishers' position read PeterS. Otherwise imbibe the raw emotion of Bill.
PMR: To amplify (again reported by PeterS):

Tom Wilson, Publisher panic, Information Research Weblog, August 24, 2007.

The commercial journal publishers are really in a state of panic. Reports from various sources point to their launch of PRISM: The Partnership for Research Integrity in Science & Medicine, a lobby organization to help them try to persuade the US Congress (and presumably Parliament in the UK) to ban Open Access. Of course, they don't say that: we have the usual weasel-worded statement that lobby organizations in the USA seem to be adept at....

[On the alleged threat to peer review] they are simply lying, and they know it. Free OA scholarly journals operate the same peer review process as do commercial journals: if they didn't, scholars wouldn't publish in them; but free, collaboratively supported journals are growing in number and take away submissions from the commercial journals, which will find it harder and harder to maintain quality....

What this recent initiative by the publishers points to is that the only sure way for the scholarly communities to take charge of the scholarly communication process is to rid themselves of their commercial exploiters and promote the publication of free, collaboratively produced and subsidised journals....

PMR: What disappoints me is that few of the conventional publishers have taken a positive view about the future. The future is EXCITING. The publishers are obstructing us getting there. Even the more forward-looking ones.
Part of the problem is that publishing is a cross between a public service and a commercial business. It hasn't worked out where it stands and where it should stand. It is becoming increasingly clear that if it takes the business route it will go down the video media route typified by the appalling FACT [1] adverts on DVDs. (These are the ubiquitous adverts telling you what will happen if you copy the DVD you have bought or rented. It really sets the scene for an evening's watching.) Perhaps we should have:
"You wouldn't steal a car?"
"You wouldn't steal a TV?"
"If you read a scientific paper you are not entitled to this is THEFT!!!!"
And it should be mandatory to read this declaration for 30 seconds before you are allowed to read the paper.
After all, I am not just a scientific reader of a paper; I am a potential thief. And I should be told what dire fate awaits me if I dare to read scientific research I haven't paid for. I shall have more replies from publishers to publish shortly.
Meanwhile back to Java.
[1] ADDED LATER. FACT is the Federation Against Copyright Theft - at least in the UK. Every time you watch a movie - at home or in the cinema - you are treated to an obligatory series of advertisements about the immorality, illegality and cost-ineffectiveness of pirate videos and movies. For many people it spoils the experience of the work. That's increasingly how I feel about the conventional publishing industry. Having my work described as "junk science" when it is published in Open Access journals is simply an illiterate insult. Having Open Access described as "ethically flawed" is as bad.
Publishers should be enhancing the process and quality of scholarly communication. A publication should be something in which all can take some pride, not simply a piece of commerce to be defended.

What do we mean by open science?

There seems to be a critical mass of activity in the Open Science camp - possibly sparked off (or at least given amplification) by scifoo. Here is a very useful summary from Bill Hooker (Timo, invite Bill to scifoo next year). Bill missed the Second Life event (so did I, and I'm disappointed, but I really had other things to do)...

(Addressed in absentia to "Tools for Open Science", Second Life, Aug 20 2007. I am sorry I could not be there.) I think we all know what we want, and I think we all want much the same thing, which boils down to just this: cooperation. A way forward for science, a way out of the spiralling inefficiency of patent thickets, secret experiments and dog-eat-dog competition. But we use a variety of terms, and probably mean slightly different things even when we use the same terms. It might -- I am not sure -- be useful at this point to come together on an agreed definition for an agreed term or set of terms -- something equivalent to the Berlin/Bethesda/Budapest Open Access Declarations.

If this does not seem like a "tool for open science", consider what the BBB definition has done for Open Access. It provides cohesion, a point of reference and a standard introduction for newcomers, and acts as a nucleation center for an effective movement with clear and agreed goals. Since this SL session takes off from SciFoo, and SciFoo is by all accounts very good at converting brainstorming sessions into practical outcomes, I thought perhaps the idea of a definition or declaration of Open Science might be a suitable topic. In what I hope is the spirit of SciFoo, here are some ideas that might be useful in such a discussion.

Terms

Whatever this thing is, what should we call it? There are a number of terms in use:

  • Open Science -- has the weight of Creative Commons/Science Commons behind it, via iCommons
  • Open Source Science -- Jamais Cascio, Chemists Without Borders
  • Open Source Biology -- Molecular Biosciences Institute
  • I think "biology" too narrow -- there seems little point in Open Chemistry, Open Microbiology, Open Foo all having different names. I think Open Source Foo too likely to lead to confusion with software initiatives, and too likely to lead to pointless arguments about what the "source code" is.
  • That leaves Open Science, which would be my choice for an umbrella term. A case can be made, though, for Open Research, on the same basis on which I argue against Open Biology etc -- see this comment from Matthias Röder
  • Another "inclusive" possibility is to focus on information -- Open Data, as per PMR's wikipedia entry, or the broader Open Content. In the same vein, the Open Knowledge Foundation provides a fairly comprehensive definition of Open Knowledge.
  • I have seen "Science 2.0" around quite a bit lately, though it's a bit too marketing-speak for my taste
  • Open Notebook Science is a very specific subset of Open Science: if your notebook is open to the world, there's not much confusion about access barriers! It even comes with its own motto: "no insider information". This is as Open as Open gets.

Sources and Models

We don't have to re-invent the wheel:

Flexibility

We don't want to start a cult, and we don't want to bog anyone down in semantics. There's no purity test or loyalty oath. My own view is that Open Science (or whatever we end up calling it) is not an ideology but an hypothesis: that openly shared, collaborative research models will prove more productive than the highly competitive "standard model" under which we now operate.

Openness in scientific research covers a range of practices, from tentative explorations with a single small side-project all the way to Open Notebook Science à la Jean-Claude, and we should welcome every step away from the current hypercompetitive model. Open Notebook Science provides a useful marker for the Open end of the spectrum; perhaps all a Declaration need do is identify the minimum requirements that mark the other end of the spectrum?


Conditions

What standards must a research project or programme meet in order to be considered Open?

  • obvious: Open Access publication
  • equally crucial: Open Data, that is, raw data as freely available (including machine access) as OA text
  • probably indispensable: Open Licensing so as to avoid confusion as to what is truly available and for what purposes; as per Peters Suber and Murray-Rust, this must be
    • explicit
    • conspicuous
    • machine-readable
  • Open Semantics: perhaps none of this will be much good without metadata and standards to allow interoperability and free flow of information
  • desirable: Free/Open Source Software
  • David Wiley: "four Rs" of Open Content (cf. Stallman's four fundamental freedoms for software):
    • Reuse - Use the work verbatim, just exactly as you found it
    • Rework - Alter or transform the work so that it better meets your needs
    • Remix - Combine the (verbatim or altered) work with other works to better meet your needs
    • Redistribute - Share the verbatim work, the reworked work, or the remixed work with others
  • OKF definition of Open Knowledge
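The Open Licensing criteria above - explicit, conspicuous and machine-readable - can be illustrated with a small sketch. This is hypothetical: it assumes the common convention of a rel="license" link in a page's HTML, and the function name and host list are invented for the example.

```python
import re

# Hypothetical checker for machine-readable licensing.
# One widely used convention is an <a rel="license" href="..."> link;
# this toy looks for such a link (assuming rel precedes href, which real
# pages need not guarantee) and reports whether it points at a host
# associated with open licences. The host list is illustrative only.
OPEN_LICENSE_HOSTS = ("creativecommons.org", "opendatacommons.org")

def check_machine_readable_license(html):
    """Return (is_open_licence, licence_url) for an HTML fragment."""
    match = re.search(r'<a[^>]+rel="license"[^>]+href="([^"]+)"', html)
    if not match:
        return (False, None)           # no machine-readable licence found
    url = match.group(1)
    is_open = any(host in url for host in OPEN_LICENSE_HOSTS)
    return (is_open, url)

page = '<p>Data: <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">CC-BY</a></p>'
print(check_machine_readable_license(page))
# → (True, 'http://creativecommons.org/licenses/by/3.0/')
```

The point of the sketch is that a machine can only honour a licence it can find; explicit and conspicuous help the human reader, but machine-readable is what lets robots and aggregators act on the permission automatically.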

PMR: This is really useful. I can't think of significant alterations. No-one is suggesting that science is altruistic - it can be hard and cruel as well as beautiful. And science doesn't care who wins, but knows that the more who play by the rules the greater the progress and enlightenment.

Open availability of tools, methods, specimens, results, recipes, codes, data, etc. MUST enhance science. Not providing them simply impoverishes the field and provides personal gain at the expense of the rest. Scientists are people and they want to succeed personally.

I am very fortunate that the scientists I have known and who have acted as my mentors have been fantastic people. They have nurtured younger scientists, built a sense of community, fostered international science, cared about the human race. That is not a necessary part of science, but it is sufficiently common that it is worth striving for even if, occasionally, it leads to a non-optimal decision in the prisoner's dilemma.

scifoo: Cameron Neylon on Open Notebook Science

More on Open Science from Jean-Claude Bradley. It's sad to see how paper-driven we have become. It's critical to publish, but I continually sense an increasing pressure of "I need a paper - what's the most cost-effective way of getting one?" This is Jean-Claude on Cameron Neylon

22:22 23/08/2007, Jean-Claude Bradley, Useful Chemistry
There has been a lot of discussion lately about the philosophy of Open Science in general terms.

This is certainly worthwhile but I think it is even more interesting to discuss the mechanics of its implementation. That is what I was trying to push a little more by setting up the "Tools of Open Science" session on SciFoo Lives On.

That's why I've been very impressed by Cameron Neylon's recent posts in his blog "Science in the Open".

He has been discussing details of the brand of Open Science that interests me most: Open Notebook Science, where a researcher's laboratory notebook is completely public.

Cameron has been looking at how our UsefulChem experiments could be mapped onto his system and this has sparked off some interesting discussion. I am becoming more convinced than ever that the differences between how scientific fields and individual researchers operate are much deeper than we usually assume.

By focussing almost entirely on the sausage (traditional articles), we tend to forget just how bloody it actually is to make it and we probably assume that everybody makes their sausage the same way.

The basic paradigm of generating a hypothesis then attempting to prove it false is certainly a cornerstone of the scientific process but it is certainly not the whole story. However, after reading a lot of papers and proposals, one gets the impression that science is done as an orderly repetition of that process.

What I have observed in my own career, after working and collaborating with several chemists, is that most of the experiments we do are done for the purpose of writing papers! The reasoning is that if it is not published in a journal, it never happened. This often leads to the sunk-cost syndrome, similar to a gambler throwing good money after bad, trying to win back his initial loss.

After a usually brief discovery phase, the logical scientist will try to conceive of the fewest experiments (preferably of lowest cost and difficulty) needed to obtain a paper. In this system, as in a courtroom, an unambiguous story and conclusion is the preferred outcome. Reality rarely cooperates that easily, and that is why the selection of experiments to perform is truly an art form.

We're currently going through that process. We have an interesting result observed for a few compounds and a working hypothesis. That's not enough for a paper in my field. We cannot prove the hypothesis without doing an infinite number of experiments but we are expected to make a decent attempt at trying to falsify it. I know from experience roughly the number of experiments we need with clear cut outcomes to write a traditional paper.

So how much more value to the scientific community is that paper relative to the single experiment where this effect was first disclosed on our wiki then summarized on our blog?

Is this really the most efficient system for doing science or is this the tail wagging the dog?

When the scientific process becomes more automated, I predict that the single experiments will be of more value than standard articles created for human consumption and career validation.

[...]
One of the most useful outcomes of Open Notebook Science (and why I'm highlighting Cameron's work) might be the insight it will bring to the science of how science actually gets done. (Researchers like Heather Piwowar should appreciate that)

This is where it starts - the passion, the innovation and publicity of people who want to change the current complacency. The exciting thing is that the Internet makes that possible. Within months.

scifoo: the mindless impact factory

More scifoo follow-up from Richard Akerman. No comment from me needed. I'm leaving the Second Life picture because ...

open science and the impact factory


Jean-Claude Bradley instigated a session in Second Life - SciFoo Lives On: Open Science.

[SF-SL-004]

Next week will be something like "Medicine 2.0".

You can see in the transcript that one part of SciFoo that definitely lived on was a discussion around Open Science and webliometrics, both definitions and how to handle impact. It seems to me that we get tangled in endless debates about definitions. I have proposed that the nodalpoint Open Science wiki page be used to come to a consensus definition, but in the meantime:

open science
opening your scientific activities up to public examination, making work available without it having gone through formal peer review
peer review
The process of a group of scientific peers assessing the quality of a submitted piece of scientific work, currently most commonly associated with gatekeeping into a scientific publication, wherein it may also involve aspects of improving both the scientific thinking used in the paper and the expression thereof. There is no relationship between peer review and closed or open access.
open access
making a publication available without subscription fee, but possibly with usage limitations
free access
unfortunate term due to existing definition of open access, adding element of unrestricted usage and reuse (e.g. text mining)
impact factor
An imperfect measure of the scientific "importance" of an entire journal. Misused to measure the quality of individual scientific output.

(Marked up using HTML definition lists, which you have probably never heard of, which incidentally is why the Semantic Web will fail.)
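For anyone who genuinely hasn't met them, a definition list pairs terms with their definitions using three HTML elements: `<dl>` wraps the list, `<dt>` marks each term and `<dd>` its definition. A minimal sketch of how the first entry above might be marked up:

```html
<dl>
  <dt>open science</dt>
  <dd>opening your scientific activities up to public examination,
      making work available without it having gone through formal
      peer review</dd>
  <dt>impact factor</dt>
  <dd>an imperfect measure of the scientific "importance" of an
      entire journal</dd>
</dl>
```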

Yes, there are many types of peer review in different disciplines, and yes, things are often considered published and citable without having gone through peer review, such as conference papers and presentations which often go through a sort of editorial board selection instead.
I know these definitions are far from perfect, but good lord, can we get to good enough and go beyond this debate?

What I keep hearing is, how can we impact factorize open science. Well, the answer is, you can't. Let's stop trying to find some magic algorithm whereby a machine tells us what quality science is. What's completely mad to me about this is that we already have processes to assess science quality. Every time you review a new student, every time you look at a grant proposal, heck, even on the infamous tenure committees and research assessments, a group of humans looks at a portfolio of existing or proposed work, and decides whether it is good enough.

So if I may modestly propose, let's continue to do that, and no one other than journal publishers should ever look at impact factor numbers again. Arise, qualitative assessment; begone, quantitative nonsense.

There is still a place for technology, but it's not in providing some bogus seemingly quantitative quality measure. It's in enabling us all to present our scientific portfolios online, or to use Euan's words, our "professional lifestreams". And there is a real problem to be solved. It starts with students and their scholarly output stuck in closed university systems. Students move around. Scientists move around. Their work history should move with them, not be lost in some scholarly dark web, or frozen as some web page at their previous institution that they can no longer access.

The European e-Portfolio is one effort to address this for students.
Electronic Theses and Dissertations is another piece.
The next step is to have those integrate into some, shall we say, flow or... flux (sekrit inside Nature joke) of the rest of their scholarly activity when they graduate. Bookmarks created, databases curated, papers reviewed, etc. etc.

That's the technology piece.

The other piece, however, cannot be solved with technology.

Find better ways for humans to review scholarly portfolios and make decisions based on them. That's going to address this problem of evaluation far better than anything else.

SIDEBAR
And of course, once you have all this info circulating around, you can do some side bits with technology, like ranking relevance to help people find the best, most relevant work in the flood of science that is sloshing around. Usage factor and other metrics may all help in recommending things to read.
END SIDEBAR

References

Richard Monastersky, "The Number That's Devouring Science", Chronicle of Higher Education, Volume 52, Issue 8, Page A12 (2005)

The PLoS Medicine Editors, "The Impact Factor Game", PLoS Med 3(6): e291 doi:10.1371/journal.pmed.0030291 (2006)

Peter A. Lawrence, "The Mismeasurement of Science", Current Biology, 17 (15), R583, doi:10.1016/j.cub.2007.06.014 (2007)

Bruno Granier, "Impact of research assessment on scientific publication in Earth Sciences" (PDF), a presentation at ICSTI June 2007 Public Conference on Assessing the quality and impact of research: practices and initiatives in scholarly information

Richard Akerman, "Web tools for peer reviewers...and everyone" (PDF), a presentation at ICSTI June 2007 Public Conference on Assessing the quality and impact of research: practices and initiatives in scholarly information

Corie Lok, "Scifoo: day 1; open science" (2007)

Alex Palazzo, "Scifoo - Day 2 - Science Communication" (2007)

Alex Palazzo, "Scifoo - Day 3 (well that was yesterday, but I just didn't have the time ...)" (2007)

Previously:
June 2007 Science Library Pad: ICSTI 2007 category

berlin5: Berlin 5 Open Access

I am delighted to have been asked to present on the topic of Open Data at "Berlin 5".

The University of Padua, the CRUI (Council of Rectors of Italian Universities) and the Max Planck Gesellschaft are pleased to announce that the fifth conference in the “Berlin Declaration” tradition will take place on September 19-21, 2007 in Padua, Italy, with the title “Berlin 5 Open Access: From Practice to Impact: Consequences of Knowledge Dissemination”. The aim of the conference will be to bring together the various initiatives and key players within the Open Access movement in order to:

— maintain the enthusiasm of all people involved in the Open Access field,

— have an overview of the developing tools that sustain Open Access in scientific data and cultural heritage dissemination,

— develop the effective strategies that can contribute to the construction and implementation of this new paradigm of the scholarly communication world.

Further details are available on the conference website.

Program

The general subjects of the conference will focus on:

a) state-of-the-art of the sharing of the Berlin Declaration vision: survey on the impact of the new paradigm in the institutions that signed the declaration; supporting bodies policies and activities in favour of innovative scholarly communication processes;

b) the Open Access scene in the developing countries and emerging economies: strategies, achievements, impact;

c) Open Access and the e-science: how to support the free circulation of scientific raw data to facilitate cooperation and effective reuse;

d) e-publishing: the emerging of new strategies in scientific data dissemination; estimate of the impact in OA journals: new tools for scholarly evaluation in the growing layer of Open Access publications; the perspective of a changing landscape in the scientific journals policies; progress reports on the transition from reader-pays to author-pays models;

e) ICT developments and collaborations that support e-publishing and Open Access.


It's very useful to tag pre-conference posts so that attendees can get an idea of the issues. This works very well with ICT conferences - zillions of posts on www2007, scifoo, etc. So I'm tagging this with berlin5 and suggest that anyone interested do the same. I will probably manage a few posts before the meeting and hope to report back on this blog.

Special issue of CTWatch on the coming revolution in scholarly communication

I have been busy with grants and hacking so have been away from the blog. (Making good progress on new ways of inputting and displaying chemistry). Here is a very important set of papers which are all highly relevant to this blog. I had hoped to find time to comment on each individually, but for now here is the table of contents.

18:45 18/08/2007, Peter Suber, Open Access News

The August issue of Cyberinfrastructure Technology Watch (CTWatch) is devoted to The Coming Revolution in Scholarly Communication and Cyberinfrastructure. All the articles are OA-related:

I will return to these as and when.

Also my mind is starting to turn to the Berlin 5 meeting ... next post

scifoo: One chemical per one laptop?

On the Open Knowledge Foundation blog I noticed a call for projects related to One Laptop Per Child (which we saw at scifoo). I'm wondering what we could do in chemistry - there is so much around and so much that would be fun to do...


Tomorrow is the first day of the Northern Summer of Content 2007. The Summer of Content is an initiative of WikiEducator and the One Laptop Per Child project. Inspired by Google’s Summer of Code, the programme aims to match creators with mentors and stipends to “develop open content and run free culture events throughout the world”. The Northern pilot will run until the end of September and a Southern version will run from December 2007 to February 2008.

The organisers place an emphasis on community in content production, and aim to create what they call “a self-supporting networked ecosystem of projects”. They aim to educate participants about open licensing, meta-data and accessibility, as well as providing support for technical aspects of creating content. A list of proposed projects can be found here.

PMR: Here are the current ideas:

Articles in category "Summer of Content proposals"

There are 33 articles in this category.


PMR: I haven't read these, but what could we do in chemistry to create content? Wikipedia has fantastically good chemistry (even though most "academic" chemists aren't interested and don't contribute). It would be easy to do this on a one-compound-per-laptop basis - each volunteer gets one compound to find out as much about as they can. Or, perhaps, one product. Many products have a list of ingredients - I have a mineral water bottle that lists Calcium, Magnesium, Potassium, Sodium, Bicarbonate, chloride, fluoride, nitrate, sulphate (sic). Since most kids can't do real experiments in the classroom any more, here's a list of real chemicals doing real things. And there are lots of wikis and blogs in the chemical blogosphere that might be interested.
But I am sure that others have more exciting ideas than this.

When does open science work?

It's funny how things turn out in the blogosphere. I'd posted about how ludicrous copyright on dead scientists' work is (Copyright madness - story 2) and expected some comment from the librarian community. Silence (there's still time to comment!). I got a brief exclamation of horror from BlackKnight and, to check that this wasn't spam, visited his blog. And I saw the Green Fluorescent Wow! My comments about this example of immediate Open Notebook Science have turned into a thread on when and whether to publish results on blogs. Here's Black Knight:

Whee. I checked Technorati this evening, as you do (seeing as the bastard spammers have destroyed the usefulness of trackbacks), and discovered that yesterday's post was spreading ripples in the blogospheric pond. It came first to the attention of Peter Murray-Rust, who has a thing (a good thing! — I hasten to add) about open access and open science in general, and thence to the open science community itself, in the shape of Cameron Neylon.

Funnily enough, Neil Saunders then picked it up from the OpenWetWare people, and I do some digging and find that not only did Neil do a DPhil but that he is now in Bostjan Kobe's lab. Bostjan is a long-time collaborator of my previous boss and from meeting him at Lorne he seems to have quite forgiven me for not going to work for him when I had the chance (or forgotten about it).

Brisbane, bleh.

So, anyway, it turns out that I had previously made contact with OpenWetWare, and talked about them over a year ago. Which just goes to show that (a) it's a small world and (b) incest is more fun than you'd think.

But all that is not really what I wanted to write about now. The OpenWetWare (have you any idea how difficult it is to type that?) project is a laudable effort to promote collaboration within the life sciences. And this is cool, but then I realize that the devil is in the details.

Share my methods? Yeah! Put in some technical detail? Yea–hang on.

For sure, the 'Green Fluorescent Wow!' experiment (HT to Peter) was pretty simple and straightforward: An easy cloning experiment with a slight cleverness in choice of reagents, no IP and nothing particularly smart. But I've got other experiments underway that are clever, and potentially very exciting.

So can I write on my weblog about them? And how much detail can I give? If I say "My protein seems to do something odd to cell-motility", is that an elegant sufficiency of detail? Surely people will get bored with generalizations, but am I right to worry, as one of our PIs does, that I might compromise my project by posting too much detail? Should I really be posting pictures of cells that are doing odd things?

It's not a case of "Can I trust you bastards not to steal my work?" but balancing the ideal of 'open source science' with the need to publish before anyone else. I have responsibilities — to the boss and to my cow-orkers —, but I also want to share the fun and joy and heartache of this vocation.

So it's all a little bit confusing, really. I want to bounce when experiments work, and scream and shout when I have a 'little technical difficulty'; but how much can I say without compromising stuff? Seriously, I have lots I want to write about, but am not sure whether I should.

Comments (Cameron Neylon)

I think this is the real key to the whole thing. Will it compromise your work/career/future happiness? If everyone shared and was honest then it should work. What we need is some game theory/evolutionary biology person to tell us how many people it takes before we can support freeloaders.

But I agree it takes a shift in the way science works for people, especially postdocs, to be ready to risk making their knowledge base available. It would be absolutely key for people to get credit, and citations in some form, for making protocols/data available.

  • I don't think anyone is suggesting that all science everywhere be automatically Open (e.g. there is - as yet - no Richard Stallman of Open Notebook Science). At an obvious level we have industrially sponsored projects in our group and we are required (and quite happy to be required) to check all new discoveries with the sponsors.
  • It's very domain-dependent. In some areas such as maths and physics the idea can be the whole thing - the Watson-Crick DNA model or the Franklin data can communicate the whole message in a few seconds. But most science is made up of tedious work, much of which cannot be communicated so quickly.
  • It's often thought that the "idea" is the most important thing. And sometimes it is. But most ideas don't work out. Who has not had the grey-haired community (I'm one) telling them that it won't work? And they are sometimes right. So in many cases the credit goes to someone who has stuck doggedly at making a half-baked idea work.
  • There are, however, many cases where "your idea" has already been tried elsewhere and failed, and the reasons for its failure are documented and possibly understood. An awful lot of this happens at bars at conferences. "We're trying to see how protein X might interact with Y" (not giving too much away). "Have you made Y?" "No, we're trying A's method." "Oh, we tried that for 2 years and we couldn't get it to work; and A also published Z which didn't work either. So we distrust stuff from that lab." If this is not disinformation then it's very valuable and could save wasted years of work. Remember that the unproductive work has to be put on the balance sheet as well.
  • So - as Cameron says - it's a game. You weigh the value of releasing your idea against the value of not releasing it. Either could be positive or negative.
But there are other pluses to publishing work on the Web:
  • You make a reputation. I'm not hiring green fluorescent rats but if I were I would recognise the applicant.
  • Your contacts may be genuine collaborators, not competitors. So, for example, if I were interested in a multidisciplinary programme on malaria including medicinal chemistry I would certainly wish to make contact with Jean-Claude Bradley. Indeed if I were interested in an open programme of screening compounds (as is the NIH) I would also make contact.

So "I've got something novel - I'm looking for collaborators" will become more common in the electronic era. Some of this will be public - other parts (as we heard at scifoo) will be mediated through brokers in private. Perhaps we'll see embargo periods - publish the science into a closed arena for a few months with a requirement that it then becomes public. Ideas and novel science are too valuable to be allowed to decay.