Monthly Archives: July 2008


I have been off the air for some time because of travel and also technical problems with the blog which Jim Downing has solved (thanks).


I hope to blog soon about data, repositories and escience among various topics.

ESOF2008 Alma Swan's session

Alma Swan has organised the session at Barcelona ESOF (Saturday, 18th July, 1630).

Sharing scientific data: who benefits?
Alma Swan, Key Perspectives Ltd, United Kingdom

Abstract: Digital datasets—text-based, numeric, audio, video or image-based—form the output of all scientific disciplines. How are these data being made available for sharing? What quality control mechanisms are in place? What kinds of naming conventions, tags, and metadata are in use and how effective are they at helping to manage open data? Who is storing, archiving and curating open data and at which levels? And how is the production and sharing of open data assessed: what processes are in place for crediting scientists for making their raw data openly accessible for sharing and re-use. How much can and should data publication replace traditional forms of publication of research findings?


We have been so busy with our summer program - semantic authoring and capturing of chemistry - that I haven't had a breathing space. I'll be blogging more about that. However a change of scene - tomorrow I'm in Barcelona at ESOF: The Euroscience Open Forum. I'll post more later. It's very important that Europe is a world leader in this arena.

ESOF: The Euroscience Open Forum

About ESOF

For too long, Europe was lacking an independent arena for open dialogue on the role of all the sciences, including the humanities, in society. We have it now with the Euroscience Open Forum. The initiative was taken in 1999 by the researchers themselves: the Euroscience Open Forum was brought to life by Euroscience.
Euroscience recognised the need for an interdisciplinary, pan-European meeting place for open dialogue and the exchange of ideas.

Visit the ESOF2008 web site

The ESOF concept

Science and technology are becoming increasingly important as they concern and affect everybody. The Euroscience Open Forum is not an ordinary scientific conference, but a totally new concept. It consists of a Forum for discussion of topical issues, an embedded conference (with an exhibition) to showcase European achievements right across the scientific and technological spectrum, and an outreach programme.

The outreach programme consists of a large number of events and happenings throughout the ESOF host city, which are targeted to the public at large of all ages. At ESOF2004 in Stockholm, the outreach programme “Science in the City” attracted 11000 visitors. At ESOF2006, the outreach programme was linked to the “Wissenschaftssommer”, attracting some 60000 visitors.

ESOF also serves as a young scientists’ forum, encouraging students, PhD students and post-docs to share their experience and participate in debates about such subjects as the European Charter for Researchers, how to motivate young people to engage in scientific careers, and how the construction of the European Research Area enhances the prospects of young scientists.

ESOF’s aims are:

* Presenting scientific and technological developments at the cutting edge in all their variety from natural sciences to the social sciences and the humanities
* Stimulating the European public’s awareness of and interest in science and technology
* Fostering a European dialogue on science and technology, society and policy by offering a platform for cross-disciplinary interaction and communication on current trends and future roads for science and technology, their interaction with society and policy and the role of the public

ESOF’s European itinerary

The Euroscience Open Forum is held every other year, visiting the major scientific cities of Europe and bringing European science to the attention of all citizens.
The starting point of ESOF’s European journey was Stockholm, Sweden, in 2004. Two years later, ESOF’s itinerary brought the event to Munich, Germany. And, after ESOF2006, the route will continue southwards: ESOF2008 will be held in the capital of Catalonia, Barcelona, Spain. ESOF’s exciting host cities reflect Europe’s cultural diversity. Thus, you will experience that the spirit of every Euroscience Open Forum is different…

ESOF’s success depends on you, too!

You can contribute to this open dialogue on all the sciences and on their role in shaping a knowledge-based society.
ESOF invites individuals and organisations to submit their best ideas in the form of proposals for the programme. The best of these proposals will be selected for the Forum by a Programme Committee of international standing.

For information about ESOF2008, please visit (

You can also propose the next destination for ESOF’s travel plans. For further information, please contact us:

John Sulston calls for reform of IPR policy

Whether you support Open Access and Open Data or believe that Closed Access and patents are the best way of promoting high quality science, there is no doubt that restrictions on access to IPR are a major drain on scientific effort. We all spend a significant amount of time having to investigate contracts, and finding out whether or not we can actually do something. Now John Sulston has spoken out:

John Sulston, recipient of the 2002 Nobel Prize for medicine, has launched a new research institute, the Institute for Science, Ethics and Innovation at the University of Manchester. Sulston is using the launch to highlight his views on openness in science and the need to reform innovation and intellectual property policy. (Thanks to Subbiah Arunachalam.)

See the op-ed co-authored by Sulston and Joseph Stiglitz in the July 5 edition of The Times:

... The question of “Who owns science?” is therefore a crucial one, the answer to which will have broad-reaching implications for scientific progress and for the way in which the benefits of science are distributed, fairly or otherwise. Two of the most pressing issues concern equity of access to scientific knowledge and the useful products that arise from that knowledge. ...

The second issue we wish to highlight is that of access to science itself. The ideal shared by almost all scientists is that science should be open and transparent, not just in its practices and procedures, but so that the results and the knowledge generated through research should be freely accessible to all. There is a broad consensus in the scientific community that such openness and transparency promotes the advancement of science and enhances the likelihood that the benefits of science are enjoyed by all. For more than a hundred years, these principles have been the bedrock of academia and the scientific community.

We call upon all interested in the future of science to join with us in an active and open-ended search for answers.

See also coverage in The Times and the BBC.

PMR: I hope that this message finds its way to the policy makers in academia as they have the power and the responsibility to act. In many cases the academic staff are unable to find the information they want or to allow it to reach those that they would hope to collaborate with. Not only are there patent and copyright restrictions, but universities often sign draconian contracts with the gatekeepers of scientific information. For example software companies can revoke licences or even sue the universities if we publicize bugs in the program. Publishers require libraries to sign contracts that forbid the use of the information in ways that individual staff don't even know about. It's only hearsay but I understand that these can include "excessive downloads" or data-mining.

In no way can any of this be seen as anything other than holding science back.

In praise of Undergraduates

One of the highlights of my year is our summer program of undergraduate projects in the Centre. We've done this for six years and each student spends 8-10 weeks working on projects in Molecular Informatics.

I have been astonished and delighted by what the students have been able to achieve and the lasting legacy they have left and are continuing to leave. I'm leaving out names and will speak in general terms. The students are usually sponsored by an external organisation and we have built up good relations with quite a number - such as publishers and pharma companies. Some students are also supported by the Department, and some by Unilever. We advertise by word of mouth and by the subject email lists. In general the number of positions has roughly matched the number of applicants - this year we have four projects which are all filled and I hope to talk more about them in this blog.

Oscar - our chemical text- and data-mining/processing facility - sprang from summer projects (support from Royal Society of Chemistry and Nature Publishing Group). I am consistently delighted with the standard of the Oscar summer software - the Experimental Data Checker has run for nearly 5 years without needing any software support. CrystalEye sprang from a summer project sponsored by the International Union of Crystallography.

You might think that 2 months is too little time to do anything useful, and most of the time you would be wrong. It's not uncommon to start getting useful material in the first week. This is in some part because we work as a large team. Some of us Centre members hot-desk into the "training area" and we work communally - fixing each other's problems and discussing strategy.

Most of the students get to present to the sponsors and this has been very useful. One presented over a video link to the US office of the sponsor.

And there is a longer-term benefit - 5 of the students are now doing - or have just finished - PhDs with us. That has been an enormous benefit to the knowledge, expertise and culture of the Centre.

In more general terms, when anyone asks me how they are going to adjust to the rapid changes in modern thinking I advise them to include undergraduates in their team. If you are in the Library sector you have to understand how students think and act and the only way to do this is to work alongside them. You'll find that long-held views about metadata, bibliographies, customised databases, and the linear reading of articles no longer hold. The e-generation works differently. And it's often us who have to be educated.

I'm not involved in formal undergraduate education here (I have done some demonstrating) but if I were I would turn the system on its head and involve the students in preparing and delivering course material. They are pretty good at finding it, after all.

Open Access Data Repositories

Peter Suber has been working with colleagues to create a Wiki of Open Access Data repositories. From his blog:

List of data repositories

The Open Access Directory (OAD) list of Data repositories is now open for community editing.

OAD is a wiki, and you can help the cause by adding or revising entries to its lists.

Data repositories are becoming very important now and it's clear that they are primarily useful if they are Open. Some subjects such as bioscience have had a long history of Open data repositories - and if the Wiki listed every one it would dominate the field.

Of course there are lots of nuances to discuss. What is Data? And what is Open? I've spent time on this blog discussing these. At present I'll just reiterate that we should label data as "Open Data" (from the Open Knowledge Foundation), and that we should protect freedom with Community Norms, not licences or contracts.

Every creator of an Open data resource should label it as such. All you need is:

This material is Open Knowledge

Research Repository System

Chris Rusbridge of the Digital Curation Centre (Edinburgh, UK) has come up with a great idea which I think has captured the zeitgeist. He started with the "negative click" repository - and has mutated the name to Research Repository System. I was about to blog something just to say that I really supported his idea but hadn't time to comment more when I suddenly found SEVEN Posts on his blog.

Here's the latest post - it links back...

Research Repository System persistent storage

This is the seventh and last of a series of posts aiming to expand on the idea of the negative click, positive value repository, which I'm now calling a Research Repository System. I've suggested it should contain these elements:

  • spinoffs

I'll try to find time to add comments. However we are preoccupied and very actively building our own repository system here for crystallographic and chemical data in the Department and I'll be blogging bits as we go along. I'll try to keep in sync with Chris.

For me the true repository system has to be invisible...perhaps in the way the web is going. Universal = invisible. But that will take a while.