Open Bibliographic Data at JISC and #jiscopenbib

I attended the Open Bibliographic Data meeting in London and came away very excited. Here are a few thoughts.

Although bibliography is often regarded as a dry subject it’s actually incredibly relevant. We were asked for a compelling use-case for bibliography to be presented to Vice-Chancellors. Here was my suggestion:

Universities compete against each other. They do it in large part through bibliography. Really? Yes! The RAE or REF or whatever metric is increasingly based on bibliography. The Universities that manage their bibliographies best will be more visible in all sorts of metrics. Soton and QUT have a concerted policy on exposing their research, and they succeed. So a modern bibliographic tool is a sine qua non for a VC. I’m serious. Here are some questions that an Open Bibliography could contribute a lot to answering:

  • What subjects does my University publish?
  • Which departments co-publish?
  • Which other universities does mine co-publish with?
  • Which universities are starting to eat my lunch?

Open Bibliography can answer those!
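As a rough illustration of how little machinery those questions need, here is a minimal Python sketch. The records, field names and institution names are all invented for the example; a real Open Bibliography would supply the data, and its schema may well look different.

```python
from collections import Counter

# Hypothetical records: each one lists a subject and the institutions of its authors.
RECORDS = [
    {"title": "Paper A", "subject": "chemistry", "institutions": ["Uni X", "Uni Y"]},
    {"title": "Paper B", "subject": "chemistry", "institutions": ["Uni X"]},
    {"title": "Paper C", "subject": "physics", "institutions": ["Uni X", "Uni Y", "Uni Z"]},
]


def subjects_published(records, institution):
    """Count the records per subject in which `institution` appears."""
    return Counter(
        r["subject"] for r in records if institution in r["institutions"]
    )


def co_publishers(records, institution):
    """Count how often each other institution shares a record with `institution`."""
    counts = Counter()
    for r in records:
        insts = set(r["institutions"])
        if institution in insts:
            counts.update(insts - {institution})
    return counts
```

Calling `co_publishers(RECORDS, "Uni X")` on the toy data counts two shared records with "Uni Y" and one with "Uni Z". The same counting, run year by year, is one way a VC might see who is starting to eat their lunch.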

So here are some ideas that the meeting has converged on, drafted by Paul Miller. They’re not final, but Paul’s style is wonderfully brief and I doubt I could improve on it:

Universities should proceed on the presumption that their bibliographic data should be freely available for use and reuse. […]The default position remains transparency, unless the risk assessment can compellingly argue otherwise.

Just use CC-BY for creative works. Just use ODC-PDDL for facts.

DO NOT USE ‘Non-Commercial,’
DO NOT DEVELOP YOUR OWN LICENSE

That is all there is to it. By using open approaches you don’t have to explain, qualify, niggle. It simply works.

Given that, we can now develop powerful tools that speak directly to the world, including vice-chancellors. And, with open data and open source, they are cheap to build.

And #jiscopenbib is already building them.

Posted in Uncategorized | Leave a comment

The OABCD of Open Scholarship: In pictures

Is composed of the ABCD

OPEN ACCESS

Ben O’Steen has created B and C buttons in different sizes at https://bitbucket.org/beno/okfn_buttons/src

Please use these. They are environment-friendly

We expect to see these starting to appear on journal web pages and theses RSN


#OSS2010 Recording of my Open Science Summit Flowerpoint at Berkeley

Bryan Bishop has just released the videos from the Berkeley Open Science Summit which should go down as one of the seminal Open events of 2010.

http://fora.tv/partner/Open_Science_Summit

I am extremely grateful for being recorded

http://fora.tv/2010/07/29/Peter_Murray-Rust_Open_Knowledge_Foundation

There are a few places where I walked away from the microphone (I don’t really like static mikes, and I don’t think we had a pointer). I think the audience could hear but not the recorder.

Thanks to a wonderful (anonymous) volunteer there is a transcript, which I have edited, at:

http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=2506

My practical message was that we could reclaim our scholarship. That’s true, though I now prefer the phrase “Open Scholarship”. An Open Bibliography is practicable. It’s even more practicable than I thought in July. A number of technologies and people have come together and we have already prototyped parts of it. I’ll post on this later.

But, again, I have paid my homage to Berkeley. These images and sounds refresh and reinvigorate.


Panton Discussion 3: Richard Grant from The Scientist/Faculty_of_1000 and Open data

The Third Panton Discussion took place today at 2010-09-13:15:00 in the Panton Arms when Richard Grant of The Scientist/Faculty_of_1000 visited Cambridge.

 

 

I was the only OKF member able to be present, but this worked very well because it was really me-as-scientist that Richard wanted to talk to. As you can see the discussion was doubly recorded – Richard had a video camera and Adam Thorn had an audio recorder (near my hand).

After an introduction from Richard about F1000, etc. the conversation concentrated on Open Data – why? What are the difficulties? What are the incentives? We have got this all recorded and the next phase is to get it cut into chunks and post it, with intervening commentary.

Some of you are probably asking “where are the first two discussions?” It’s taking time, and Brian and I have been preoccupied, but we are going to chop them into snippets and post them. We are still keen for volunteers to transcribe, but we are not relying on this. So we’ll probably go a bit easy on the schedule until we’ve got into the swing of doing this.


#okfn Proposal for Open Theses and workshop at EURODOC 2011

One of the several functions of the Open Knowledge Foundation (http://www.okfn.org) is to support bottom-up projects in making knowledge Open and usable. Recently we have proposed an OKF project in Open Theses, and Daniel Mietchen, an OKF volunteer from Jena, DE, has proposed an exciting, valuable, and realistic activity. He’s suggested a workshop on Open Theses next April at the annual conference of EURODOC http://eurodoc.net/ , to take place in Vilnius from March 31 till April 4, 2011 [1]. Note that Daniel has considerable experience in this area, having been involved in Euro projects and meetings about graduates.

Open Theses aims to create a bibliography of Theses across the world, based largely on the technology and practice developed in the #jiscopenbib project (see blog: http://openbiblio.net/). The project will address bibliographic metadata, which includes:

  • author, title and other normal bibliographic material;
  • thesis-specific material (degree, institution, etc.);
  • Open-specific metadata (e.g. what rights does the thesis carry – explicitly or implicitly)
  • packaging/containment of supplementary material
  • format of components.
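To make the list above concrete, here is a minimal sketch of what one such record might look like as a Python data structure. The class name, field names and the `year` field are assumptions made for illustration; they are not the #jiscopenbib schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ThesisRecord:
    """Hypothetical record for one thesis; all field names are illustrative."""

    # Normal bibliographic material
    author: str
    title: str
    year: Optional[int] = None  # assumed field, standing in for "other" material
    # Thesis-specific material
    degree: Optional[str] = None
    institution: Optional[str] = None
    # Open-specific metadata: None means the rights are not stated
    rights: Optional[str] = None
    # Packaging of supplementary material, and formats of the components
    supplementary: List[str] = field(default_factory=list)
    formats: List[str] = field(default_factory=list)

    def rights_specified(self) -> bool:
        """True only when the record carries an explicit, non-empty rights statement."""
        return bool(self.rights and self.rights.strip())
```

Keeping `rights` optional is deliberate: a record with no stated rights is still worth collecting, and counting such records is one way to show how rarely theses are labelled.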

The major problems in doing this at present are that:

  • there are often no comprehensive national or international bibliographies of theses
  • where there are they are often commercial
  • even when they are not the rights are not specified.

Open Theses will address this by (a) engaging with national and institutional bodies and (b) crowdsourcing, probably through graduates/graduands. The value of the OKF is that it naturally crosses national boundaries.

Open Theses has two roles/motivations for the “Open” concept:

  1. can the metadata be made Open? Bibliographic metadata must be available for re-use, editing, republication, etc. without restriction.
  2. is the content Open? Most students and most institutions don’t label their theses explicitly so we expect relatively few cases initially. We hope that this will highlight the question and alert graduate offices and archivers to its importance.

For the proposed workshop we’d like to introduce elements of (a) making something happen and (b) fun. The theme, therefore, is provisionally:

“which European country or institution has the highest proportion of Open Theses?”


This allows all countries to compete. At the end of the workshop we should have a collection of Open Theses (i.e. Open metadata, some of which points to Open content)

Please let us know if you like this idea, wish to participate, have (bibliographic) material to contribute, etc.

 

[1] The current drafts of the program and the first circular are at
https://docs.google.com/fileview?id=11_Rg1csvo0sONiNlH2FaOY4p2n5glz2Ta5pi-RfM2_l0d1AM6dtmTgiugle7&hl=en&authkey=CN66hc4I
and
https://docs.google.com/fileview?id=1t35Nk8P_DyagcaktQSTv0QlnvoLlw7PAgkUlI31Q5pu_GNiDng9upgEFwtpr&hl=en&authkey=CMLa6a8J

Three parallel workshops are foreseen so far: Integrity, Supervision and Gender.

[2] For a workshop on Open Science that Daniel gave this year, see http://eurodoc2010.doktorat.at/category/science-20/ .


#jiscopenbib; A vision of Open Bibliography

One of the exciting things about Opentech UK is meeting people who can contribute to your own work from directions that you hadn’t thought of. So I met up with James Hetherington from AMEE (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=2601 ), a group that provides an open source carbon calculator. We got talking and the subject came to bibliography and our ideas of helping the clarity of climate research through an Open Bibliography approach.

He then showed me an impressive report from the Netherlands

http://www.pbl.nl/en/publications/2010/Assessing-an-IPCC-assessment.-An-analysis-of-statements-on-projected-regional-impacts-in-the-2007-report.html#

The authors had reviewed in detail the statements made in, or reported about, the IPCC report. This was a very careful analysis and categorised a large number of error types. I should emphasize that the conclusion was that the errors in the report or in the reporting of the report were not substantive. Nonetheless they showed that there were bibliographic errors in the report and categorised their types.

Two examples of the types of bibliographic error were:

  • Bibliographic entry was incorrect (e.g. typographical errors)
  • Referenced material did not exist (including disappearing web pages)

This is exactly the sort of thing that our Open Bibliography project #jiscopenbib has been set up to address. It encourages us to offer a bibliographic framework that might be used for validating the bibliography of climate research. It also got me thinking about how unit test methodology might be used for bibliography.

Imagine that the world’s scholarship was referenced by an open-bibliography which was available pervasively. Every time somebody referenced an article or book or document or report that had an entry in this bibliography the system could immediately check whether the reference was correct. It could also follow the link to an online resource (and we expect that an increasing number of such links will be online) and check whether it existed. Any failings could be reported in the same way as software unit tests. The check could be run as frequently as desired, and certainly daily. For example a student writing a thesis could check their bibliography at frequent intervals.
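The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bibliography is a hypothetical in-memory dictionary, the identifiers and URLs are invented, and the link check is passed in as a function so that the example runs without touching the network.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical open bibliography, keyed by an invented identifier.
BIBLIOGRAPHY: Dict[str, dict] = {
    "ref-1": {"title": "An Example Article", "year": 2007, "url": "http://example.org/a"},
    "ref-2": {"title": "An Example Report", "year": 2009, "url": "http://example.org/gone"},
}


def check_reference(
    ref: dict,
    bibliography: Dict[str, dict],
    link_exists: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Return failure messages for one cited reference; an empty list means it passed."""
    entry = bibliography.get(ref["id"])
    if entry is None:
        return [f"{ref['id']}: not found in bibliography"]
    failures = []
    # Compare the cited fields with the bibliography entry.
    for key in ("title", "year"):
        if ref.get(key) != entry.get(key):
            failures.append(
                f"{ref['id']}: {key} is {ref.get(key)!r}, expected {entry.get(key)!r}"
            )
    # Optionally check that the online resource still exists.
    url = entry.get("url")
    if url and link_exists is not None and not link_exists(url):
        failures.append(f"{ref['id']}: link {url} does not resolve")
    return failures


def check_all(refs, bibliography, link_exists=None) -> List[str]:
    """Check every reference and collect the failures, like a test report."""
    failures: List[str] = []
    for ref in refs:
        failures.extend(check_reference(ref, bibliography, link_exists))
    return failures
```

A thesis writer could run `check_all` over their reference list every day. A real `link_exists` would make an HTTP request; a stub that pretends certain URLs have vanished is enough to try the idea out.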

There is no technical reason why this could not be put together very quickly, at least for electronic journals which are online. A simple collection of all the tables of contents of all journals constitutes such a bibliography. This is straightforward to do from the online journals themselves and as far as we know breaks no intellectual property rights or contracts. Several publishers have already expressed support for the idea, which is to their own benefit since it advertises their material.

So part of what we will be building are the tools which allow us to do this.

If you’d like to help – whether you are an author, reader, publisher, funder or librarian – just let us know.


@opentechuk The Digital Enlightenment is Now

I attended this year’s OpentechUK on Saturday and have been overwhelmed by the sense of the occasion. This was brought home to me by one of the opening presentations, which showed how what we are doing here and now is critical to our future.

Phil Booth from NO2ID gave one of the most compelling, clear and incisive presentations that I have heard recently. His message was simple. The grounds of the digital information age are being decided now. If we do not get them right we will live with the consequences for decades or even longer. He uses the phrase “database state” to emphasize that states now collect huge amounts of information on people and organizations and use these in ways that we do not even know about. I get the impression that much of this is simply embedded in the culture of control rather than any overt malicious intent. But this country has shown how spectacularly incompetent it is to manage the privacy and appropriate use of information and it is clear that we cannot assume by default that the state is responsible and that we can delegate our interests to it.

For that reason everything that we’re doing in many different organizations and institutions to liberate information is critical to our future. That’s why I am actively supporting the Open Knowledge Foundation because it is one of several places where the Digital Enlightenment is being worked out.

It may seem arrogant to compare ourselves with the pioneers of the Enlightenment but we know that we cannot judge our part in history. Phil gave a number of simple but clear principles about how we should act. I wasn’t able to recall them all (they may be available on Twitter) but here are some general conclusions I take away:

  • Never stop believing in and acting for what you consider to be right.
  • Make your message clear and simple.
  • Be pragmatic but do not compromise your principles.
  • Do not give up.
  • Never assume that you have “won”. The struggle is continuous.

We are fortunate that in Britain we have a long history of the assertion of rights and that this is embedded in our (unwritten) constitution. We are allowed to protest, to campaign, and to say (almost everything) that we like. So many countries are not so fortunate and it is important that we show that the struggle matters.

Unfortunately these constitutional principles are not embedded in day to day government. We cannot assume that legislators and administrators will defend the rights by default. We have to identify the problems and campaign to get the right solution. Here for example is the innocuous sounding clause 152:

http://www.guardian.co.uk/commentisfree/libertycentral/2009/feb/28/convention-modern-liberty-information

which if it had been implemented would have seriously eroded our rights without anyone knowing. So, by default, the electronic era encourages the decay of previously hard-fought rights.

If we add up all problems of this sort, they appear overwhelming. It is because we are distributed and because we communicate that we can be effective. At Opentech there were many individuals and groups pursuing parts of the Digital Enlightenment. Nobody can do everything, and indeed all of us can only act in very small areas.

But it is the sum of all of those areas that makes it all work. My own area is Open Scholarship. It’s probably one of the more tractable areas. You will probably find that most of the people involved would agree that scholarship should be open and the issue appears to be primarily one of money rather than fundamental principles. On the assumption that over the next years we can achieve a recognition of Open Scholarship and start moving towards it on a broad front, we will not only achieve enlightenment in that specific area, but we will also encourage others whose battles are more challenging.

So my personal commitment is to work constructively towards a limited goal which is important and achievable. Opentech (and likewise OKCon) are great opportunities to reinforce our strengths, to find new allies, and to improve our own personal efficiency.


Panton Discussion #2 : David Dobbs

 

The second Panton Discussion was held yesterday on 2010-09-10:12:00 with David Dobbs. David (“writing on science, medicine, culture”, http://daviddobbs.net/ ) is a neuroscientist – though now a journalist – and here’s a recent article http://www.theatlantic.com/magazine/archive/2009/12/the-science-of-success/7761/ with some touching reader feedback. David is writing a piece for Wired (UK) due to come out in January and wanted some background on open Science and also on Mendeley. This gave us a chance to explore some of the issues relating to the philosophy and practice of Open Science.

The discussion was more fluid than PantonDiscussion#1 and I am not sure it’s necessarily useful to transcribe it. Brian Brooks is chopping the MP3 into chunks of a few megabytes each and – assuming there are about 20 – I’ll try to write a small commentary for the most interesting ones.

I’ll try to get this out in the next 24-36 hours. I don’t plan to do any editing, so please let us know what you think.

The next Panton Discussion is with Richard Grant (The Scientist / Faculty of 1000) on Monday 2010-09-13.

 

Clockwise from left: David Dobbs, Rufus Pollock, Peter Murray-Rust, Jordan Hatcher

 

The obligatory Panton shot…


#solo10: Green Chain Reaction – cleaning the data and next steps

Scraped/typed into Arcturus

Reactions to the Reaction.

I have now normalized the chemical names of the solvents in the Green Chain Reaction and will be posting results for each of the years. There are some exciting and important points:

  • I believe that everything we produce can be distributed under an OKD-compliant licence – we shan’t distribute actual patents.
  • The names have been normalized and sanitized in a two-step process. This is because we have two world-class chemistry resources which are Open – Wikipedia and Pubchem (http://pubchem.ncbi.nlm.nih.gov/). Pubchem was initially under great threat from vested interests and some of us fought publicly for its future – as a result it’s a thriving collection of essentially all known chemical compounds. By contrast Wikipedia has compounds “of interest” but the information is very detailed and usually very clean. This means that we can tell whether a term is a chemical or not and what its properties are.
  • This means that the process can be completely automatic. The results below are the solvents mentioned in a subset of patents published in 2000. You can see there are effectively no false positives. (Some solids will be suffixed with “solution”, as in “urea solution”.) You’ll also see the linguistic variants that have been used by the authors – for example the first solvent, dichloromethane, is often referred to by its chemical formula.

It’s clear that we now have a useful resource “Open Solvents” and I will be creating tables for all of the years. However it is now a good time for us to think about collecting more data if we are going to answer the main question of the Green Chain Reaction – is solvent use becoming greener? For that we will need the amounts of solvent used as well.


<compounds>
<compound pubchemID="6344" wikipediaUrl="CH2Cl2" count="115"><name count="62">CH2Cl2</name><name count="29">methylene chloride</name><name count="24">dichloromethane</name></compound>
<compound pubchemID="887" wikipediaUrl="methanol" count="44"><name count="36">methanol</name><name count="8">MeOH</name></compound>
<compound pubchemID="962" wikipediaUrl="H2O" count="36"><name count="6">H2O</name><name count="28">water</name><name count="2">hydrates</name></compound>
<compound pubchemID="702" wikipediaUrl="ethanol" count="33"><name count="29">ethanol</name><name count="4">EtOH</name></compound>
<compound pubchemID="180" wikipediaUrl="acetone" count="19"><name count="19">acetone</name></compound>
<compound pubchemID="679" wikipediaUrl="dimethyl_sulfoxide" count="19"><name count="12">dimethyl sulfoxide</name><name count="7">DMSO</name></compound>
<compound pubchemID="176" wikipediaUrl="acetic_acid" count="14"><name count="14">acetic acid</name></compound>
<compound pubchemID="1049" wikipediaUrl="pyridine" count="14"><name count="14">pyridine</name></compound>
<compound pubchemID="3283" wikipediaUrl="diethyl_ether" count="7"><name count="6">diethyl ether</name><name count="1">Et2O</name></compound>
<compound pubchemID="8058" wikipediaUrl="hexane" count="7"><name count="7">hexane</name></compound>
<compound pubchemID="6212" wikipediaUrl="chloroform" count="7"><name count="7">chloroform</name></compound>
<compound pubchemID="8174" wikipediaUrl="1-decanol" count="6"><name count="6">1-decanol</name></compound>
<compound pubchemID="6342" wikipediaUrl="acetonitrile" count="5"><name count="5">acetonitrile</name></compound>
<compound pubchemID="6328" wikipediaUrl="methyl_iodide" count="5"><name count="4">methyl iodide</name><name count="1">iodomethane</name></compound>
<compound pubchemID="24458" wikipediaUrl="FeCl2" count="4"><name count="4">FeCl2</name></compound>
<compound pubchemID="6134" wikipediaUrl="lactose" count="3"><name count="3">lactose</name></compound>
<compound pubchemID="6342" wikipediaUrl="MeCN" count="3"><name count="3">MeCN</name></compound>
<compound pubchemID="8761" wikipediaUrl="bicine" count="2"><name count="2">bicine</name></compound>
<compound pubchemID="7964" wikipediaUrl="chlorobenzene" count="2"><name count="2">chlorobenzene</name></compound>
<compound pubchemID="944" wikipediaUrl="HNO3" count="2"><name count="2">HNO3</name></compound>
<compound pubchemID="6569" wikipediaUrl="methylethyl_ketone" count="2"><name count="1">methylethyl ketone</name><name count="1">ethyl methyl ketone</name></compound>
<compound pubchemID="280" wikipediaUrl="carbon_dioxide" count="2"><name count="2">carbon dioxide</name></compound>
<compound pubchemID="3776" wikipediaUrl="isopropanol" count="2"><name count="2">isopropanol</name></compound>
<compound pubchemID="957" wikipediaUrl="1-octanol" count="1"><name count="1">1-octanol</name></compound>
<compound pubchemID="1176" wikipediaUrl="urea" count="1"><name count="1">urea</name></compound>
<compound pubchemID="313" wikipediaUrl="hydrochloric_acid" count="1"><name count="1">hydrochloric acid</name></compound>
<compound pubchemID="14798" wikipediaUrl="sodium_hydroxide" count="1"><name count="1">sodium hydroxide</name></compound>
<compound pubchemID="24854" wikipediaUrl="CaCl2" count="1"><name count="1">CaCl2</name></compound>
<compound pubchemID="516892" wikipediaUrl="sodium_bicarbonate" count="1"><name count="1">sodium bicarbonate</name></compound>
<compound pubchemID="222" wikipediaUrl="ammonia" count="1"><name count="1">ammonia</name></compound>
<compound pubchemID="284" wikipediaUrl="formic_acid" count="1"><name count="1">formic acid</name></compound>
<compound pubchemID="5943" wikipediaUrl="carbon_tetrachloride" count="1"><name count="1">carbon tetrachloride</name></compound>
<compound pubchemID="176" wikipediaUrl="AcOH" count="1"><name count="1">AcOH</name></compound>
<compound pubchemID="679" wikipediaUrl="dimethylsulfoxide" count="1"><name count="1">dimethylsulfoxide</name></compound>
<compound pubchemID="516892" wikipediaUrl="sodium_hydrogen_carbonate" count="1"><name count="1">sodium hydrogen carbonate</name></compound>
<compound pubchemID="6456" wikipediaUrl="trityl_chloride" count="1"><name count="1">trityl chloride</name></compound>
<compound pubchemID="3496" wikipediaUrl="glyphosate" count="1"><name count="1">glyphosate</name></compound>
<compound pubchemID="24480" wikipediaUrl="MnCl2" count="1"><name count="1">MnCl2</name></compound>
</compounds>
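A few Pubchem IDs appear twice in the listing above (6342, 176, 679 and 516892), because variant names were matched to different Wikipedia pages. One possible further cleaning step, sketched here in Python, is to merge entries by Pubchem ID. This is not part of the pipeline that produced the listing. The sample rows are copied from it with plain ASCII quotes so that the XML parses; the function name is invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Three rows from the listing, with straight quotes so the XML is well-formed.
SAMPLE = """<compounds>
<compound pubchemID="6342" wikipediaUrl="acetonitrile" count="5"><name count="5">acetonitrile</name></compound>
<compound pubchemID="6342" wikipediaUrl="MeCN" count="3"><name count="3">MeCN</name></compound>
<compound pubchemID="887" wikipediaUrl="methanol" count="44"><name count="36">methanol</name><name count="8">MeOH</name></compound>
</compounds>"""


def totals_by_pubchem_id(xml_text):
    """Sum the <name> counts of every <compound> that shares a pubchemID."""
    totals = Counter()
    for compound in ET.fromstring(xml_text).iter("compound"):
        totals[compound.get("pubchemID")] += sum(
            int(name.get("count")) for name in compound.findall("name")
        )
    return totals
```

On the sample, the two acetonitrile entries merge into a single total of 8 under ID 6342, while methanol stays at 44.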


Panton Discussion 2: David Dobbs

The first Panton Discussion with Richard Poynder was a great success. At science online (#solo10) we made some informal contacts and as a result we shall have some more discussions in the near future. There is no fixed format – they are essentially un-discussions in that it depends who bumps into whom and who will be around the Panton Arms. We take a simple recording kit and have a room to ourselves for about 2 hours.

The topic is fluid but should centre on Open Data. It could be a guest asking questions of the OKF, or the OKF asking questions of a person or organization where Open data is an important topic (they don’t necessarily have to be early adopters).

Tomorrow 2010-09-09:12:00 we’ll be meeting with David Dobbs (“Contributor, Atlantic Monthly, New York Times Magazine, Scientific American, Slate, National Geographic, Wired, and other publications”).

As to agenda, I’m mainly hoping to listen and hear from you people about the path and obstacles to more open science publication routes, much along the lines discussed at Science Online, and about what to you seem the best efforts underway to jump (or dismantle from below) the biggest obstacles. 

 

[…]

 

Finally, a key question seems to be whether and how to replace the main functions of current model of peer review — that is, establishment of integrity of method at minimum, a la PLOS, and perhaps a minimal integrity of conclusions from findings at a broader scale. Which parts are most vital to preserve or replace, and which aspects if any are unnecessary?

 

This will again be useful for us to organize our thoughts. We’ll try not to cover the ground we went over with Richard.

Anyone is welcome to drop in. Assuming the series is successful we’ll be appealing for transcribers.

The ground rules are that the discussions are informal, destined for release under CC-BY and so re-usable in articles, editorials, etc without permission but with acknowledgement.

We have 1 definite engagement next week, another promised soon and another likely.
