Open Bibliographic Workshop at #OKCon2011

We’ve just run our workshop on Open Bibliography at OKCon (the Open Bibliographic Data Workshop, by Peter Murray-Rust, Mark McGillivray & Adrian Pohl).

Mark McGillivray has written a great account of the Open Bibliography project: what we have achieved, what the tools are, what the content is, etc. Mark has also kept an Etherpad of the workshop (I’ll post the URL later).

The breadth of interest at OKCon is enormous: the program covers science, government, technology, legal, arts, culture, etc. There are so many things going on that I haven’t been able to attend many of the star speakers.

Anyway… our workshop was 90 minutes (a good allocation) and well attended: 25+ people, all interested in making bibliography Open. There’s general agreement on what bibliographic (meta)data is and why it should be Open. We collected ideas on what the benefits will be and how to make it happen.

So we are pulling together a strategy for engaging with the various stakeholders and moving as quickly as possible to show that Open Bibliography is possible and valuable. Meanwhile, some snippets from Mark’s post:

Open bibliographic datasets




Cambridge University Library

This dataset consists of MARC 21 output in a single file, comprising around 180,000 records. More info…

get the data

British Library

The British National Bibliography contains about 3 million records – covering every book published in the UK since 1950. More info…

get the data
query the data

International Union of Crystallography

Crystallographic research journal publications metadata from Acta Cryst E. More info…

get the data
query the data
view the data

PubMed Central

The PMC Medline dataset contains about 19 million records, representing roughly 98% of PMC publications. More info…

get the data
view the data


Products demonstrating the value of Open Bibliography

OpenBiblio / Bibliographica

Bibliographica is an open catalogue of books with integrated bibliography tools that, for example, allow you to create your own collections and work with Wikipedia. Search our instance to find metadata about anything in the British National Bibliography. More information is available about the collections tool and the Wikipedia tool.

It is possible to create a living map of scholarship, and we show three examples built from our bibliographic datasets.

This is a geo-temporal bibliography from the full Medline dataset. Bibliographic records have been extracted by year and their geo-spatial co-ordinates located on a grid. The frequency of publications in each grid square is represented by vertical bars. (Note: only a proportion of the entries in the full dataset have been used, and readers should not draw serious conclusions from this prototype.) A demonstration screencast is available, and the full interactive resource is accessible with Firefox 4 or Google Chrome.
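
The binning step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the visualisation: the record fields (`year`, `lat`, `lon`), the 5-degree cell size and the sample records are all assumptions made for the example.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=5.0):
    """Return the (row, col) index of the grid square containing a point.

    Rows count up from the south pole, columns from longitude -180.
    The 5-degree default is an arbitrary choice for this sketch.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return row, col


def count_by_year_and_cell(records, cell_deg=5.0):
    """Count publications per (year, grid square)."""
    counts = Counter()
    for rec in records:
        cell = grid_cell(rec["lat"], rec["lon"], cell_deg)
        counts[(rec["year"], cell)] += 1
    return counts


# Hypothetical records: two near Cambridge, one near Berlin.
records = [
    {"year": 2009, "lat": 52.20, "lon": 0.12},
    {"year": 2009, "lat": 52.21, "lon": 0.09},
    {"year": 2010, "lat": 52.50, "lon": 13.40},
]

counts = count_by_year_and_cell(records)
for (year, cell), n in sorted(counts.items()):
    print(year, cell, n)
```

Each count would then become the height of one vertical bar for that year and grid square.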


This example shows a citation map of papers recursively referencing Wakefield’s paper on the adverse effects of MMR vaccination. A full analysis requires not just the act of citation but also its sentiment, and initial inspection shows that the papers immediately citing it had a negative sentiment, i.e. were critical of the paper. Wakefield’s paper was eventually withdrawn, but the other papers in the map still exist. It should be noted that recursive citation can often build a false sense of value for a distantly-cited object.
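
One way to keep direct and distant citations apart is to record how many citation steps separate each paper from the root. The sketch below does this with a breadth-first walk; the `cited_by` mapping and the paper identifiers are hypothetical, and this is not the code used to draw the map.

```python
from collections import deque


def citation_depths(cited_by, root, max_depth=None):
    """Walk 'cited by' links breadth-first from a root paper.

    Returns a dict mapping each reachable paper to its distance from
    the root (1 = cites the root directly, 2 = cites a citing paper...).
    """
    depths = {root: 0}
    queue = deque([root])
    while queue:
        paper = queue.popleft()
        if max_depth is not None and depths[paper] >= max_depth:
            continue  # do not expand beyond the requested depth
        for citing in cited_by.get(paper, []):
            if citing not in depths:  # skip papers already reached
                depths[citing] = depths[paper] + 1
                queue.append(citing)
    return depths


# Hypothetical citation data: key is cited by each paper in the list.
cited_by = {
    "root": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
    "D": ["E"],
}

all_depths = citation_depths(cited_by, "root")
direct_only = citation_depths(cited_by, "root", max_depth=1)
print(all_depths)
print(direct_only)
```

Papers at depth 1 are the ones whose sentiment towards the root can be inspected directly; anything deeper only cites it at second hand.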

This is a geo-temporal bibliographic map for crystallography. The IUCr’s Open Access articles are an excellent resource, as their bibliography is well-defined and the authors and affiliations well-identified. The records are plotted on an interactive map of the world, where a slider determines the current timeslice and each week’s publications are shown. Each publication is linked back to the original article. (A full interactive resource is also available.)
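
The weekly timeslices behind such a slider could be prepared roughly as follows. This is a minimal sketch assuming each record carries a `published` date; the field names and sample titles are invented for the example.

```python
from collections import defaultdict
from datetime import date


def weekly_slices(publications):
    """Group publications by ISO (year, week), in chronological order."""
    slices = defaultdict(list)
    for pub in publications:
        iso = pub["published"].isocalendar()  # (ISO year, ISO week, weekday)
        slices[(iso[0], iso[1])].append(pub)
    return dict(sorted(slices.items()))


# Hypothetical records; a real one would also carry co-ordinates and a link.
publications = [
    {"title": "Structure one", "published": date(2011, 1, 3)},
    {"title": "Structure two", "published": date(2011, 1, 5)},
    {"title": "Structure three", "published": date(2011, 1, 10)},
]

slices = weekly_slices(publications)
for week, pubs in slices.items():
    print(week, [p["title"] for p in pubs])
```

Moving the slider then amounts to picking one key and plotting the records stored under it.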

These visualisations show independent publications, but once the semantic facets of the data have been extracted it will be straightforward to aggregate by region and by date, and to create linkages between locations.


Benefits of Open Bibliography products

Anyone with a vested interest in research and publication can benefit from these open data and open software products – academic researchers from students through to professors, as well as academic administrators and software developers, are better served by having open access to the metadata that helps describe and map the environments in which they operate. The key reasons and use cases which motivate our commitment to open bibliography are:

  1. Access to Information. Open Bibliography empowers and encourages individuals and organisations of various sizes to contribute, edit, improve, link to and enhance the value of public domain bibliographic records.
  2. Error detection and correction. A community supporting the practice of Open Bibliography will rapidly add means of checking and validating the quality of open bibliographic data.
  3. Publication of small bibliographic datasets. It is common for individuals, departments and organisations to provide definitive lists of bibliographic records.
  4. Merging bibliographic collections. With open data, we can enable referencing and linking of records between collections.
  5. A bibliographic node in the Linked Open Data cloud. Communities can add their own linked and annotated bibliographic material to an open LOD cloud.
  6. Collaboration with other bibliographic organisations. These include reference manager and identifier systems such as Zotero, Mendeley and CrossRef, as well as academic libraries and library organisations.
  7. Mapping scholarly research and activity. Open Bibliography can provide definitive records against which publication assessments can be collated, and by which collaborations can be identified.
  8. An Open catalogue of Open scholarship. Since the bibliographic record for an article is Open, it can be annotated to show the Openness of the article itself; bibliographic data can thus be openly enhanced to show to what extent a paper is open and freely available.
  9. Cataloguing diverse materials related to bibliographic records. We see the opportunity to list databases, websites, review articles and other information which the community may find valuable, and to associate such lists with open bibliographic records.
  10. Use and development of machine learning methods for bibliographic data processing. Widespread availability of open bibliographic data in machine-readable formats should rapidly promote the use and development of machine-learning algorithms.
  11. Promotion of community information services. Widespread availability of open bibliographic web services will make it easier for those interested in promoting the development of scientific communities to develop and maintain subject-specific community information.

And more…
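
To make item 4 (merging bibliographic collections) a little more concrete, here is a minimal sketch of linking records from two collections. It assumes each record is a plain dictionary with `title`, `year` and `source` fields, and it matches on a crudely normalised title plus year; real record matching needs far more care, and all names and sample records here are hypothetical.

```python
import re


def record_key(record):
    """Build a crude matching key from a normalised title and the year."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return title, record["year"]


def merge_collections(*collections):
    """Merge record lists, combining records that share the same key.

    The first value seen for a field wins; every contributing source
    is remembered so the merged record links back to its origins.
    """
    merged = {}
    for collection in collections:
        for record in collection:
            entry = merged.setdefault(record_key(record), {"sources": []})
            for field, value in record.items():
                if field == "source":
                    entry["sources"].append(value)
                else:
                    entry.setdefault(field, value)
    return list(merged.values())


# Hypothetical collections describing the same book slightly differently.
library_a = [
    {"title": "On the Origin of Species", "year": 1859, "source": "library-a"},
]
library_b = [
    {"title": "On the origin of species.", "year": 1859,
     "shelfmark": "X.1", "source": "library-b"},
    {"title": "Another Book", "year": 2001, "source": "library-b"},
]

merged = merge_collections(library_a, library_b)
for entry in merged:
    print(entry["title"], entry["sources"])
```

Because the data is open, the merged record can keep pointers to both sources rather than one collection having to absorb the other.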
