Bibliographic Data is Open!


Bibliographic Data are the lifeblood of scholarship. They tell us how to find scholarly artefacts and to recognise them when we’ve found them. The journal names, the authors, the pages. They are as exciting as street names and house numbers.

Which are exciting. Maps are exciting and bibliography is the map of scholarship. It’s not the complete map – but the skeleton. The framework to which other properties are added.

And the question I have been chasing for some months is whether they are Open… Can I make a list of bibliographic data and publish them Openly?

Most people I ask mumble. But two days ago Eefke Smit of the STM Publishers association rang me and we talked a lot about what was Open and what was not. The problem is that many things are not clear and are Open to interpretation. And so it comes down to “it all depends on”.

Which sometimes it does. The problem is that software can’t make that sort of judgment. It works on Boolean Logic – you can, or you can’t. We didn’t resolve all the questions, but Eefke got back very rapidly and here’s her reply.

P: Thank you very much for a full reply. This is very helpful.

I am copying this in to the Open Bibliography list. [PMR] For their background I have been exploring with Eefke and the STM Publishers association whether text-mining was allowable and whether bibliographic data is copyrightable. Eefke gives a clear answer to the second so I am posting this on this list. I think it now makes possible a lot of very valuable things with Open Bibliographic Data.

On Fri, Feb 4, 2011 at 2:34 PM, Eefke Smit <> wrote:

As promised, I would sort out your question about the openness of bibliographies. You made quite clear in our conversation that you are not particularly fond of ‘it depends’ answers. So I fear you may find the following answer slightly disappointing, because for bibliographies too the answer to the question of how open they are depends on what you regard to be elements of a bibliography.


We have addressed this in “Principles of Open Bibliographic Data”


To start with the simplest elements that are indeed open and considered ‘facts’ hence copyright free: article title; authors of article; journal title; volume-issue information; and dates of receipt/publication. These are all considered to be facts and cannot be copyrighted.


We have essentially covered these in

Core data: names and identifiers of author(s) and editor(s), titles, publisher information, publication date and place, identification of parent work (e.g. a journal), page information, URIs.

I think this is entirely in line with you and your STM colleagues and this agreement is an extremely important step forward.
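To make the agreed core concrete, here is a minimal sketch of a record that holds only the core data fields listed above. The class name `CoreRecord`, the field names, and the sample values are my own illustrative choices, not part of the Principles or of any existing library.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class CoreRecord:
    """Hypothetical container for the 'core data' fields only."""
    title: str
    authors: List[str] = field(default_factory=list)
    editors: List[str] = field(default_factory=list)
    publisher: Optional[str] = None
    publication_date: Optional[str] = None
    publication_place: Optional[str] = None
    parent_work: Optional[str] = None  # e.g. the journal title
    pages: Optional[str] = None
    uris: List[str] = field(default_factory=list)


def to_json(record: CoreRecord) -> str:
    """Serialise a record to JSON with keys in a stable order."""
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values for illustration only; this is not a real article.
example = CoreRecord(
    title="An example article",
    authors=["A. Author"],
    parent_work="Journal of Examples",
    publication_date="2011-02-04",
    pages="1-10",
)
print(to_json(example))
```

The record deliberately has no slot for abstracts, images or classifications, so nothing from the “it depends” category can slip into a published list by accident.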


But nowadays people sometimes include much more into bibliographies, for example images, tables, abstracts, even chemical structures. Bibliographic data can include a number of different kinds of fields and information, including thesauri, classifications like chemistry structures, etc., so there can be some information that is copyrightable or systems that are tied into copyright or trademark protected content. 


Precisely what that is does indeed “depend”. Our list of secondary bibliographic data overlaps greatly with yours. I have highlighted the components that I believe are uncopyrightable.

Secondary data: format of work, non-web identifiers (ISBN, LCCN, OCLC number etc.), an indication of rights associated with a work, information on sponsorship (e.g. funding), information about carrier type, extent and size information, administrative data (last modified etc.), relevant links (to wikipedia, google books, amazon etc.), table of contents, links to digitized parts of a work (tables of content, registers, bibliographies etc.), addresses and other contact details about the author(s), cover images, abstracts, reviews, summaries, subject headings, assigned keywords, classification notation, user-generated tags, exemplar data (number of holdings, call number), …

This does not mean that the others are copyrightable by default, but we know of places where people have asserted rights over some of them.

I think you and I differ about whether tables and graphs are copyrightable in this context. I would concede that images which contain creative work are copyrightable but that images representing factual information (e.g. chemical structures) are not. For example it would be foolish to be unable to communicate a chemical structure to someone because you might break copyright. There are millions of such images on suppliers’ bottles and withholding this information means that people could and would die.

I also asked about whom I should contact within a publisher to get a definitive answer from that organization (as most of the time I get no reply).

On your question whom to contact for permissions as a reader, I would advise you to address the ‘rights and permissions departments’ or ‘licensing departments’ at the relevant publishing houses or else enquire via your local license holder (Cambridge library) who their contacts are. Very often these are regionally assigned, so a general list would be difficult to compose.

This seems to confirm that it can be quite difficult to reach the right person within a large publishing house and get an answer.

The STM members can be found on


Hope this information is of help to you,

Yes it is very useful.

Kindest regards, Eefke Smit.

So this is very useful. We agree on this. Bibliographic Data is FREE. As in Speech. Like OpenStreetMap we can start building the bibliographic map of the world.

