Defending the Public Domain: Open Bibliography


A comment on my latest post on Open Bibliography deserves a full reply.

Marius Kempe says:

January 8, 2011 at 7:42 pm

This is very heartening to read. Might I ask for one clarification, which I think you’ve addressed elsewhere but I can’t find: is it actually possible to copyright bibliographic data? Are they not un-copyrightable facts, automatically in the public domain? Or is that in fact true, but we need this effort anyway to combat publisher FUD?

The reason why so much of my effort goes into creating Open tools is exactly this – a mixture of unclarity and default or deliberate FUD.

I believe that many things “are” in the public domain but that many other people think they are not. The problem is that there is generally no simple way of determining the answer. The issue arises mainly from the automatic nature of copyright. If I create a work then the copyright automatically attaches to me. This blog is my copyright. Even if I do nothing it’s my copyright. I don’t have to register it, I don’t have to defend it. Until seventy years after my death (in the UK; it varies between jurisdictions) it’s my copyright or my estate’s. Even little bits of it are copyright. If I create a song called “Defending the Public Domain” then that phrase is copyright.

Copyright is generally a civil matter though again this varies between jurisdictions. That means that a breach is not prosecuted by the state, but by an aggrieved individual or organization. If someone violates my copyright then my recourse is to the law. The ultimate decision is in courts with highly paid lawyers – there is not normally a copyright tribunal or office which gives objective judgments.

The copyright symbol does not determine whether or not something is copyright, but it’s a very powerful indication that the person adding the symbol believes they own or control the copyright. Since copyright is a matter of law, violating copyright can be seen as violating law. Most people – like me – believe in the power of the law and have an aversion to breaking it. Therefore if someone claims copyright ownership of something most people will accept that unless they have a direct commercial interest and have the financial and legal muscle to fight it.

Note that if something is “in the public domain” then no-one owns it. Therefore there is no-one to fight for it if someone else claims it is their copyright. It requires a defender of the public domain and this is not easy to get support for. To some extent the EFF and FSF do this for code, but no one does it for bibliography.

So here’s a typical problem. I quote from Wikipedia:

The Dewey Decimal Classification (DDC, also called the Dewey Decimal System) is a proprietary system of library classification developed by Melvil Dewey in 1876; it has been greatly modified and expanded through 22 major revisions, the most recent in 2003.[1]

Administration and publication

While he lived, Melvil Dewey edited each edition himself: he was followed by other editors who had been very much influenced by him. The earlier editions were printed in the peculiar spelling that Dewey had devised: the number of volumes in each edition increased to two, then three and now four.

The Online Computer Library Center of Dublin, Ohio, United States, acquired the trademark and copyrights associated with the DDC when it bought Forest Press in 1988. OCLC maintains the classification system and publishes new editions of the system. The editorial staff responsible for updates is based partly at the Library of Congress and partly at OCLC. Their work is reviewed by the Decimal Classification Editorial Policy Committee (EPC), which is a ten-member international board that meets twice each year. The four-volume unabridged edition is published approximately every seven years, the most recent edition (DDC 22) in mid 2003.[4] The web edition is updated on an ongoing basis, with changes announced each month.[5]

The work of assigning a DDC number to each newly published book is performed by a division of the Library of Congress, whose recommended assignments are either accepted or rejected by the OCLC after review by an advisory board; to date all have been accepted.

In September 2003, the OCLC sued the Library Hotel for trademark infringement. The settlement was that the OCLC would allow the Library Hotel to use the system in its hotel and marketing. In exchange, the Hotel would acknowledge the Center’s ownership of the trademark and make a donation to a nonprofit organization promoting reading and literacy among children.

Melville Louis Kossuth (Melvil) Dewey (December 10, 1851 – December 26, 1931) was an American librarian and educator, inventor of the Dewey Decimal System of library classification, … Dewey copyrighted the system in 1876.

Here is my amateur analysis of the situation. If you take away one fact it should be that nothing is simple, and much is not algorithmic. Let’s assume that the work was created in the US and that Dewey is dead and has been since 1931. That’s 2010-1931 = 79 years dead. Here’s the US copyright law:

How long does a copyright last?
The term of copyright for a particular work depends on several factors, including whether it has been published, and, if so, the date of first publication. As a general rule, for works created after January 1, 1978, copyright protection lasts for the life of the author plus an additional 70 years. For an anonymous work, a pseudonymous work, or a work made for hire, the copyright endures for a term of 95 years from the year of its first publication or a term of 120 years from the year of its creation, whichever expires first. For works first published prior to 1978, the term will vary depending on several factors. To determine the length of copyright protection for a particular work, consult chapter 3 of the Copyright Act (title 17 of the United States Code). More information on the term of copyright can be found in Circular 15a, Duration of Copyright, and Circular 1, Copyright Basics.

So although Dewey copyrighted the system he’s now been dead for over 70 years so the original copyright has expired. The fact that someone bought the copyright doesn’t affect its duration. (BTW distinguish copyright from trademarks). So the original DDC is in the public domain.
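The arithmetic behind that conclusion can be sketched in a few lines of Python. This is only an illustration of my amateur reasoning, not a legal test: it assumes the general “life plus 70 years” rule applies, although the text quoted above says the term for works first published before 1978 depends on several factors. The function names are my own, invented for this sketch.

```python
# Assumption: the general "life of the author plus 70 years" rule applies.
# The quoted law says pre-1978 works vary, so this is a simplification.
LIFE_PLUS_YEARS = 70


def years_since_death(death_year: int, current_year: int) -> int:
    """Whole calendar years between the author's death and the current year."""
    return current_year - death_year


def life_plus_term_expired(death_year: int, current_year: int,
                           term_years: int = LIFE_PLUS_YEARS) -> bool:
    """True if more than `term_years` calendar years have passed since death."""
    return years_since_death(death_year, current_year) > term_years


# Dewey died in 1931; the analysis above uses 2010 as the current year.
print(years_since_death(1931, 2010))
print(life_plus_term_expired(1931, 2010))
```

With Dewey’s dates this gives 79 years and an expired term, which matches the figures above. Whether a court would accept the same reasoning is a separate question.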

Note that in the US a “work of the United States Government” is a work prepared by an officer or employee of the United States Government as part of that person’s official duties. And …

§ 105. Subject matter of copyright: United States Government works

Copyright protection under this title is not available for any work of the United States Government, but the United States Government is not precluded from receiving and holding copyrights transferred to it by assignment, bequest, or otherwise.

So, assuming the Library of Congress staff produced their DDC work as part of their official duties (and I’m guessing they did) then their work is in the Public Domain in the US (and by extension elsewhere).

So, at a first reading the DDC is not copyrighted. However my guess is that every new version is freshly copyrighted and that the copyright subsumes the public domain material so that the copyright will be extended indefinitely. You may believe this is a good idea, or you may feel that it is unjustifiable. If the latter, you’ll have to hire a US lawyer, show you have a case (e.g. that you have suffered financial loss) and spend a lot of time and money.

So my analysis is that it’s unclear whether DDC is copyright. In practice OCLC says that it holds the copyright and most people and organizations go along with that whether they want to or not.

It’s trivial to add copyright symbols to a document. I can write © Peter Murray-Rust on this document. Do I have the right?

  • Yes, I wrote it
  • Hang on, you didn’t – you pinched some of it from Wikipedia.
  • Well, yes – but it’s very tedious to acknowledge it. They’re not going to sue me
  • You’ve also pinched stuff from the US government
  • That’s OK it’s in the Public domain
  • I suppose you can do that – but only in the US


By this time everything has become subsumed under my blog.

It’s absolutely universal for content providers to spray copyright symbols on everything – marking their territory. If I ask anyone in academia whether I can re-use it without permission they are all so hexed-out by the magic symbol © that they will automatically say “no, you can’t use it without permission”. Many of them run in awe or terror of the large content providers, who occasionally sue people. Of course the music industry and the film industry are best known but it also happens in academia and scholarly publishing.

So, Marius, back to your question.

Is it possible to copyright bibliographic data?

Yes – just add your copyright symbol

Is that legal?

It’s not against the criminal law. Find out by hiring a lawyer.

So this is why we are identifying the problem. Pointing out to the community that there is a problem. That the problem costs us hundreds of millions of dollars a year. That as academia we have to start asserting our rights.

And the first step in asserting our rights is to define them.

Now I am hoping that libraries and their bosses will see that this is in their interests. And support Open Bibliography. And start asserting humanity’s right to it.



One Response to Defending the Public Domain: Open Bibliography

  1. Marius Kempe says:

    Thanks very much! 🙂 I didn’t notice this post until now because you didn’t tweet it. 😉
