berlin5 : how to progress Open Data?

I’m putting together some ideas for my talk tomorrow – probably about 25-30 minutes. It’s sometimes useful to set them out in the blog beforehand so I can refer to it as well as the slides.
The audience is roughly:

  • funders
  • librarians
  • publishers (reader-pays and author-pays)
  • governmental and non-governmental agencies
  • researchers (like me)

For background I’m making some broad-brush assumptions (and would welcome challenges):

  • green access does not suit a lot of people and there is an increasing movement to insist on gold
  • awareness of the importance of Data is increasing but it is still a poor relation to “full-text”
  • although there is quite a lot of activity in institutional digital repositories, they won’t (and shouldn’t) address Data. Data is subject-specific and too complex for the average repository manager.
  • eTheses have an increasing importance
  • BBB (BOAI) is a useful political and philosophic utterance but it isn’t a licence. Licences are critical.
  • funders increasingly understand the issues and recognise that they are the most important agents of change

For the last 10 years we have stood still – the eJournal “revolution” has been stultifyingly stagnant. No new ideas for managing information, no new tools for innovation by authors and readers. And the open access publishers have concentrated so hard on the business model that they have simply mimicked the commercial offerings. If the rest of the world had been as bad as this we wouldn’t have Google, Flickr, etc. Their mantras: “Take risks and apologize later” and “Just Do it”.
I’ll try not to concentrate on what is broken – at least not in detail. You can re-read this blog. What’s broken is:

  • publishers oppose change
  • licences are non-existent or awful. Publishers do not clarify them, do not reply, create FUD.
  • librarians have been cowed into submission
  • the god of copyright is worshipped to paralysis
  • young people are frightened of experimenting; old people are largely dismissive and antagonistic.

So my suggestions for positive action on Data:

  • all funders should include statements requiring Open Data.
  • subject repositories should be set up
  • CC licences – or similar – should be required to define the actual practice
  • eThesis deposition must be mandatory
  • Open tools will be required and must be funded
  • we must create effective advocacy for all parties: funders, provosts, researchers, repository managers

Advocacy will include self-sustaining demonstrators showing:

  1. short-term archival (“I can get her thesis data”),
  2. re-use (“I can mash his data with mine”)
  3. exposure (“they cited my paper because the robot found my data”)
  4. communities (“I found these other people in this field”)
  5. semantics (“I never thought of looking at the data in that way”)
  6. human value (“we can tackle this global problem with this data”)

NOTE: I am publishing this before the presentation so that (a) I can link to it and (b) anyone who wishes can suggest modifications.


2 Responses to berlin5 : how to progress Open Data?

  1. Completely in agreement on what’s broken. Would add that the reward structure in academia straitly discourages experimentation, and the finger of blame there is pointed at university administration, whom you did not mention.
    Disagree somewhat that IRs and their managers shouldn’t address data, though I agree that for now it’s impractical because the software is so wretched and the technical infrastructure insufficiently scalable. Just because IR software in its current state is completely broken with regard to data doesn’t mean it must or should stay that way, though. Moreover, the notion that “domain knowledge” is the sole key to data curation is (bluntly) bunk, and nobody’s yet tested the assertion that it’s harder to teach a librarian domain knowledge than to teach a discipline-practitioner info management.
    Frankly, “it differs by discipline” doesn’t matter. So does everything else in librarianship, from reference transactions to collection development. We cope. It’s our job to. As for “too complex,” says who? And about which librarians? I think I’ve just been insulted.
    There’s nothing wrong with telling librarians — and the subset of librarians who are repository managers — that we need to brush up our game to deal with these issues. I have a plan in place to learn the principles of data curation for myself over the next year or so. I want to see more librarians planning the same!
    Looks like a good talk. Wish I could be there to hear it!

  2. pm286 says:

    (1) Many thanks Dorothea. I shall reply to this in full after my talk.
    By “audience” I meant the people in the actual room (you would love it – it perfuses history). There are no university administrators. I wasn’t apportioning blame, though I now shall. Yes, University provosts are the guilty parties. Librarians gave up the torch many years ago – they should have been alerting us to problems, now they are muzzled and frightened.
    IRs I shall address later…
    I am sorry if I have insulted you. I offer apologies, which is all I can do. I don’t mean this blog to generate flame wars.
    Yes, librarians should be able to deal with this, but they should be sitting next to the fume hoods, the accelerators, the rat mazes, the atmospheric monitoring stations, the hospitals, and they must be able to earn the respect of the scientists. That is a very hard goal, but it’s essential. And I don’t see it happening.
