APE2008 thoughts on domain repositories

I’m sitting waiting for about 1 million files to transfer from one laptop to another – in the Computer Officer hideout where we have really strong coffee. I tend to twitch about such transfers – rather like a hermit crab – but I can spend the time blogging about APE2008 (see earlier posts APE2008 more thoughts and recursive links from that).
The final session on the last day was about money. I didn’t take notes (no battery left). My impressions were that some new journals can manage on considerably reduced costs – a few hundred dollars. Of course there isn’t a one-size-fits-all – it’s clear that when a journal rejects 90% of submissions its costs are somewhat higher than those of one with a high acceptance rate. However some publishers spend a lot of money on things that IMO don’t merit it. For example marketing – I remember a figure of 30% (not sure what of) but certainly many domains won’t need that. And tutorials for information products. Should we be needing tutorials on modern products? How many five-year-olds need teaching how to use Google? Or Facebook? If you want help, ask the family. So one message to “author-pays” models is “challenge the costs”. I’m going to stop using “author-pays” and substitute “organisation pays”. The organisation might be a university, a funder, a learned society, a national organisation (e.g. JISC), the publisher themselves in hardship cases, and so on. Few authors pay, and shouldn’t be expected to do so. This is clearly something that academia has to tackle for non-funded non-science subjects.
The next morning had a session:
Panel Discussion: What Matters? The Future Role of Libraries in Science and Society? Swallowed by OA Repositories, turned into University Presses or kept as Book Museums?
Here I have a problem. I appreciate that libraries have many roles and I’m a keen supporter. Guardianship of scholarship, preservation, access, etc. But this doesn’t come across in science. I see librarians because I’m working on information-rich projects but if I didn’t I wouldn’t. How many PhD chemistry students will come to the library? (We have a lovely library in our building, funded by Unilever, and students like working there because it’s quiet. But we wouldn’t build the same facility today. And Henry tells me that Imperial has closed its departmental library. They have a nice quiet work area – with terminals – but it’s not a library.) Librarians cannot make a new role out of being super-purchasing and contract officers for information – scientists neither see nor care. So I challenged the panel with this and similar points.
Science and technology move so fast that none of us can keep up. Subject librarians trained on the classical model cannot provide what scientists need. The bioscientists look to PubMed, EBI, PDB, etc as the repositories of knowledge – not to their institutions. What they need are information scientists embedded in their laboratories. People who know how to hack Perl, Python, Java, XML, RDF, RSS, etc. Where the flow of meta-information is from the scientist to the information scientists as well as the other way round. It’s a tall order. But the average 18-year-old does not look in a library for scientific information – they look to Google and Wikipedia (which is why I contribute when I can find time).
These views are reinforced by what the bioscientists and physicists are doing. They create domain repositories. They have large national or international organisations which are beneficent and wish to oversee the free movement of scientific information. With bio- it’s PubMed and PubChem, NCBI, PDB, EBI, etc. and with physics it’s arXiv and SCOAP3. These are domain repositories and that’s what we critically need.
I can see that certain primary research will naturally go to IRs – mandated fulltext, theses, etc. But many will see PubMed and SCOAP3 as the primary places, not their institution. Even where the material is in IRs we need domain metadata tools to extract it properly. (How do you look for a sequence in your IR? A chemical substructure? A spectrum? A partial differential equation?) The problem will be solved in big science. But in long-tail science we need global or national domain repositories and we need departmental repositories for the initial capture. If there are embedded information scientists then that is one of the first things they can be doing to help the community.
… still a few hundred thousand files to go (these are all part of our molecular repository effort). Why’s it on the laptop? Because it fits quite well on planes and trains…


3 Responses to APE2008 thoughts on domain repositories

  1. I substantially agree with this (and would love to become an embedded librarian!) — in fact, at MPOW it’s already working quite well in the social sciences.
    I point out, however, that there’s no technical reason IRs and domain Rs can’t replicate and exchange information among themselves — just social, organizational, and not-developed-yet reasons. OAI-ORE will help, if we implement it and we get the social structures and licenses in place. APIs now!

  2. pm286 says:

    (1) Well, we are off to eChemistry tomorrow to start one of the first projects involving ORE. I’m not sure whether we want to replicate except for robustness (LOCKSS-like). I prefer to have single copies of complex objects as otherwise there is a real chance of confusion (even without versions).

  3. Pingback: Science Library Pad
