I've spent much of last week at the Repository Fringe in Edinburgh (see http://www.repositoryfringe.org/ which has a really excellent "live blog" – almost verbatim; also see #rfringe11 for tweets). It was an interesting event with the normal complete spectrum from geeks hacking repo software and content, to those who are making policy, financing repositories and getting (or not getting) engagement.
The event was very well organized, held in the new Informatics Forum of the University of Edinburgh. Here's Eurovision_Nicola's photo (my hair is just visible (0.6, 0.35) above Mark Hahnel (FigShare)'s light blue left shoulder). This is the roof garden with Edinburgh's Central Mosque and Salisbury Crags (Arthur's Seat) in the background.
(In our group are also Mark MacGillivray (x=0.6), Chen (x=0.73) and Michael Fourman (x=0.77)).
Graham (McDawg and McBlawg) is a prominent and tireless campaigner for Open Access and Open Knowledge. He devotes his own time and money to the cause, and describes himself as
"Scottish International Man of Mystery - Open Science/Access/Data/Knowledge & Patient Advocate"
I have highlighted the "Patient Advocate" as that is what, in large part, has driven Graham to demand Open Knowledge. Graham was a co-founder of the CJD Alliance (http://www.cjdalliance.net/ ) and describes himself in "Patients Like Me" (http://www.patientslikeme.com/members/view/1644) :
Graham has several years experience of obtaining and sharing information between researchers and patients - and now Journals. The patient as always, remains at the forefront - always will.
Graham Steel (42) is a native of Glasgow, Scotland, and works as a property claims adjuster/recovery specialist. Graham's brother, Richard, was diagnosed with variant Creutzfeldt-Jakob Disease (vCJD) in April 1999 and died in November 1999 at the age of 33.
Graham joined the committee of the Human BSE Foundation on a voluntary basis in September 2001 and became a Trustee as of 2003 after the Foundation became a Charity. Since September 2001, he acted as Vice-Chair. One of his main initial and continued foci had been to develop and maintain the Foundation's website. Graham left this organisation in October 2005.
Over the last few years, Graham has devoted much time learning more of the background of TSE's and so called Prion disease, the current and emerging rationale of treatment issues/early diagnostic methodologies and maintaining/seeking contact with many researchers in several Continents. He has also devoted much time assisting in forging links between a number of CJD related support groups from around the world.
So does anyone, anywhere, defend the current system that denies Graham access to the world's published medical literature? If there is a single motivating example for my advocacy and action for Open Knowledge it is exemplified by Graham. (And we and others are continuing to develop positive ways of getting it – watch this blog later in the month/year).
So Graham went to Edinburgh to find out about the knowledge in Institutional Repositories and how it might be of benefit. Here are some excerpts. (http://www.science3point0.com/mcblawg/ )
The positive: On day one, one of the opening presentations was by Mo McRoberts (BBC Data Analyst) entitled BBC Digital Public Space project. From memory, this lasted for roughly 20 minutes. I [McB] caught up with Mo before he left the event later on and had a great chat with him. Ultra cool guy.
I agree. The singleness of purpose in the Beeb, liberating their content, contrasted starkly with the fuzziness of academia.
The notsopositive: My [McB] question was along the lines of "I've read a number of Enlighten tweets which have links to Manuscripts in the repository and all the ones I've looked at are not open access. I'm a bit confused by this and have been meaning to ask why?"
The general response was that (at least in UK terms) only about "10 – 15%" of the content of these IR's are Open Access. (WOW !! I tweeted).
Why the surprise?? Well, from everything that I've read and been told about IR's until that moment led me to believe that ALL of the content of IR's was OA. Nothing at all was indicative to the contrary.
… McRant …
[after] all of the OA Mandates, only 10 – 15% of researchers are self archiving their work into repositories. IR's at least in terms of OA content (the same cannot be said for non OA content that can be accessed by researchers who have on campus access) do not appear to be particularly effective.
Notice the "researchers who have on campus access". That's the key phrase. It's so easy for those in universities to forget that they have "free" access to all the published literature. (Yes I know that libraries are constantly chopping journals… that it isn't "free" (but academics think it is)). McBlawg and the CJD Alliance do not have access to the literature.
How much of the content in IR's is Open Access? And of what sort (green, gold or murky)? Quite simply:
I [PMR] haven't a clue what is in UK Institutional repositories.
And I suspect that no-one else has. If I ask for a list of Theses in UK repositories, people suggest that I write an OAI-PMH harvester. I personally can, but I would much rather that the question was already answered. If I ask how much of the content is CC-BY or CC0 licensed, no-one has the slightest idea (and without the licence you cannot re-use the content).
So, UK repositories, I am going to start asking you questions. They are simple to phrase and should be simple to answer. I proposed one as a "good idea" at the RepoFringe. It didn't win a prize, but it did get helpful comments on this blog (http://blogs.ch.cam.ac.uk/pmr/2011/08/04/linked-open-repositories-%E2%80%9Cwe-can-do-it-in-an-afternoon%E2%80%9D/ ). It was phrased in LinkedOpenData language (including the RDF dragon) so I will state it more simply here:
PLEASE GIVE ME A LIST OF ALL THE CONTENT IN ALL UK REPOSITORIES
I think that's a simple and responsible question. (Please do not tell me I can write software to recursively iterate over the UK repos using OAI-PMH). I want a list. For Graham and others. So they can look at it and see what is and what is not in the UK academic store of knowledge.
Here's my simple arithmetic.
ca 200 UK repositories.
Most have << 10,000 entries (I asked at the meeting).
Soton (one of the most active and also driven from the top) has ca 25,000 (as replied at the meeting)
Several have < 1000 entries (from the meeting)
Assuming ca 2000 entries for 200 repos (power law) we get ca 400,000 items.
[Jim Downing and I have personally put ca 200,000 data items into Cambridge repository. I'm discounting these]
Many repos have been running for 10 years. I'll take an average of 3 years for the 200. That gives 600 repo-years in the UK.
Let's assume that's 1000 person-years (somewhat under two people per repo-year) at a full economic cost of 100K GBP / year = 100 million GBP
Now, before you complain that you don't get paid anything like 100K GBP, that is about the average amount requested for a postdoc from a research council. It takes into account the computing infrastructure, mowing the lawns, holding meetings, etc. And on top of that there are those supported to develop software and develop projects. An auditor could reasonably claim that our group had >>200K GBP of JISC grant over the years just for repositories and that's independent of the actual university costs and support.
So my current arithmetic is:
100,000,000 GBP for 400,000 items
That's 250 GBP per item deposited.
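The arithmetic above can be checked in a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, and the inputs are the rough guesses quoted above, so the result is no better than those guesses.

```python
# Back-of-envelope estimates taken from the figures above (all approximate).
REPOSITORIES = 200               # ca. 200 UK repositories
ENTRIES_PER_REPO = 2_000         # assumed average entries per repository
PERSON_YEARS = 1_000             # assumed total staff effort
COST_PER_PERSON_YEAR = 100_000   # GBP, full economic cost


def cost_per_item(repos, entries_per_repo, person_years, cost_per_person_year):
    """Return (total items, total cost in GBP, cost per item in GBP)."""
    total_items = repos * entries_per_repo
    total_cost = person_years * cost_per_person_year
    return total_items, total_cost, total_cost / total_items


items, cost, per_item = cost_per_item(
    REPOSITORIES, ENTRIES_PER_REPO, PERSON_YEARS, COST_PER_PERSON_YEAR
)
print(f"{items:,} items, {cost:,} GBP, {per_item:.0f} GBP per item")
```

Changing any one input shows how sensitive the per-item figure is: doubling the assumed average entries per repository, for example, halves the cost per item.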
QUESTION 1: I need to know exactly on (let's say) August 31 2011 how many entries there are in UK institutional repositories
Assuming I get some answers then I will move on to the next questions, which are all, ultimately, leading up to knowing how much Open Knowledge we have and what use it is.