SpotOn2013: Yet another wonderful meeting

For the last two days I have had a wonderful, if exhausting, time at SpotOn (known in the past as Science Blogging and Science Online). Ross and I were organising a session called “What the hack?!” about running science hackdays. Sophie Kershaw (now Kay), Panton Fellow, ran a session on reproducible science using Lego, and Salvatore Mele (CERN, library) gave a sensational keynote on the Higgs Boson. But much of SpotOn is meeting people, and in particular I made very useful contacts and had discussions relating to The Content Mine (more later).

And I met many people with new ideas. There’s an ever increasing dissatisfaction with the status quo in formal scholarly publishing and communications. Universities are becoming metrics factories, driven by large corporates. New ideas get commercialised and used to reinforce this process. The privacy and autonomy of scientists are under serious threat (more later). So I was pleased to find people who want to challenge that. One in particular comes from the disadvantaged periphery of European science (Spain, Greece, and others) and is now effectively priced out of mainstream science. They can’t buy journals any more and they cannot pay for “Gold Access” (nor, for that matter, can I). (More later, when I get their manifesto.)

“What the hack?!” had 5-minute talks from Martin Fenner (also an organiser of the whole event) and Helen @DeckOfPandas Jackson. I’d met Helen at NHS Hack Day last weekend and immediately realised what she could bring. She gave a sparkling presentation, full of passion, good sense and experience, in a style where she wastes no time getting the message across. She was also invaluable in picking up questions and discussion. Unfortunately our session was only 40 mins, but I think everyone was able to contribute or be enlightened.

Ross and I had assumed that most people would be familiar with hacks. Actually only about 25% knew what one was and fewer had been to or organized one. So whereas we had planned to structure the discussion so all delegates could get a chance, in fact it was a free-form Q and A session. For one delegate hacks include physical hacks/making and I’d reinforce this. Hacks are enhanced by (say) cakes, knitting, and toys [Chuff came to our session although Chuff is not a “toy”].

And that moves into Sophie’s brilliant workshop session on “Making Research Useful: The Consequences of (Bad) Communication” with Lego (see http://sophiekershaw.wordpress.com/2013/11/09/solo13lego/ ). Sophie told us that the first 6 months of her DPhil were wasted because the paper she was feeding off misreported the (computational) experiment (starting conditions weren’t described). So she organized us into 5 groups of 10. Each group was to build a Lego microscope and each had instructions.

We were told that the instructions were slightly wrong and slightly ambiguous. No group could talk to another group and we had 35 minutes.

It was a wonderful activity. I won’t give details as Sophie will be repeating it many times – read her blog, not me. Finally 5 microscopes were created. (I think ours was the worst.) They were all different.

This is one of the best workshops ever.

It could be taken into a primary school, a management course, a programming course, and of course University. I would make every undergraduate do it (probably a 3 hour session and Sophie has many more ideas). It emphasizes formal written communication between people in different places and different times. It stresses modularity and structured work.

I know Sophie spent many evenings working on it. That shows that even brilliant ideas need hard work.

I hope that many of you will see opportunities and follow up with Sophie.

Posted in Uncategorized | 1 Comment

SpotOn #solo13 #solo13hack “What the hack?! Science Hackdays”

I should have blogged this earlier …

Today 2013-11-08:1030-1130 UTC Ross Mounce and I are running a 1-hour session “What the hack?! Science Hackdays” at SpotOn 2013. SpotOn has been running for ?6 years and used to be called variously “Science Blogging” and “Science Online” (hence the #solo13 tag). But there has been a name clash and it’s now SpotOn – so it goes. It’s at the British Library but I think it’s sold out – have a look. It’s two days, with fringe events (in pubs, where else?) and probably 250 ppl. Wonderful!

Ross and I have invited two outstanding speakers to kick off our session:

  • Martin Fenner, now with PLoS. Martin helped run the recent hack4ac hackday (1-day, first time)
  • Helen Jackson, CellCountr. Helen helped to run the recent NHSHack at Cambridge (2-day, 5th in series)

Both were fantastic events. There’s a lot of contrast between an initial 1-day hack and a 2-day (or longer) hack in a series. The organizers are always committed and work much harder than it appears. The delegates come with very varied expectations and there are very variable outcomes (almost always positive).

We’ll be online, streamed (I think). If you are online in the time slot you can contribute. The tag is #solo13hack. The day will work like this:

  • We’ll start at 1035 exact, regardless of whether people are seated. 1 hour is very short.
  • ALL verbal contributions are brief. The speakers get 300 secs (5 minutes). Delegates on the floor get 60 secs max. PMR gets ca 20 secs purely for admin and to announce Helen. After the time limit, Chuff the Okapi gets bored and, like Miss Sweetie Poo at the Ig Nobel awards, starts seriously misbehaving. Don’t let it happen to you!
  • I shall run an Etherpad which will be displayed except during Martin/Helen. Anyone can contribute. We’d like a spontaneous volunteer to act as scribe.
  • I shall feed Twitter and listen for questions. Anyone REMOTE should add [REMOTE] to any question they want asked. PMR will try to intersperse real-life discussion with remote questions.
  • Ross will introduce the session and Martin.
  • Martin will speak.
  • I shall introduce Helen.
  • Helen will speak.
  • We’ll then take contributions from the floor and remotely. Maximum of 60 secs for everyone. Every so often Chuff will ask a question from REMOTE.
  • It will be fast-moving so don’t blink. Ross may steer the real-life discussion if required. I shall do the twitter questions.

We’ll capture the tweets. If you don’t get to speak, your contributions on Twitter will still be great.

There are no set questions and *we* don’t want to drive it. You can raise anything – what subjects, where, cost, size of hack, length of hack, continuation, etc. We may ask for a show of hands from time to time.

And *we* can’t predict how it will work out – that’s part of the spirit of a hack.

Posted in Uncategorized | 2 Comments

Shuttleworth application: How the Content Mine is going to work

The second half of the Shuttleworth application asks how you are going to make it happen. Here’s my proposal. But if I am successful, I know that the Foundation and its fellows will be able to give advice and mentoring and I would expect the details to be continually improved… [Note: I have not formally asked all of the organizations; I am simply saying that I shall be contacting them. And there are many others which I haven’t had space to include and they shouldn’t feel “left out”!]

 

What do you want to explore? To deploy a framework (AMI) where scientists can create their own “plugins” to extract facts and enhance their understanding of publications. I/We have created chemical tools that “think as a chemist” and in narrow fields (e.g. understanding chemical names) are already better than all but a few experts. Ross Mounce (Panton fellow) and I are now starting on biodiversity where we have prototypes that can recognize species in papers, check them against known taxonomies, and publish collections in milliseconds. Simply by extracting all species, places and dates from the last 10 years of scholarly publications (10 million papers?) we make a massive open contribution. This can be used for scholarship, policy making, and outreach to help create citizen scientists.
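To make the idea of a fact-extracting “plugin” concrete, here is a minimal Python sketch. It is not AMI’s code: the tiny `KNOWN_SPECIES` set, the naive regular expressions and the `extract_facts` function are all hypothetical, assumed purely for illustration. It shows the shape of the task: find candidate Latin binomials in text, keep only those found in a known taxonomy, and pull out years.

```python
import re

# Hypothetical stand-in for a taxonomy lookup; a real plugin would check
# candidates against a curated taxonomic database instead.
KNOWN_SPECIES = {"Panthera leo", "Okapia johnstoni", "Homo sapiens"}

# Naive pattern for a Latin binomial: capitalised genus + lowercase epithet.
# It over-matches ordinary phrases, which is why the taxonomy check matters.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")

# Four-digit years from 1900 to 2099.
YEAR = re.compile(r"\b(19\d{2}|20\d{2})\b")


def extract_facts(text):
    """Return the known species and the years mentioned in `text`."""
    candidates = {f"{genus} {epithet}" for genus, epithet in BINOMIAL.findall(text)}
    species = sorted(candidates & KNOWN_SPECIES)  # keep only recognised names
    years = sorted(set(YEAR.findall(text)))
    return {"species": species, "years": years}


sample = (
    "We observed Okapia johnstoni in 2011 and Panthera leo in 2012. "
    "The survey ended in 2012."
)
facts = extract_facts(sample)
print(facts)
```

The regex deliberately picks up false candidates such as “We observed”; the lookup against known names is what filters them out, which mirrors the “check them against known taxonomies” step described above.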

 

As we ramp up, we’ll feed the results into organized semantic collections such as Wikipedia (which is becoming a primary reference for science), its offspring DBPedia which allows semantic querying of WP, and EuropePubMedCentral where our extracted facts (chemicals, species, etc.) will help create a far better index.

In many ways search engines control our thoughts. By building our own, better, scientific search engine we shall recapture our autonomy of thought.

 

What are you going to do to get there? I’ve just attended a 2-day boot camp in Software Carpentry (SWC) (http://software-carpentry.org/) run by Greg Wilson and been very impressed with the way it was run and the social dynamics. Much of my strategy is now based on his experience.

 

I and my group have built the software framework AMI. It works, but lacks user stressing and innovation. I have “acquired” a good 3rd year PhD student who is putting some of the final pieces in the framework, chemistry and numerical sciences. In some subjects AMI will answer questions well and usefully; in other fields we’ll see the way forward but need collaboration. Initially we shall have workshops concentrated on subject areas I know (bioscience, chemistry, publishing) and move into new ones with contacts from Greg and other scientists. I also expect to interest Open Access Publishers (PLoS, BMC, etc.) in running workshops to explore semantic publishing – this can be a source of sponsorship.

 

SWC runs “boot camps” with (say) 30 attendees who learn so much that some of them want to become instructors and run their own boot camps; this is a goal for when AMI is widely deployed. In the medium term, towards the end of the fellowship, I would expect to start running bootcamps in selected subjects.

 

BACKGROUND

 

Have you started implementation of the idea? Yes. The software framework has taken 5-20 years to develop and is now deployed as beta versions, with tools for (a) chemistry/metabolism and (b) biodiversity (phylogenetics). There is enough at beta that a new enthusiastic community could build a plugin within a month or two – with Greg I intend to explore this in (c) astronomy. Other tractable domains are bio-sequences (proteins and genes) and possibly environmental science (geolocation).

 

How have you funded your initiative in the past?
The code has been funded in part by scientific research grants (RCUK, JISC, Microsoft, Unilever, CSIRO/AU) and by volunteer contributions (mainly in the margins of existing research). Two modest grants from EPSRC (“Pathways to impact”) were critical in getting several tools released. But I have also harnessed the power of volunteer communities where energy and mutual respect are strong currencies. The key task now is to expand our communities – so that subject areas will each take on their own tasks.

 

I proposed Panton Fellowships (PF) 3 years ago (OKFN) and raised funds from (a) OSI(OSF) and now (b) CCIA. The Fellows are spreading the idea of bottom-up scientific knowledge and the initiative should become self-sustaining (not dependent on me) in a year or two. I helped “write a grant” (technically I’m not a Co-I) for BBSRC to support Ross Mounce (PF) for the AMI phylogenetic work at Bath and this will have a major impact in technology and community. Similarly I’m working with Kitware (a software company in Albany NY) for them to get grants in semantic chemistry.

 

Who are your current or potential key partners? I’m initially looking for organizations that would run joint workshops or fund them. I have excellent current contacts with: (a) community: OKFN (School-Of-Data), SWCarpentry, Tabula, Crowdcrafting, MozillaScience (b) libraries and repositories: BL, EuropePMC. (c) publishers: PLoS, BMC, Ubiquity (Sam Moore is a Panton Fellow). (d) companies: Kitware (NY), Figshare. I have good links with Wikip(m)edia which could become very important long-term and am exploring Natural History Museums. I’ll be doing the first demo at the Oxford eScience Centre at the end of next month (2013-11). But I expect other potential partners to emerge from the workshops we’ll be running.

 

Not-for-profit. After talking with Greg Wilson, Francois Grey and Daniel Lombraña González I see no immediate need for a Foundation. Greg’s bootcamps are usually funded by a host (e.g. group of universities) who provide resources and modest sponsorship. Like Greg I don’t have a sense of personal ownership, but I absolutely want to protect against a digital landgrab by a big corporate. After 6 months we’ll have a clear idea whether The ContentMine generates its own identity or is naturally part of something else.

 

Where will you be based?
, UK. I currently travel abroad about once a month and love reacting to invitations. London is a great digital centre and I go there at least once a week so some events may be London-based.

 

Do you have an online presence?
http://www-pmr.ch.cam.ac.uk/wiki/Main_Page summarizes my science, but is not updated. My main structured presence is my blog /pmr which is active and widely followed. I don’t have a classical home page (spam). http://www.blueobelisk.org is one self-sustaining community I created. On Twitter @petermurrayrust has ca. 2000 followers. I’m also visible at http://pantonprinciples.org/ and http://en.wikipedia.org/wiki/Peter_Murray-Rust and http://stackoverflow.com/users/130964/peter-murray-rust.

 

Does the idea/project have an online presence? https://bitbucket.org/petermr/ hosts about 10 repositories directly involved in the project, the most accessible is https://bitbucket.org/petermr/svg2xml-dev. I’ve deliberately not created a project page until we can show a working (alpha/beta) system because I hate vaporware and you only get one chance to release. (I have bought contentmine.org). I’ll be working on tutorials for 2013-11-27 in Oxford. Ross Mounce has also blogged the AMI project and content mining – http://rossmounce.co.uk/2013/10/06/setting-up-ami2-on-windows/

 

The political aspect is frequently covered in my blog, by Ross Mounce, and by OKFN. A particular example is http://blog.okfn.org/2013/02/28/content-mining/ where a wide group of organizations withdrew from the (heavily lobbied) EC attempt to require licences for mining. I also blog on OKFN: http://blog.okfn.org/2012/06/01/the-right-to-read-is-the-right-to-mine/.

Posted in Uncategorized | Leave a comment

My Shuttleworth application and manifesto

The Shuttleworth process asks applicants to answer four questions. “Please think about how your idea relates to technology, knowledge and learning and how your idea relates to openness when answering each section.”

 

  • Describe the world as it is. (A description of the status quo and context in which you will be working)
  • What change do you want to make? (A description of what you want to change about the status quo, in the world, your personal vision for this area)
  • What do you want to explore? (A description of the innovations or questions you would like to explore during the fellowship year)
  • What are you going to do to get there? (A description of what you actually plan to do during the year)


Here are my first two answers… The first is close to a manifesto and the second addresses the social and political aspects. The key feature is that web-based communities and democracy are now sufficiently common that we can hope to use that culture.

 

Describe the world as it is. We’re in a battle for our digital future. The Web has released huge creativity and spread digital democracy but at the same time large vested interests – governments and companies – want control, compliance and conformity. Change comes by people developing new tools, the creation of communities, the spread of knowledge and the emerging vision of a better world. Companies and governments also see the potential of the Web; often using it well but also often repressively. Many, especially the “content” industries, feel threatened and use legal and technical means to restrict access and innovation.

 

300 Billion USD globally funds Science, Technology and Medicine (STM). Much of this directly benefits the planet – health, climate, development – and also supports informed decision-making. But 80+% of the published output is paywalled and only rich Universities can read it. Scholarly publication is largely controlled by companies who try to retain the status quo. With some notable exceptions, they forbid free re-use of the information, try to re-license it, and effectively prevent the innovation seen in other fields (journalism, commerce, and government). Billions of research dollars are wasted as science rots in PDFs and behind paywalls.

 

The system is broken, technically and morally. Young people hate it, but only a very few have yet found ways to change it.

 

What change do you want to make? To build a community which frees data and builds tools for better, Open, STM communication. This will empower readers, both in and outside traditional academia, by creating an enhanced semantic environment. I have the Quixotic expectation that this will change the culture of scientific information and create a self-replicating movement. The technology will then spread because it is immediately useful. (An example is Figshare for collecting data, invented almost by accident and after 2 years now a major player).

 

Disrupting the central control exercised by large commercial publishers will create an expectation that readers deserve better tools and data, and can create them themselves. Using machines to liberate facts from copyright publications is, in fact, legal, but it’s very rare because people are frightened by lawyers. There’s also much publisher FUD (including “PMR (sic) will publish all our content”). The UK government will support content mining in 2014-04 by implementing the Hargreaves recommendations (which I have strongly argued for and supported). This is thus an ideal time to start deployment on a large scale. It will both demonstrate the value and also create an unstoppable wave of liberation (as has happened in many creative industries).

 

The traditional science publishers, like traditional music publishers, have held back innovation. I am set on changing that.

Posted in Uncategorized | Leave a comment

My Shuttleworth application

The next few posts (possibly interspersed with other issues) will contain full details of my application for a Shuttleworth fellowship. Here I present my argument/manifesto [see below] for why we need action and why I can address it. And first some history.

 

For background, Mark Shuttleworth has changed the world through Ubuntu and is funding other ventures to change the world through fellowships. I applied for one about two years ago – prompted by a colleague. I hadn’t thought about it, though of course I knew that Rufus Pollock, founder of OKFN had been awarded one. And so I thought “why not?”.

 

Well there was a lot of “Why not!” I was in the middle of Washington State (not DC!) at Pacific NW laboratory, and found I had to make a video. I was 9 hours out of sync with anyone who could help on the ground. I couldn’t find local help. So I tentatively asked Jenny Molloy – in Oxford UK. Those who know Jenny will know what a tremendously unselfish and helpful person she is. She makes things happen with apparent ease and is always smiling.

 

So we made a video out of bits that already existed. Some recent video of Jenny and me at Open Science Summit (“closed access means people die”). Some stills of me and collaborators at Beyond the PDF 1. Some (compelling) footage of Open Access in Africa, relating to Leslie Chan. It was hairy. Jenny would get an hour or two in Oxford Comp Service and stitch bits, add captions. I can’t remember where the audio came from – maybe not at all. We had ca 5 days where we could send one message each way each day due to timezones! Not the best way to create a video!

 

It got submitted just about on time. I then went back to UK. Karien (from Shuttleworth) mailed and said they’d like to have a clearer idea of what I was going to do and could I submit another video. Gulp! I couldn’t possibly impinge on Jenny again. Somehow by magic I discovered (a) a video camera and (b) that Windows had something called MovieMaker. Now I am not a great Windows fan. And MovieMaker is relatively basic (read: if you edit it the wrong way the audio and video lose sync. Read: if you move your files, the whole thing self-destructs). And the output (WMV) is basic, bloated and otherwise horrible and doesn’t play on Unix by default. So I have to make another movie, edit it and create an MP4. I made it – not proud of it or the project idea – and started to create MP4s. At this stage I discover that this can be complex. I have to understand codecs and deltas and … machine crashes. I’ve forgotten the name of the converter – it is/was the standard but it has a zillion options (like any good tool) and no defaults (at least not that worked). I think I finally got a volunteer (?Cottage labs?) to convert it. Remember that every edit needs re-exporting as WMV, converting to MP4 and then uploading to Dropbox or wherever (at that stage I am not sure that we published the applications more openly).

 

Then I go to Australia. I have another interview in the middle of the night. The WiFi is on a rented dongle which is very flaky. But it goes fine. I didn’t get an award and I’m not downcast because (in short retrospect) the project probably wouldn’t have taken off.

 

But the process had great value. It helped to shape my ideas, convince me that I have stuff to offer the world, and probably most importantly shaped the way we run part of the Panton Fellows. When we came to interview them I suggested we should ask for videos and the Panton Board agreed. It makes all the difference to forming an impression of someone and their project. And even for those who don’t get selected it’s very valuable experience and knowledge.

 

We’re moving to a world where we live in public. Shuttleworth have. I’m delighted with that, because it’s the way I work as well.

Posted in Uncategorized | Leave a comment

Scholarly publishing: The week my life changed; and yours will too

In the last 2 weeks everything has come together and convinced me that the old era of scholarly publishing is on the verge of collapse. The new ideas of web-based community, of freedom and democracy, and of semantics are irresistible. They’ve been bottled up by gateways, lawyers and an older generation who conforms and hangs onto the past rather than challenges.

That is finished. And I am playing a role in that – history will tell whether it mattered.

There is so much going on that I’ll simply list the last two weeks. Each item requires one or more blog posts. I haven’t had the time (and WordPress/server let me down). In order of date…

EuropePMC. We’re gearing up to showcase contributions from you. I’m certainly expecting that what I describe below will become available through EuPMC.

Panton Fellows. I visited one of our new Fellows, Rosie Graves, in Leicester. She’s going to show how citizens can take part in science (air quality). [Contact her]. I’m going to meet the others tonight. The Panton Fellows (Sophie Kershaw/Kay, Ross Mounce) are already changing the world.

Software Carpentry. This is massive. Greg Wilson (see video later) has run a revolutionary approach to creating software. People kill to get on the bootcamps – I was able to attend a wonderful one at Greenwich. Greg’s ideas of how to run things are applicable to many other new ventures including my own.

MozFest. Mozilla isn’t just a browser, it’s an organisation that is changing the world through building the web. The current web is stale, dominated by companies. Mozilla is creating a fresh approach which emphasizes democracy, meritocracy, citizens [and furry animals].

Shuttleworth Fellowship application. I then spent the week writing an application for a Shuttleworth Fellowship. It’s for “Anyone who has an innovative idea for social change through fresh thinking that adds value in the areas of knowledge, learning and technology. Anyone who has a clear vision of how the world can be a better place and the contribution they can make to bringing about the change.” I qualify. You have to make a VIDEO, here’s mine (the Content Mine https://vimeo.com/78353557 ). (BTW Shuttleworth WANT people to share their processes openly, and I’ll publish more).

The message is simple. There are hundreds of millions of critically important FACTS locked up in the STM literature. I now have the technology to liberate them from behind APIs, PDFs and other archaic forms of communication. I hope to liberate 100 million facts a year. That’s not hyperbole – our group have already liberated millions of facts and the scale-up is done through people and machines working together. Much more later.

And last night I heard I had been selected for a (skype) interview! I expect to blog this daily if I get any time…

NHSHack. Two days in Cambridge. Simply: the best hack I have been to. 80-100 committed people who want to make the NHS a better place for patients, GPs, hospitals. Within 2 days 14 groups (I lost count) had come up with wonderful applications. Not ideas, but ideas that work. Examples:

  • PracticeMinder. Applying Hans Rosling’s wonderful GapMinder to evaluate doctors’ Practices (UK for surgery). [BTW Hans is on BBC tonight and do-not-miss him].
  • LGBTQ analysis of Practices and Hospitals. Which are sensitive to these issues? Such a simple idea and so valuable.
  • Where’s that clinician/registrar/xxx? A way of instant communication in hospitals through mobile devices.

It was so good I have kidnapped Helen Jackson to present to us at…

SpotOn. A massive yearly gathering of anyone interested in science communication – blogging, journals, hacks, etc. Ross Mounce and I are running a 1-hour session on how to run a hack with Martin Fenner and Helen presenting. Follow us on #solo13hack. We’ll take questions from anywhere on the planet.

Open Access Button. Two medical students who are angry about paywalls and are doing something about it. They are running a Thunderclap which will hit the whole twittersphere with their message. I’ve donated my effort to help this. You can too. They need hackers for a week or two.

AMI/TheContentMine/XHTML2STM. Literally 1 hour ago I reached a state where I feel I can justify the claim:

“Machines can read the whole scientific literature and understand significant parts of it”.

The technology is there – the crawlers, the PDF parser, the graphic object synthesis, the creation of words, phrases in all areas of the “paper” including images.

We’re building the community. In Cambridge we have built intelligent readers for chemistry in the literature. In Bath we are building the same for biodiversity. Greg will put me in touch with astronomers…

Ross and I are off to discuss all this with BMC this afternoon.

The only thing standing in our way is the legacy scholarly publishers. What I/we are going to do is not illegal. Facts are not copyrightable. And very soon in UK we will be able to tear up the contractual clauses that forbid content mining.

Tempora mutantur, nos et mutamur in illis. Two thousand years ago and it’s still true: The times they are a changing.

You should change with them.

 

 

Posted in Uncategorized | Leave a comment

My current excitements and involvements

I wrote an hour-long post and WordPress has destroyed the whole lot.

Posted in Uncategorized | Leave a comment

Problems in Open Access: we need regulation in the broken market

In the last post I highlighted the success of the Open Access initiative and culture over the 10 years since BOAI. This post highlights a fundamental problem of scholarly publishing and the “market”. Please criticize me – lack of criticism is one of my concerns. Unfortunately there is very little constructive discussion of OA – it’s limited to isolated blog posts like mine, and the factional GOAL list. However unless the OA movement(s) recognizes and addresses serious shortcomings OA will remain marginal in many areas and of no benefit to the world outside academia and its markets.
I’ll use the term legacy publishers to represent the large established closed publishers (Nature, Elsevier, Springer, Wiley, T+F, etc.). OA publishers are those with effectively total OA content from inception (BMC, PLoS, eLife…). Scholarly societies are caught in the middle. Many are struggling and some do deals with legacy. I’ll omit them – sadly – and fear that many will be crushed in the next few years or end up being appendages. Societies must re-examine their purpose and cannot assume they have an income stream from publishing while retaining scholarly freedom.
As I write this post it becomes clear that we must look at the overall picture of the scholarly publishing market. Open access currently is about 5-10% (figures are extremely difficult – counts by journal make it higher, counts by article or revenue lower). The global total market is about 10-15 Billion USD. That’s about the size of the rail network in UK. It’s a lot of money and most of it comes from research grants and student fees. OTOH apparently Harvard’s library bill is less than the maintenance of its campus so it’s not at the top of Vice-Chancellors’ concerns.
So who are the players in the market?
The legacy publishers. (I differentiate between OA publishers (BMC, PLoS, eLife…) and legacy (Nature, Elsevier, Springer, Wiley, T+F, etc.). Scholarly societies are caught in the middle, many consumed by or doing deals with legacy and I’ll omit them.) Large legacy publishers are now modern corporations, executives can earn over a million dollars, and many middle managers have no grounding in academia or science or lost it many years ago. Think of legacy publishers as you think of your bank, your energy supplier, the train operators – the corporate culture. Do you trust Elsevier more than you trust Barclays Bank? Do you trust the American Chemical Society more than you trust Amazon? For me it’s fairly even.
Legacy publishers have huge amounts of money to spend on lobbying – much of this is done in secret. Their argument to governments is “look what a lot of revenue we generate for you”. This is powerful – we still sell tobacco, though the health costs exceed the revenue and few politicians can solve this problem except over decades. I suspect scholarly publishing has features in common.
The authors. Authors have little say in the market. Scholarly publishing has increasingly become a means of chasing glory defined by large brands and the academic system forces them to chase this – to conform. I have the luxury of not having to conform but I don’t expect aspiring young scientists to break ranks – I applaud them when they do and I promise to man their barricades. But by and large they conform. And they are the producers of the goods, which they give freely and which are misappropriated downstream. In principle they have the power to change the system by mass action, and this has been tried (Tim Gowers’s boycott of Elsevier briefly caused a dip in share price). I think this was the original idea of Stevan Harnad’s Subversive Proposal – authors should self-archive their manuscripts voluntarily and expose them on the web. If this had been done 10 years ago it would have succeeded and the authors/academia would control the market. The #scholarlypoor would be able to read the output of scholarship in the rich North. But it hasn’t happened and it isn’t going to happen and no amount of exhortation to Green-archive will now have any effect. Increasingly, therefore, authors do what they are told – by tenure committees, funders and by legacy publishers.
The universities. There are probably about 2000 research-publishing universities globally – it’s a long tail. They control the purchasing – the 15 Billion USD. In principle purchasers have enormous power in a market and in principle Universities could change the market dramatically. If 2000 Universities said “from next year we will [publish all our output as Green] [refuse to accept embargoes of > 6 months] [set a maximum APC of 1000 USD] etc” the publishers would have a hard struggle not to accept. It might take a court case – and it’s symptomatic of OA that they have never tested the legal boundaries of what their actual rights are. (Is a copyright transfer actually legal? I suspect not in many domains, and republishing papers openly might infringe authors’ rights but not publishers’.) But Universities have been universally supine. There have been a few mandates, generally of the form “you must [self-archive] – unless you don’t want to/publisher won’t let you” and these have been worse than useless – they have demonstrated that universities have no teeth or are afraid to use them. The major reason is that universities, who are both consumers and producers, are required to compete against each other and are a Holy Roman Empire of feudalism. Any other market would have seen rationalisation, but Universities (for good reasons) have been chartered to be individually independent and self-sufficient.
The readers. They have no purchasing power and are almost treated with contempt by legacy publishers and universities. Most #scholarlypoor don’t read the fruits of scholarship – teenagers like Jack Andraka are forced to beg their parents for funds to read medical papers.
Funders. Funders recognise the value of exposing their funded work and are requiring authors to make it open access – in some way. They have real power and are trying to use it. But they are constrained by having to change a 15 Billion market with very active opposition from legacy publishers. If attempts are too bold the government-based funders get shouted down on Capitol Hill or in the House of Lords.
Governments. Funders also face non-compliance from researchers – I think Wellcome’s mandate (which is absolute and sufficiently funded) is only 55-60% obeyed (and we must help to change this through monitoring). Funders are also very coherently aligned – the practice may differ (mainly due to national fighting from legacy publishers) but the overall goals are united.
OA advocates. A lot of effort has been put into OA by many organisations and individuals. And I applaud them for their energy and dedication. But they are not united, with little clear strategy and poor long-term goals. (Will #oaweek change the world?)  This has to change if they are to be taken seriously and I’ll address this in the next post.
My primary observation on the scholarly publishing market is that it’s unregulated. Banks, trains, energy all have some form of regulation in the UK. There are formal constraints on what train operators can charge (though in the UK they aren’t very effective). But in scholarly publishing publishers can do anything. They can charge what they like for Hybrid Gold APCs (an appalling system) and may do – 5000 USD. They can licence papers as they like. Nature have a differential charge for CC-NC vs CC-BY. They can appropriate content for their own reuse (Springer collects all the diagrams in its papers and resells them at 50 USD each). Several OA publications have ended up behind paywalls.
It’s appalling that no-one except a few activists challenges the current unregulated market practices. Universities are happy to hand over 15 Billion without anyone checking what they are getting. This has to change.


#oaweek: The successes of #openaccess

My previous post outlined some of the differences between #openaccess and other Open initiatives and was, by implication, somewhat critical. In this post I’ll list some of the things that are successes or going well for #openaccess. In the next I’ll contrast this with things that are serious problems or failings.  I welcome criticism and may amend my position – one tragedy of OA is that useful debate is stifled by factionalism (I’ll discuss this later).
So here’s my list of successes. (By implication important things that are not on the list (e.g. repositories) have serious problems).
1. Recognition: “OpenAccess” is now widely recognized as an issue within important parts of the community. It’s part of the political agenda and cannot be overlooked (it may be deliberately ignored). Open access has roughly the following actors, and I’ll expand below:
* publishers. All publishers are intensely aware of it.
* funders of research. Again almost all funders – both government and charity – are highly aware of OA.
* government. OA is frequently on parliamentary and legislative agendas
* university managements. All are highly aware of the issue. Many, but by no means all, academics are aware of OA.
2. OA publishers. The brilliance of Vitek Tracz’s BioMedCentral showed that OA could prosper in the marketplace. Not enough people recognize this and all OA advocates, whether favouring “green” or “gold” (terms I deprecate and will discuss later) should give unfettered praise. BMC started with an apparently mad idea – ask authors/universities to pay for publication rather than publishing for free in conventional journals. This paradoxical strategy is very hard to sell and it required Vitek’s brilliance (and personal capital). BMC got all the important things right and many have followed.
* quality. Any new journal struggles against established brands and there could have been a tendency to shade quality. However BMC journals stressed quality and (I am proud to be on the Ed Board of one) have standards at least as good as their legacy equivalents.
* price. BMC prices are largely affordable. Yes, it’s real money and from a personal pocket it’s a lot, but many chemicals and reagents can cost as much as the APCs.
* brand. BMC has a coherent brand. (And #animalgarden have embraced @GulliverTurtle).
* outreach. BMC has actively promoted aspects of #openaccess: running meetings, organizing competitions, supporting projects, etc., so that the human and technical infrastructure of #openaccess has been enhanced.
* innovation. BMC was relatively conventional apart from the market model. Later OA publishers have innovated significantly, especially PLoS. PLoS introduced the mega-journal PLoSONE which deliberately accepts solid useful but not necessarily dramatic science. It’s probably the largest impact in publishing innovation so far. Journals such as BMC’s Gigascience are also succeeding in innovation (data journals).
* regulatory processes. Recently the OA publishers have set up OASPA, the OA publishers’ association, which monitors quality of parts of OA practice. It’s an effective protection against “predatory journals” which have low quality, and very dubious practices. I would hope and expect that OASPA will offer some form of certification.
3. Funders. Huge credit goes to the Wellcome Trust – Mark Walport, Robert Terry and Robert Kiley. Because Wellcome is independent of government it can make its own policy and has done so. Wellcome proved that funders could have a coherent, workable policy for requiring that their funded work was published openly, and they have constantly pressed for BOAI-compliance. Wellcome effectively set the rules for other funders to emulate, so that RCUK, Europe and many others have seen that the process can work.
4. Governments and other policy makers. Open Access is now an important political issue. It’s argued to have considerable benefits – that funded work which is universally visible brings economic and moral/political rewards. Governments making all funded work public are providing important resources to the world. In the UK, for example, there have been commissions (Finch) and debates in the Houses and similar issues are debated in many other countries. The EU, under the inspired leadership of Neelie Kroes, has insisted on Open Research in Europe.
5. Public infrastructure. There’s a modest, but not sufficient, amount of infrastructure to support Open Access. Funders include JISC in the UK, SURF in NL, and there are useful initiatives such as DOAJ (Directory of Open Access Journals). I applaud these but there’s nowhere near enough and University investment in repositories has been fragmented, wasteful and almost completely ineffective.
In the next post I will outline some of the failings of OA, and then in the final post list issues that need to be addressed.


#openaccess 10 years on; can we say "This is for everyone"?

[This is probably the first of several posts]
This is OpenAccess Week #oaweek #oaw13 and I generally try to post something to give a perspective. It’s 10 years since the Budapest Open Access Initiative (BOAI) which I thought was a wonderful initiative and fully supported (in so far as my support matters). It reads

By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself.

This is a marvellous political vision and a useful implementation guide. It is heavily influenced by the freedoms in software (http://en.wikipedia.org/wiki/The_Free_Software_Definition ) written over 25 years ago:

The word “free” in our name does not refer to price; it refers to freedom. First, the freedom to copy a program and redistribute it to your neighbors, so that they can use it as well as you. Second, the freedom to change a program, so that you can control it instead of it controlling you; for this, the source code must be made available to you.

It has influenced the OKFN’s Open Definition (http://opendefinition.org/) 2004

“A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” –

In 2003 I assumed that the signatories of the BOAI would act in united fashion to make the spirit and the letter of BOAI reality. I thought it would change the world. I pledged myself to help insofar as I could.
It hasn’t changed the world and I am now not sure what the role of individuals such as me is.
That’s a strong statement. Some will argue that we must be patient and we are making good progress. I don’t take that view. I also ask “progress towards what?”
The world is capable of moving at a speed unimaginable 25 years ago. I am inspired by collective efforts such as the WorldWideWeb, Wikipedia and the Human Genome (see http://en.wikipedia.org/wiki/Bermuda_Principles).

These “Bermuda Principles” (also known as the “Bermuda Accord”) contravened the typical practice in the sciences of making experimental data available only after publication. These principles represent a significant achievement of private ordering in shaping the practices of an entire industry and have established rapid pre-publication data release as the norm in genomics and other fields.
The three principles retained originally were:

  • Automatic release of sequence assemblies larger than 1 kb (preferably within 24 hours).
  • Immediate publication of finished annotated sequences.
  • Aim to make the entire sequence freely available in the public domain for both research and development in order to maximise benefits to society.

The genome projects could have worked in a closed environment, keeping data to themselves, patenting, gaining personal credit. They didn’t; just as TimBL didn’t patent the WWW. They made everything freely and openly available immediately. “This is for everyone.”
http://en.wikipedia.org/wiki/Tim_Berners-Lee
Technically we could have made #openaccess available to everyone for a fraction of what publication costs us now (15 Billion USD). We could have transmitted STM scholarship (I’ll concentrate on Science Technology Medicine) to everyone connected to the Internet.
Instead, 10 years on we have schoolboy genius Jack Andraka asking his parents to pay access charges (30 USD per paper) for medical papers. #openaccess is effectively a closed community where little reaches out beyond academia and – for the large part – academia doesn’t care. The single thing that is working is #openaccess in fully BOAI-compliant journals such as BMC, PLoS, eLife, PeerJ, Ubiquity, etc. and the support of an increasing number of governments and funding bodies. The BOAI-compliant OA is about 2-20% of the publication output depending on field and who you listen to. (Hybrid OA is a waste of money and so are university repositories – they have failed to make any impact on the wider world. I have yet to find a mainstream scientist or someone outside academia who regularly uses Green OA.)
Is 20% useful? It depends on what you want to do. For some of my work (bioscience) it’s quite useful. For other aspects (e.g. chemistry) it’s useless.
And if the currently slow progress continues where will we end up? There’s very little clear vision and a great deal of fighting and confusion.
In the next posts I will try to compare OA with other Open Initiatives and suggest what could be done.
 
 
