e-Infrastructures for Open Science – my talk in Rome

I have been invited to Rome to help start the Horizon 2020 Consultation for future European funding:

http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/agenda_alea_rome.pdf

It’s an important meeting and follows a morning presentation by Neelie Kroes (European Digital Agenda) and responses by National scientific societies.

http://www.allea.org/Content/ALLEA/General%20Assemblies/General%20Assembly%202012/GA_draft_programme_overview_final.pdf

This blog helps me to coordinate my ideas and also acts as a record of them. My remit is to introduce this theme: “Open e-Infrastructures for Open Science” which then devolves into 3 parallel sessions:

  • Open global data infrastructure
  • Open scientific content
  • Open research culture

I’m interested in all of these and shall try to address all of them – of course there is a lot of overlap. I apologize for any UK-centricity but I hope the issues are general. I bring the following experience:

  • A practising scientist in “long-tail science”, mainly on the informatics side. Heavily involved in the UK eScience programme.
  • Spent time in industry and academia.
  • Active in the Open Knowledge movement (especially science data and open source code/informatics).

I’ll divide this into these areas and probably be slightly controversial:

What have we learned in the last 10 years of eScience?

The UK eScience programme broke much new ground. Its greatest success was bringing groups of scientists and computationalists together; that continues (e.g. in the Oxford eResearch Centre) and has made the programme eminently worthwhile. But I’ll also comment on things that didn’t work:

  • Top-down design. Technology progresses so rapidly in the Internet world that trying to design the future doesn’t work.
  • Academic-industry infrastructure. The problems of shared vision and a secure collaborative infrastructure are too difficult and expensive for either partner. Instead we should concentrate on areas where industry can share the results of academic work through an Open Infrastructure.
  • Universities are generally not the best place for managing collaborative research infrastructure on an ongoing basis. Institutional repositories do not effectively serve science. In contrast inter/national research organisations have the infrastructure and the mission to make this happen.
  • There was and is very little investment in infrastructure for “long-tail science” – I exclude bioscience supported by EBI/NCBI, etc. There are no useful repositories for many of the disciplines, few ontologies, and little interest in the dissemination of science.
  • Academia and many scientists are conservative and increasingly driven by self-interest. Open practice will not happen rapidly.

What are the current problems?

The eScience programme has had little impact on the current practice of science. Informatics is carried out using whatever commodity tools are available and the culture is dominated by commercial scientific publication. This has not changed in 10 years and is now seriously holding back innovation in several ways.

  • The result of research is a “PDF”, not scientific information.
  • The rewards are almost solely based on “citations” – a flawed measure of value.
  • Almost everyone outside academia (and many within) is denied effective access to scientific output: they are the “scholarly poor”.
  • Young researchers are stifled by the system and institutionalised.

There is little incentive to change the system or to build a better infrastructure.

And alongside this we have the battle between commercial closed “walled gardens” and Open knowledge (CC-BY, CC0 – anything else is almost valueless). Academia is NOT committed to Openness – it points inwards and builds systems for itself, not the world. And there is a dysfunctional academic-publisher complex which reinforces stagnation.

What are we losing?

We can consider this both in world terms and European terms. There is now huge potential in new information industries downstream of scientific publication: a “Google for Science”. I have estimated to the UK Hargreaves enquiry that in chemistry alone this could be worth “low billions” worldwide. Are we going to let Silicon Valley capture yet another new market?

  • We lose the value of the research we fund.
  • We lose the opportunity of creating new information industries.
  • We make seriously bad decisions.
  • Our science is worse – often unchallenged or duplicated.
  • Our culture does not reward change.

What are the growing points?

We must not ignore the rest of the world. Our greatest human capital is OUTSIDE academia. Examples of worldwide growth are:

  • Wikipedia etc. (probably the greatest communal effort to build quality public science in many disciplines)
  • OpenStreetMap. An unfunded project that shows what one person can make happen; within a few years it has become a world resource and standard.
  • Open Source software.
  • Open Knowledge
  • Open science (Open Source Drug Discovery). Open Science moves faster than conventional science because it grows communities rapidly, shares knowledge and avoids mistakes.
  • Internet-aware interest and practice groups (e.g. Malaria World)
  • Young people. One graduate year can create a high-quality growing point: Figshare, Altmetrics, and PMR group (OSCAR/OPSIN chemical NLP, CrystalEye – all now being taken up). Give undergraduates and graduates encouragement to explore and innovate.
  • Open publishers (PLoS, Wellcome, BMC (Springer))

What should we do?

We have to change the culture. I don’t know how to do that in detail, but here are some things I’d like to see happen.

  • A scientist-oriented system for scientific research. “ScienceForge”. It’s been solved for computer programming (“SourceForge”) without any central investment. It can’t be top-down; it has to grow organically. It has to support scientists in their daily work so naturally that they don’t notice it. A scientist should then be able to share their work anywhere. It should support embargoed publication, on scientists’ terms, not publishers’.
  • 3rd year graduate students designing the informatics structure and training
  • Use multiple metrics for science output not just “citations”
  • Actively involve the “scholarly poor” outside academia. Reward successful extra-academic enterprise.
  • Put national laboratories at the centre of the infrastructure for long-tail scientific information.
  • Develop sustainable profitable business models based on Open practice.

I’ll think of some more on the plane. And I shall, as always, react to what is said before me. I am very impressed with Neelie Kroes and hope I get a chance to meet her.
