With ContentMine you can now mine 100 papers/minute

I have been silent on this blog for many months, not because I had nothing to say, but because ContentMine is saying it in software. In short, ContentMine is a new approach to extracting knowledge from the literature, using technology that anyone can use. And that means anyone – not just academics, but citizens: school students, doctors, local government, conservation, patient groups, social enterprises… Anyone.
 
We presented this at two workshops last month.
 
Firstly, a meeting of plant scientists at The Genome Analysis Centre (TGAC) in Norwich, which also included the John Innes Centre (JIC) and Sainsbury laboratories. These organisations are committed to the use of science to improve plants and agriculture, and knowledge is an increasingly critical part of this research.
 
A week later, at the Cochrane annual meeting. “The [Cochrane] group was formed to organize medical research information in a systematic way to facilitate the choices that health professionals, patients, policy makers and others face in health interventions according to the principles of evidence-based medicine.” (Wikipedia). The primary basis of Cochrane is reviewing already published medical and related work and giving a systematic and objective analysis.
 
Both of these groups were very receptive to the idea of Open mining of the current literature. It doesn’t remove the need for humans – rather it allows them to work on the precise areas where humans are essential and most productive.
 
With mining techniques we can make the (Open) peer-reviewed literature available to anyone. You – and we mean you – can download 100 papers in a minute and analyse them for scientific concepts. It is astonishing what mining reveals literally within minutes. It goes beyond traditional search engines such as Google because the software picks out the common threads in the papers. It is a compelling demo of the value of Open, of mining, and also of the accessibility of taxpayer-funded research to the taxpayer.
 
The success of the approach depends on three main resources:

  • The Open scientific literature, especially through Europe PubMed Central, which has over a million Open papers.
  • Wikimedia, which includes Wikipedia and Wikidata. This is now becoming my first stop for trustable science and I’ll convince you it should be yours.
  • And http://contentmine.org where we have developed Open, usable, simple, powerful technology for linking all this together.
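
To give a feel for the first of these resources, here is a minimal sketch of asking Europe PMC for one page of 100 Open Access papers on a topic. It is not the ContentMine software itself. It assumes the public Europe PMC REST search service and its `OPEN_ACCESS:y` query filter, and the function names (`build_search_url`, `extract_papers`, `search_open_papers`) are hypothetical, chosen only for this illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Europe PMC REST search endpoint.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(term, page_size=100):
    """Build a search URL restricted to Open Access papers (hypothetical helper)."""
    params = {
        "query": f"{term} AND OPEN_ACCESS:y",
        "format": "json",
        "pageSize": page_size,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def extract_papers(payload):
    """Pull (pmcid, title) pairs out of a decoded JSON search response."""
    results = payload.get("resultList", {}).get("result", [])
    return [(r.get("pmcid"), r.get("title")) for r in results]


def search_open_papers(term, page_size=100):
    """Fetch one page of results over the network and summarise it."""
    with urlopen(build_search_url(term, page_size)) as response:
        return extract_papers(json.load(response))


if __name__ == "__main__":
    # Needs network access; prints one line per paper found.
    for pmcid, title in search_open_papers("zika"):
        print(pmcid, title)
```

This only lists papers. Downloading the full text and analysing it for scientific concepts is the part the ContentMine tools are for.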

 
There’s a 5-minute video (https://www.youtube.com/watch?v=5lYzOZ2Cv_I) based on exploring knowledge about Zika that shows how this works.
 
The workshops have shown that you can now install the software yourself if you wish to. You need to be generally competent in installing a range of programs and using the command line. I’ll cover the details in later posts.


2 Responses to With ContentMine you can now mine 100 papers/minute

  1. Oliver Bandel says:

    Cool project/software.
    Nice video.
    But the last time I looked at ContentMine-stuff, the installation procedure would need a lot of effort. Automatic installation via package-installers (for Linux-distributions) would be nice (and IMHO necessary).
