How contentmine will extract millions of species

We are now describing our workflow for extracting facts from the scientific literature on . Yesterday Ross Mounce and I hacked through what was necessary to extract species from PLoSone. Here’s the workflow we came up with:

Ross has described it in detail at and you should read that for the full account. The key points are:

  • This is an open project. You can join in; be aware it’s alpha in places. There’s a discussion list at!forum/contentmine-community . Its style and content will be determined by what you post!
  • We are soft-launching it. You’ll wake up one day and find that it’s got a critical mass of people and content (e.g. species). No fanfare and no vapourware.
  • It’s fluid. The diagram above is our best guess today. It will change. I mentioned in the previous post that we are working with WikiData for part of “where it’s going to be put”. If you have ideas please let us know.

