Content Mining Myths 1: "It's too hard for me to do"; no it's easy

One of the many myths about content mining is that it's difficult and only experts can do it.

Quite the opposite - with the right tools anyone can do it. And in fact most of you do content-mining every day...

  • When you type a phrase into a search engine (Google, Bing) you are using the mined content of the web. You phrase your question to try to get the most precise, most relevant answers. Agreed, it's not easy to WRITE a search engine, but it is easy to use one. If we know what questions you want to ask the scientific literature then we can work out how to build the engine.
  • When you use software to examine photographs it can pick out faces. Again it's not easy to write such software but it's easy to use it. And that's what we are doing for chemistry - recognising compounds and reactions in pictures. We'll present this at the upcoming American Chemical Society meeting in Dallas next month, so if you are there you'll get an idea. The work is only 3 months old but we've come a long way.
  • When you search your mail for a name you are mining the content. Again it's easy to do.
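
To give a feel for how little is needed for the simplest kind of mining, here is a minimal sketch in Python of the "search your mail for a name" case. It is an illustration only, not our tools: the function name find_mentions and the sample messages are made up for this example, and the sentence splitting is deliberately naive.

```python
import re

def find_mentions(documents, term):
    """Return (document index, sentence) pairs for sentences that mention term.

    Hypothetical helper for illustration; matching is case-insensitive
    and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for i, text in enumerate(documents):
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if pattern.search(sentence):
                hits.append((i, sentence.strip()))
    return hits

# Made-up sample "mailbox" for the example.
messages = [
    "Meeting with Alice on Tuesday. Bring the spectra.",
    "The aspirin synthesis worked. Yield was 82%.",
    "Alice says the aspirin sample is pure.",
]

for index, sentence in find_mentions(messages, "aspirin"):
    print(index, sentence)
```

The same few lines work whether the term is a person's name or a compound name. Real mining of the literature needs much more (document formats, synonyms, images), but the user-facing idea is no harder than this.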

Because content-mining in science has been held back by restrictive practices there are lots of valuable tools waiting to be applied. That's what we are doing. We expect progress to be rapid. Obviously we'll appreciate direct help, but we'll also appreciate general interest.

What do you want to be able to do? What FACTs do you want to extract (or for us to extract and publish)? It won't all be possible, but a huge amount will be.

And when we have tens of thousands of scientists mining the literature and making the results public there will be a huge acceleration.

 
