Industry suffers from Closed Data

I received the following unsolicited mail two days ago from a scientist in a major chemical company [I have anonymised everything so you will have to take my word].

I work [in industry] and am very interested in improving our ability
to mine information from chemical documents.  The work you, Peter
Corbett, and the rest of your group have been doing is of great
interest to us.  As you are aware, much of this information is
locked up in proprietary databases that are highly overlapping (but
none are comprehensive), and even after paying licensing fees, the
vendors make it very difficult to execute data mining and data
integration workflows.  This is especially frustrating since the
data is available to everyone in the community, but is not easily
extracted.  So, we pay fees to read the journal, then pay again to
gain limited access to the data in a searchable, structured format
(e.g. [a well known database provider] limits exports to 500 chemical
names).

PMR: I used to work in the pharma industry and nothing seems to have changed. It is clear that the anticommons effect is destroying productivity and innovation. It has been estimated that the UK is 500 million pounds worse off because it charges for its maps. The government gets money back from the Ordnance Survey, but that is much smaller than the cost to, say, local government, new media, travel, etc.

In the same way, industry is clearly suffering from the restrictive practices of information vendors. This is one of the reasons I am angry about Closed databases, even if they have a free element. It is clear to me that if Chemical Data becomes Open then we shall be better able to develop approaches to disease. I deliberately did not say "develop better drugs", because chemistry is only part of the scientific problem. It is essential to include chemistry in one's knowledge toolkit for understanding disease - humans are chemical objects. All future science-based solutions will include chemistry somewhere in their products and certainly in their means of discovery.

How can we take this forward? This is not the first company to make this point. I would encourage others in industry to come forward (if you mail me I will only post with your agreement - indeed I do this with all unsolicited private mail). I have long said that the chemical industry should raise the pre-competitive level so that common knowledge is made Open. For example, I see companies developing their own internal ontologies for science that is already in the public domain. This is a waste, since the effort could be shared, and almost counterproductive, since the ontologies will be limited and incompatible. That is why we insist on Openness - it works.

I suspect we shall have to catalyze this somehow - perhaps through a real-life meeting.

This entry was posted in data, open issues.

3 Responses to Industry suffers from Closed Data

  1. baoilleach says:

    Regarding maps...OpenStreetMap.org

  2. Pingback: » The value of information » business|bytes|genes|molecules

  3. Pingback: » The value of information » business|bytes|genes|molecules
