I don’t normally say very much in this blog about what our day jobs are; now is a useful time to do so. The Centre is sponsored by Unilever PLC – the multinational company with many brands in foods and Home and Personal Care (HPC). It came about through some far-sighted collaboration between Unilever and Cambridge to create a Centre where cutting-edge research would be done in areas which didn’t just address present needs but also looked to the future.
This is typified by Polymer Informatics – where we have an exciting vacancy. Many of Unilever’s products contain polymers – you can think of them as long wriggly molecules. They can be very hard, as in polythene, or flexible, as in silicones or additives in viscous liquids. Next time you put something on your hair, teeth, face, toilet bowl or laundry, etc., there’s a good chance it will have a polymer ingredient of some sort.
Work in my group looks forward to where the world will be in 5 or even 10 years’ time. Here’s a list of some of the technologies in the current position:
OSCAR3, natural language processing, text-mining, Atom, Eclipse/Bioclipse, SPARQL, RDF/OWL, XPath, XSLT, etc.
What’s that got to do with wriggly molecules? Everything. Science is becoming increasingly data- and knowledge-driven. In many cases the “answer is out there” if only we knew where to look – publications, patents, theses, blogs, catalogs, etc. We may not need to go back to the lab but can use reasoning techniques to extract information from the increasingly public world of information. And, as we liberate the major sources – scholarly publications, theses, patents – from their current closed practices, we shall start to discover science from the relations we find. Open scientific information has to be part of the future.
The phrase Pasteur’s Quadrant is sometimes used to describe research which is both commercially exploitable and cutting-edge scholarship. That’s a useful vision. I have certainly found in my time in the Centre that industrial problems are often very good at stimulating fundamental work. So polymer informatics has taken me to new fields in knowledge representation. Polymers, unlike crystals, are not well-defined but fuzzy – they can have variable lengths, branching, chemical groups, etc. They flop about and get tangled. This requires a new type of molecular informatics, and we have had to explore adding a sort of functional programming to CML to manage it. We now have a markup language which supports polymers, and several of its features are novel.
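To give a flavour of what "functional" means here, the following is a rough Python sketch and not our CML syntax. It treats a polymer as a template that is expanded into a concrete chain once a length is supplied, instead of as one fixed structure. The names `PolymerTemplate`, `expand` and `sample_chains` are invented for this illustration, and the fragments are simple SMILES strings; it deliberately ignores branching and tangling.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PolymerTemplate:
    """Hypothetical polymer description: end groups plus a repeat unit."""
    head: str         # SMILES fragment for one end of the chain
    repeat_unit: str  # SMILES fragment that is repeated along the chain
    tail: str         # SMILES fragment for the other end of the chain

    def expand(self, n: int) -> str:
        """Return a SMILES string for one concrete chain with n repeat units."""
        if n < 1:
            raise ValueError("n must be at least 1")
        return self.head + self.repeat_unit * n + self.tail


def sample_chains(template, min_n, max_n, count, seed=0):
    """Draw `count` chains with lengths chosen uniformly in [min_n, max_n].

    A uniform distribution is an assumption made for simplicity; real
    samples follow other length distributions.
    """
    rng = random.Random(seed)
    return [template.expand(rng.randint(min_n, max_n)) for _ in range(count)]


# Poly(ethylene glycol), HO-(CH2CH2O)n-H, written as head "O" plus n x "CCO".
peg = PolymerTemplate(head="O", repeat_unit="CCO", tail="")
```

Calling `peg.expand(2)` gives `OCCOCCO` (diethylene glycol), while `sample_chains(peg, 5, 10, 3)` gives three chains of differing lengths from the same description, which is the "fuzzy" aspect that a fixed-structure representation cannot express.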
And, as the world develops, the information component in products continues to increase. So we know we are going in an exciting direction.