Peter Corbett and the OSCAR3 award

Today was both a sad and a happy occasion, as we said goodbye to Peter Corbett. Peter has been the chemistry lead in the SciBorg project and has made major contributions to understanding chemical documents and chemical language. He has developed the OSCAR3 program, which many people (citation needed…) regard as the leading tool for chemical entity identification and extraction. In simpler terms, OSCAR3 can analyse a document (as long as it’s not some awful bitmap or grunged PDF) and identify the chemical words and phrases. Peter has also written on the linguistic science of this: it’s fairly easy to identify the word “pyridine”, but this isn’t enough. Peter identifies at least 3 uses of the term: the bulk substance (a bottle of pyridine), a part of a molecule (pyridine rings are aromatic) and a molecule itself (the pyridine molecule has C2v symmetry). He’s written at length about this in his latest blog post.

OSCAR wasn’t the primary scientific goal of the SciBorg project, but Peter found time to develop a major tool. This is now being refactored by the OMII group so that it can be run standalone, as a service, as a component in a pipeline, as a chemistry checker in a word processor, etc. So it was natural to honour this as Peter leaves us.

So here’s Peter’s OSCAR. It’s labelled:

Peter Corbett


Unilever Centre, 2005-2009

Peter is taking up a position at Linguamatics, a Cambridge-based company with activities in text-mining and other things. I am always proud when people leave us with positive motivation, and it’s important for the future of the UK that this type of work flourishes because it will generate a lot of wealth in the coming decades. (And the UK could do with some wealth.)

Peter loved linguistic challenges, especially those involving ambiguity (“time flies like an arrow” can be parsed as “fruit flies like a banana”). Another is the conjunction of (nounal) adjectives (“pretty little girls school”). So I described him as:

A pretty large Unilever Centre language processor domain expert

which will keep most tree-banks busy for a bit.


