ACS presentation Part I

Edward Tufte said in his recent book that one shouldn’t use Powerpoint to present information, but Word. Although I am not a fan of Word (see later posts) I agree with the message. So this is the first part of my talk to the American Chemical Society. Don’t worry – there’s not a lot of chemistry.
The title is someting like eChemistry (which would be nice if it actually existsed – we are trying to create it). The abstract is irrelevant as it was written 3 months ago and the world has changed so much the abstract is either out of date or so general it doesn’t matter.
First thanks. When you are likely to run into time problems, thank people at the start. Here are some (please let me know If I have missed anyone – it’s easy to do). Almost everyone on this list has hacked something
Cyberheroes (Mainly Blue Obelisk)
Bob Hanson (Jmol)
Christoph Steinbeck(Cologne)
Egon Willighagen(Cologne)
Tobias Helmut(Cologne)
Stefan Kuhn(Cologne)
Ola Sputh(Uppsala)
Eklund, Martin (Uppsala)
Miguel Howard (Jmol)
Joerg Wegner (Tuebingen/ALTOVA)
Rich Apodaca (Stanford)
Rajarshi Guha
Geoffrey Hutchison (Cornell)
Indiana:
Gary Wiggins
David Wild
Geoff Fox
Marlon Pierce
Symbiote: Henry Rzepa
Cambridge:
Ann Copestake and colleagues
Peter Corbett
Nick Day
Jim Downing
Justin Davies
Richard Moore
Joe Townsend
Alan Tonge
Andrew Walkingshaw
Andrew Walker
Toby White
Sponsors:
DTI, Accelrys, IBM
Unilever
Royal Soc Chemistry, Int Union of Crystallography, Nature Publishing Group
JISC
EPSRC
BBSRC
The current workflow in chemical informatics is broken. A typical scenario is:
nonxmlchain1.png
nonxmlchain.pngnonxmlchain.png
nonxmlchain.png
Here we see legacy programs, human activities and legacy data. At each stage a human has to cut and paste stuff, edit it, etc. This causes loss of time, loss of quality and loss of temper. Wouldn’t it be easier if everything was in a consistent interoperable format like this?
xmlchain.pngxmlchain.png
Here all the data is in XML with semantic markup. human input goes seamlessly into programs, databases, etc. The outputs pass between programs display, etc. with no semantic loss and no friction. XML ontologies add meaning to all information components. The basic components now exist in enough cases that we can build mashed up systems.
I’m going to demonstrate some mashups. Some demos use the Internet, some have been locally crafted. Obviously we can’t demonstrate the 6 month project where we ran 1 million jobs with all interfaces in XML. Here are some of the components:
programs: MOPAC, GAMESS-US, DL_POLY, GULP, SIESTA, CASTEP, METADISE, etc.
editors: Jchempaint, etc.
renderers: Jchempaint, Jmol, JSpecView
Rich Client: Bioclipse
Services: CMLRSS, InChI. OpenBabel
Repository: SPECTRa/DSPACE (CMLCryst, CMLSpect, CMLComp)
Toolkits: CDK, JUMBO, JOELib, CIF2CML, OSCAR3, OSCARDATA
Demonstrations:
Semantic Markup (MACiE)
Simple Mashup: Placeopedia
Chemical Mashup: GoogleInchi. InChI API + Google search API
Semantic data and linking (clickable graphs and tables in CML). Jmol display
Journal-eating robots: OSCAR-DATA (chemical data)
OSCAR3 (chemical text and names) – mashup with PubChem
Reposition of data (SPECTRa) in institutional repositories
CMLRSS: molecular feeds (on Acta Crystallographica)
Rich client: Bioclipse
I shall try to get through all of these in 21.5 minutes – if the connections are slower I may have to omit some. At the end it should be clear that there is enough technology from the Open Source community to take chemistry into the 21st Century.
The next post will cover descriptions and predictions…
——————————————-
Some of these have static URLs and can be viewed relatively easily and robustly
http://www.placeopedia.com
http://wwmm.ch.cam.ac.uk/cryst/summary/acta/e/2006/07-00/ (static repository of Acta Crystallographica CIFs)
http://www-mitchell.ch.cam.ac.uk/macie/ (MACiE) (100 Entries | M0001 | animate reaction – needs IE)
http://wwmm-svc.ch.cam.ac.uk/wwmm/html/googleinchiserver.html GoogleInChI

This entry was posted in chemistry, open issues, XML. Bookmark the permalink.

4 Responses to ACS presentation Part I

  1. Peter,
    to be very correct, I am now PostDoc in Mechelen, Belgium. I am working for a company called Tibotec BVBA, which is part of Johnson&Johnson. Research area is virology, especially HIV and HCV, and related areas;-)
    Joerg

  2. pm286 says:

    Thanks Joerg – I was aware you had moved but wasn’t sure where. I think that’s true for one of two of the others – not everyone has official addresses. They are welcome to post comments.
    FWIW the Blue Obelisk is an excellent opportunity for people in companies to help develop the communal effort. Some companies will find it easier for efforts of indivduals to be seen as contributing to Open source and data. (Others won’t!)

  3. Peter,
    sorry, I just realized this by reading your entry again, but ALTANA Pharma has sponsored my position at university for several years. I do not know if they like to be mentioned, or not? Anyway, I really appreciated their support over those years, and if wanted or not … but the have also sponsored to BO movement … honor to whom is honor due!
    Joerg

  4. This came like a real invention to me, I mean Tufle’s statement about PowerPoint. I sometimes notice that PowerPoint does not always attain its primary goal of enhancing the comprehension of information. But I think thta it is most due to the abumdant use of graphics and flash effects. If to use and create a good high quality presentation that highlights the main points of your talk, I guiess it is still more powerful that Word document in conveying your idea.

Leave a Reply

Your email address will not be published. Required fields are marked *