I missed this in the chemical blogosphere and was alerted by Peter Suber: Facilitating the exchange of chemical data
18:14 21/05/2008, Peter Suber,
PubChem has released the beta of its PUG SOAP (Power User Gateway Simple Object Access Protocol). From the site:
PUG SOAP is a web services access layer to PubChem functionality. It is based on a WSDL [Web Services Description Language]….
PubChem’s PUG (Power User Gateway), documented elsewhere, is an XML-based interface suitable for low-level programmatic access to PubChem services, wherein data is exchanged through a relatively complex XML schema that is powerful but requires some expertise to use. PUG SOAP contains much of the same functionality, but broken down into simpler functions defined in a WSDL, using the SOAP protocol for information exchange….
PMR: This is excellent news. The chemical information web is riddled with human-oriented GUIs, closed interfaces, hidden data with little exposure of content or architecture. This in itself prevents re-use, although most of the sites are anyway not Open.
The bioinformatics community, by contrast, thrives on Open web services. These have Open Data (re-usable) and Open Architecture. It’s epitomised by the national and international Centres such as NCBI, PDB and EBI and many others. There are literally thousands of Web Services (WS).
This is exemplified in the Open Source Taverna Workflow tool from myGrid (UK, eScience). Taverna has a huge list of bio-Webservices, including access to SOAPLab. The architecture is based on XML. Unfortunately we couldn’t use this approach in chemistry because there aren’t any Web Services. (We and Indiana have provided a few, but nothing compared with bio-).
Although I’m not personally a fan of SOAP (we develop everything using a RESTful approach), it’s an acceptable architecture. Along with Web Services go RDF and RSS – these are so much more use than what most web sites provide.
But I’m not surprised, and I’m very pleased. I’ve been saying for some time that chemical informatics is stalled – and the RSC meeting did nothing to change my view. Chemists aren’t interested in information. Chemical informaticists aren’t interested in C21.
So the biosciences – as I’ve predicted – are tooling up to do chemistry properly. Perhaps only those bits of bio-interest, perhaps everything. Who’s doing chemical ontologies? The bioscientists. Who’s doing chemical Web Services? The bioscientists. Who’s doing chemical text-mining? The bioscientists. Who’s doing chemical datamining? Let’s see.
And what are the chemists doing?
We’ll certainly be using PUG SOAP, and perhaps we can work towards a PUG-REST?
Regarding PUG-REST, you might be interested in BioPython.EUtils. However, some aspects of interacting efficiently with PubMed appear to require the server to retain state, so REST mightn’t be possible for all queries (though it certainly is fine for 90%).
(1) Many thanks. Of course you are right that some services require state. In our efforts we try to minimize state, and I am sure Jim has got some cunning ideas to do this.
As with all things RESTful, if something doesn’t fit at first blush, start to think of it as a resource. The first step is to come up with some representation for the state being held, then you can decide whether to store it on the server as a resource, or to require the client to pass it back and forth as the state is accumulated.
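To make the two options concrete, here is a minimal Python sketch of a paged search. Everything in it is hypothetical and only for illustration: the `SearchState` fields, the `/searches/{id}` URI pattern, and the fake `RESULTS` list are assumptions, not anything PubChem or EUtils actually exposes. The first option stores the state on the server as an addressable resource; the second has the client carry a serialized copy of the state and send it back with each request.

```python
import json
import uuid
from dataclasses import asdict, dataclass


@dataclass
class SearchState:
    """Hypothetical representation of the state a multi-step query accumulates."""
    query: str
    offset: int = 0
    page_size: int = 2


# Stand-in for a result set; a real service would query a database.
RESULTS = ["CID1", "CID2", "CID3", "CID4", "CID5"]


def page(state):
    """Return the slice of results the given state currently points at."""
    return RESULTS[state.offset:state.offset + state.page_size]


class ServerStoredState:
    """Option 1: the state is itself a resource, addressed by a URI."""

    def __init__(self):
        self._store = {}

    def create(self, query):
        # Roughly what a POST to /searches would do: create the resource
        # and hand back its (hypothetical) URI.
        search_id = uuid.uuid4().hex
        self._store[search_id] = SearchState(query)
        return f"/searches/{search_id}"

    def next_page(self, uri):
        # Look the state up by the id at the end of the URI, then advance it.
        state = self._store[uri.rsplit("/", 1)[1]]
        hits = page(state)
        state.offset += state.page_size
        return hits


def client_held_next_page(token):
    """Option 2: the client passes the full state back on every request."""
    state = SearchState(**json.loads(token))
    hits = page(state)
    state.offset += state.page_size
    # The updated state goes back to the client; the server keeps nothing.
    return hits, json.dumps(asdict(state))
```

With the first option the client only holds a URI (`uri = ServerStoredState().create("aspirin")`) and the server must keep and eventually expire the stored state. With the second, the server stays stateless but every request carries the token, which in a real service would probably need signing so that clients cannot tamper with it.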
Didn’t PubChem just add a SOAP layer to their REST approach PUG? Although, I am sure there are REST-pure-ists that disagree… 🙂