I’m in the Mecca café on Queen Anne in Seattle and in heaven. I’ve been to Seattle quite often – mainly to visit Microsoft in Redmond – and we’ve stayed at the Mediterranean Inn on Queen Anne; one block away is the Mecca. It’s been doing real American breakfasts for 80 years and it’s a refreshing change from MacBurger and the rest. I ordered 3 blueberry pancakes – and the server wisely counselled me to have 2 (and even that is more than an average human should eat…). Free wifi of course.

I’m winding down from an intense week of time-critical demos and talks at the ACS and elsewhere. These are communal projects so lots of people deserve lots of credit: our group (Sam, Joe, Lezan, David, Nick, Daniel), the OKF (Mark, Rufus, Ben, William, Daniel, Alfredo, Mathias), Quixote and the Blue Obelisk (Marcus, Pablo, Jens, Sebastian, Henry – these are only the most involved in last week). I’ll try to blog these later in detail, but they include:

  • ChemicalTagger. We can now technically read the chemical literature by machine and extract data. But the publishers are actively stopping us.
  • Open Data. The concept is now clear. Two typical concerns: The ACS copyright data (sic). They didn’t create it, they didn’t edit it, I suspect they didn’t even read it. But they stamp it as theirs. So we’re moving to the situation where we cannot challenge scientific results for fear of being sued. Does no-one else get angry? And there is a cosy cartel where Elsevier, Wiley and Springer feed raw data to the Cambridge Crystallographic Data Centre who then control its active redissemination. Why? Not for any scientific reason but to perpetuate the CCDC’s business model. Result: half the world’s published crystallography is unavailable. (MEMO: I think I will write to the CCDC board)
  • Lensfield/Quixote. A tremendous push from everyone. Really tremendous. We had to put in place:
    • Parsers and other converters to XML CML. The technology works. It’s simple and could be used in many other areas of physical science. Anyone can develop a parser as long as they understand what the program is actually doing!
    • Conventions. Essentially validatable community-driven agreed practice. Think validatable microformats. What XML should have been before XSD ruined community semantics.
    • Dictionaries. A formal description of what the input and output to codes are. An OWL-free zone, that normal people can understand.
    • Repositories/Chempound. We now have a working chemistry repository that anyone can POST to or SWORD to. It is indexed through RDF and aggregated through OREChem. This could and should become the de facto approach to managing chemical information in the modern world. It works at a lab level and at an “enterprise” (argh) level and also out on the Open Web Of Linked Data. Which is where most of our data should end up.
  • Open Theses. And last night (for me) we ran the first Open Theses workshop. In Vilnius. It was 0300 for me and the skype was bad. But we created a sense of community. And some initial metadata for theses. I hope to get all Murray-Rust theses into this – I think I have 4 so far. There is no reason why the world should not have Open metadata for Open Theses.
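
To make the parser idea under Lensfield/Quixote concrete, here is a rough Python sketch of turning one line of program output into CML-style XML. It is only an illustration and not the Quixote code: the log line format, the `demo:` prefix on `dictRef` and `units`, and the function name `log_to_cml` are all made up for the example. Only the CML namespace and the `module`/`property`/`scalar` element names are taken from CML itself.

```python
import re
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"

# Hypothetical log format, assumed for this sketch; real program output is messier.
ENERGY_LINE = re.compile(r"Total energy\s*=\s*(-?\d+\.\d+)\s*(\w+)")


def log_to_cml(log_text: str) -> str:
    """Wrap every matched energy line in a CML <property>/<scalar> pair."""
    ET.register_namespace("", CML_NS)  # serialise CML as the default namespace
    module = ET.Element(f"{{{CML_NS}}}module")
    for match in ENERGY_LINE.finditer(log_text):
        value, units = match.groups()
        # "demo:" is a placeholder prefix; a real dictionary would define these terms.
        prop = ET.SubElement(
            module, f"{{{CML_NS}}}property", {"dictRef": "demo:totalEnergy"}
        )
        scalar = ET.SubElement(
            prop,
            f"{{{CML_NS}}}scalar",
            {"dataType": "xsd:double", "units": f"demo:{units}"},
        )
        scalar.text = value
    return ET.tostring(module, encoding="unicode")


if __name__ == "__main__":
    print(log_to_cml("SCF done\nTotal energy = -76.0267 hartree\n"))
```

The point of the `dictRef` attribute is that the meaning of each number lives in a dictionary entry rather than in the parser, which is why the parser author has to understand what the program is actually reporting.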

So my next self-imposed deadline is demoing Chempound/ORE at PNNL next week… It has to work and it will work.




    Probably too late, but pop over to the Genki Sushi on Mercer Street. A complete cleanse after the rowdiness and energy of Mecca.

