Crystal26 – what I said – the Crystallographic Semantic Web

As usual I didn’t know in detail what I would say at Crystal26 – it depends on who is present, what has just been said, and how grateful I am to the organizers (10/10). I have an overview page (in HTML) and a menu of a few hundred topics with 10-20 “slides” each. In particular I download chunks of HTML from the web rather than try to emasculate them with PowerPoint. This makes it difficult to distribute a “talk” and so I try to blog the major points.

Overview of presentation:

  • The Semantic Web is here and ICT companies are investing heavily
  • Vision of universal access to knowledge for both humans and machines
  • Belief in emergent human/machine phenomena
  • SW already well developed in bioscience
  • crystallography very well placed in physical science

I do strongly believe in the nascent Semantic Web leading to a new phase of knowledge development and sharing. Some obvious areas will be the continued development of natural language tools and – as others think – a new generation of knowledgebases – beyond Google. Current contenders include Wolfram Alpha and True_Knowledge. Little is known in practice of either – I would guess that WA would have major applicability to physical science, while TK seems to be closer to a Semantic Web approach, but with fuzzy algorithms rather than the formalism of RDF/OWL. We shall see. But at present I’m guessing it’s still worth trying to create semantic documents with controlled ontologies.

Semantic web:

  • linked Open Data
  • (global) reasoning engine
  • social networks

I showed my Tweetdeck as an example of social networks – I have become converted to this as a valuable way of throwing medium-valuable contributions or requests to a like-minded community. For example I tweeted my request for support material and it was picked up by various members of the direct and informal groups. However I concentrated on Linked Open Data (including the Open) but as always didn’t cover it all – I present material until the time cutoff and then stop. (Whereas a PowerPoint requires you to flip over lots of slides to get to the end.)

Expectations:

  • global access to data in standardised or interconvertible form
  • “natural language” questions linking data
  • “giant global brain”

Requirements:

  • Open data
  • Agreed semantics
  • identifier system
  • agreed ontologies
  • ontology mapping
  • authoring tools
  • searchable repositories
  • validation systems

Crystallography does well implicitly in these areas but this needs formalising. Data is often not aggressively Open, semantics are not in modern formats, and ontologies are implied at best (through CIF dictionaries). Data creation tools are patchy – the good news is that manufacturers are generally on board, the bad is that there is no semantic editing or authoring. Repositories do not exist except as highly managed, expensive data banks. This has to and will change.

Challenges

  • getting semantic data
  • publishers’ attitudes (not IUCR/Acta)
  • creation of ontologies

Tractable approaches

  • Open semantic authoring tools
  • searchable Open Data (RDF) repositories
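
To make "searchable Open Data (RDF)" a little more concrete, here is a minimal sketch of how simple CIF data items might be turned into RDF triples and then filtered by property. It is an illustration only, not how CrystalEye or any of the demos below work: the two namespace URIs and the function names `cif_items`, `to_ntriples` and `find` are made up for this example, and the parser handles only one-line `_tag value` items (no loops, no multi-line text fields).

```python
# Hypothetical namespaces: these are placeholders for illustration,
# not official IUCr or CrystalEye identifiers.
CIF_NS = "http://example.org/cif/"
ENTRY_NS = "http://example.org/entry/"


def cif_items(text):
    """Yield (tag, value) pairs from one-line CIF data items.

    Loops and multi-line text fields are skipped in this sketch.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("_"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Drop the leading underscore and any surrounding quotes.
            yield parts[0][1:], parts[1].strip("'\"")


def escape(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(entry_id, text):
    """Return one N-Triples line per CIF data item, all about one entry."""
    subject = f"<{ENTRY_NS}{entry_id}>"
    return [
        f'{subject} <{CIF_NS}{tag}> "{escape(value)}" .'
        for tag, value in cif_items(text)
    ]


def find(triples, tag):
    """Return the triples whose predicate corresponds to the given CIF tag."""
    predicate = f"<{CIF_NS}{tag}>"
    return [t for t in triples if t.split(" ", 2)[1] == predicate]


SAMPLE = """
data_example
_cell_length_a 5.431
_cell_length_b 5.431
_symmetry_space_group_name_H-M 'F d -3 m'
"""

triples = to_ntriples("example", SAMPLE)
for triple in triples:
    print(triple)
```

Calling `find(triples, "cell_length_a")` then returns just the triple for that cell parameter. A real repository would use agreed identifiers and a proper triple store rather than string matching, which is exactly where the requirements listed above (identifier system, agreed ontologies) come in.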

So then some demos. I showed Andrew Walkinshaw’s geo-crystal mashup, CrystalEye, CIF-CML-RDF-OWL in Protege, semantic authoring in Chem4Word and finally a movie of Lensfield molecular repository (Nico, Joe, Jim have made great short movies of what the system looks like).

And, as always, we are keen to collaborate in these areas. My great thanks to Peter Turner for his invite and support.


One Response to Crystal26 – what I said – the Crystallographic Semantic Web

  1. Glad to hear you’re still getting some value out of that visualization! Sounds like a great conference.
    By the way, the physicists doing genuinely ab initio DFT (using functionals parametrized from free electron gas models like rPBE) have generally regarded it as at best semi-quantitative; generally structurally insightful but error-prone. There’s a lot of research into post-DFT methods (quantum Monte Carlo, GW methods) and alternative formalizations (LAPW and the like), particularly for higher derivatives, dispersive interactions, and accurate extended band structure; LDA and GGA functionals get the band gap wrong in opposite directions.
    DFT, regardless of the functional, for example, gets the dynamics of water totally wrong (my former colleague, Marivi Fernandez-Serra, has published on that in PRL).
    Interesting to see that different communities of theorists have such differing opinions on the same technique!
