Tag Archives: talis

Semantic Chemical Computing

Several threads come together to confirm we are seeing a change in the external face of scientific computing. Not what goes on inside a program, but what can be seen from the outside. Within simple limits what goes on inside need not affect what is visible. The natural way now for a program to interface with other programs and with humans is to use a mixture of XML and RDF. XML provides a vocabulary and a simple grammar; RDF provides the logic of the data and application.

The COST D37 group has just met in Berlin (I blogged the last meeting - COST D37 Meeting in Rome). COST is about interoperability in Comp Chem, and it's proceeding by collaborative work to fit XML/CML into FORTRAN programs - at present Dalton and Vamp. We do this by exchange visits paid for by COST, so we are looking forward to having visitors in Cambridge shortly.

It coincided roughly with Toby White's session at NeSC in Edinburgh on how to fit XML/CML into FORTRAN using his FoX library. I look forward to hearing how he got on.

And then, on Friday, we had a group meeting including outside visitors where the theme was RDF. I was very impressed by what the various members of the group had got up to - five or six mini-presentations covering molecular repositories, chemical synthesis, polymers, ontologies, natural language and term extraction. Andrew Walkingshaw showed the power of Golem, which combines XPath with RDF to make a very powerful search tool. We are grateful to Talis for making their RDF engine available, and when I have some hard URLs I'll blog how this works.
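To give a flavour of what combining XPath with RDF can look like, here is a minimal sketch. It is not Golem's code or API: the function names, the `http://example.org/chem/` namespace, the `ex:` dictionary references and the energy values are all made up for illustration. It uses only the Python standard library, pulls values out of a small CML-like document with XPath expressions, and holds them as subject-predicate-object triples that can then be searched by pattern.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
EX = "http://example.org/chem/"  # hypothetical namespace, for illustration only

# Illustrative document; the molecules and energies are placeholder data.
DOC = """<cml xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1" title="water">
    <property dictRef="ex:totalEnergy">
      <scalar units="ex:hartree">-76.02</scalar>
    </property>
  </molecule>
  <molecule id="m2" title="methane">
    <property dictRef="ex:totalEnergy">
      <scalar units="ex:hartree">-40.20</scalar>
    </property>
  </molecule>
</cml>"""


def extract_triples(xml_text):
    """Use XPath to locate values in the XML and emit them as triples."""
    ns = {"cml": CML_NS}
    root = ET.fromstring(xml_text)
    triples = []
    for mol in root.findall(".//cml:molecule", ns):
        subject = EX + mol.get("id")
        triples.append((subject, EX + "title", mol.get("title")))
        # XPath predicate selects the property by its dictionary reference.
        scalar = mol.find(
            "cml:property[@dictRef='ex:totalEnergy']/cml:scalar", ns
        )
        if scalar is not None:
            triples.append((subject, EX + "totalEnergy", float(scalar.text)))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    store = extract_triples(DOC)
    for triple in match(store, p=EX + "totalEnergy"):
        print(triple)
```

A real system would load the triples into an RDF engine and query them with SPARQL; the list of tuples here only stands in for that store.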

The main message is that the new technologies work. Certainly well enough to support collections on the order of 100,000 objects with many triples (Andrew had ca. 10 megatriples). We are also making great progress in extracting chemistry out of free text (PDF is still awful, so please let's have Word, or even better XHTML and XML). Or LaTeX. But in any case most of the toolset is now well prototyped. More later...

Open Data: I want my data back!

Although I am mainly concerned with campaigning for data associated with scholarly publishing to be Open, the term Open Data has also been used in conjunction with personal data "given" or "lent" to third parties (see Open Data - Wikipedia, which contains Jon Bosak's quote "I want my data back"). Here is a good example, from Paul Miller of Talis, of the problems of getting one's personal data (and possibly other people's) back: Scoble, Facebook, Plaxo, open data; time for change?. Excerpts (read the whole post for the details):


I am of course talking, like so many others, about Robert Scoble being barred from Facebook for using an as-yet unlaunched capability of Plaxo that clearly and unambiguously breached Facebook's Terms and Conditions.

It all began with a 'tweet' from Robert Scoble, about the time that post-holiday blues kicked in for those returning to work this (UK) morning;

“Oh, oh, Facebook blocked my account because I was hitting it with a script. Naughty, naughty Scoble!”

Twitter exploded, closely followed by large chunks of the blogosphere. ...

Minutiae aside, the whole affair raises a couple of points pertinent to one of the biggest issues for 2008; ownership, portability and openness of data.

  • I want to be able to take my data from a service such as Facebook, and use it somewhere else. That's what Marc Canter has been arguing forever, along with the AttentionTrust, OpenSocial (to a degree), DataPortability.org and many more. That's part of the rationale behind all the work we've been doing on the Open Data Commons, too. However, whether I want to or not, doing it the way Scoble did is a breach of the terms and conditions of Facebook; terms and conditions to which I - and he - signed up when we chose to use the site. If you don't like the terms, don't use the service. It's as simple as that;
  • Even were I allowed to export 'my' data, there's a fuzzy line between that which is mine and that which isn't. The fact that I am a Facebook friend with Nova Spivack certainly should be mine to take wherever I choose. The contact details Nova chooses to surface to me as part of that relationship, however? Are they mine to take with me, or his to control where I can surface them? There's clearly work to do there, although it's interesting that 'even' people such as Tara Hunt are reacting (also on Twitter, of course) with;

“I'm appalled that someone can take my info 2 other networks w/o my permission. Rights belong 2 friends, too.”

PMR: I have no additional comments on this other than to say it's going to take hard work, forethought to anticipate problems of this sort, and probably a lot of legal work. Kudos to Paul and Talis and their collaborators for helping in these general areas.


In science it's easy. Our data are ours. They don't belong to Wiley, ACS, Elsevier, Springer. I've just finished a paper on this which you should all see shortly.


We want our data back.


And in future we want to make sure we don't give away our rights to them. Is that a simple message for 2008?


