Archive for September, 2007
- Things that make me scream: RDF “QNames” followed by RDFa, sure, but now? Jeni shows that it’s not me that’s a mess, it’s RDF. If I’d read that 2 weeks ago I’d have saved 2 days. Danny Ayers has required semantic web reading each week.
- Linked Data on Wikipedia
- The Semantic Naturalist – “musings on natural history, geography, and the semantic web”
- UMBC Semantic Web research mentioned in the NYT
- Yahoo! News on Semantic Report
- Podcast: Rohit Khare on syndication-oriented architecture; see also SynOA What?, “piles of junk”
- GRDDL to W3 Recommendation
- GRDDL – main specification doc
- GRDDL Test Cases – demonstrate the expected behavior of a GRDDL-aware agent
- GRDDL Primer – introductory material
- GRDDL Use Cases – example applications of the technology
- Simplifying RDFa Notation
- Turtle @base
- “The World is now closed” – danbri is no longer a teenager
- Symbol languages and the Semantic Web
- Deploying Linked Data (PDF)
- Understanding SWRL (Part 3)
- An initial sketch of a KR for Semantic Web agents (as Logic Programs)
- Proving a URI is a document
- Soccer schedules, flight itineraries, timezones, and python web frameworks
- The three kinds of platforms you meet on the Internet – Google to join the club?
- Salesforce.com’s Platform as a Service
- Neno/Phat Architecture, plus presentation (PDF)
- Lingvoj and hubjects also Gopher URIs for FOAF – 303s, strangeness and charm…
- Sweet Tools (Version 10) – 577 Semantic Web (and related) tools Exhibited, intro
- Semantic Eco-blogging: Spotter 1.0 Released
- Sioku – Jaiku to rdf converter
- Simple Widget Markup Language (SWM) – “is a framework to create HTML pages with rich graphic elements as if they simply were extra HTML elements. The additional components are wrapped versions of the GWT [Google Widget Toolkit] widgets or can be created using GWT or SWM itself.”. See also: RDF2Go.
- Facebook FOAF generator – see also Querying Facebook in SPARQL
- Tying FOAF identity with the identity semantics of OpenID
- Y!Mash, SquidWho
- RDFa Distiller
- Search SUMO via WordNet
- mod_atom gets a HTML template language
- Opera 9.5 alpha
- Triple-I 2007 Wrapup, Web 2.0 meets Web 3.0
- Introducing Quaere – language integrated queries aka LINQ for Java
- Basing the Design of History on the User’s Memory, see also Trailblazer, #swhack discussion
- Places – “is designed to be a complete replacement for the Firefox bookmarks and history systems.”
- Jottit – “makes getting a website as easy as filling out a textbox.”
- The Database Column
- freedb data dump
- MapReduce in a Week
- Command prompt as an IM session with my computer?
- We have lost control of the apparatus
In the Media
Quote of the Week
Everything is a platform! - dull-looking character in Dilbert
PMR: This has come just in time. I have offered to give a talk in 2 weeks’ time to the Cambridge Corporate Gateway Event on “the semantic web”. The audience is concerned (I think) with how new technologies emerge and may be exploited. So I need to find some impressive Semweb stuff. I’ve now got Danny’s stuff to read – should keep me busy. And Paul is coming to talk to us in the Centre. Unfortunately it’s the next day, so I can’t re-use his material. Help!
See also: tagged but forgotten…
Sources include Planet RDF, various other blogs, Semantic Web Interest Group IRC Chatlogs & Scratchpad, ESW Wiki, SemWebCentral, Sweet Tools, W3C Semantic Web Activity, mailing lists, personal emails etc etc. If you see anything suitable this coming week, please mail me or use the del.icio.us tags “semweb weekly” – thanks!
- (taken from Wikipedia on Euler) was beautiful, while
- $e^x = \sum_{n=0}^{\infty} x^n / n!$ was ugly.
PMR: First of all, many thanks to Talis for funding legal work on Open Data. Whatever else, we have to remain within the legal framework or we court disaster at a later stage. There will not be a single approach to this, any more than there is a single Open Source licence. Motivations vary and, even more importantly, data is more varied than software. I know of two other efforts: Science Commons (in Cambridge US), springing from CC, and the Open Knowledge Foundation, set up by the tireless Rufus Pollock (in Cambridge UK), who invited me to be on the board. We honour this by using the OKFN “Open Data” on our own CrystalEye. I expect that people will choose different licences to emphasize different policies. (For example, I currently use Artistic as my software licence as I don’t want the name JUMBO to be misused for derivative works which are not compliant. I might well use BSD elsewhere, and so on.) As Paul says, please converse.
18:11 24/09/2007, Nodalities
In the world of creative works, notions espoused by Lawrence Lessig and others over a number of years are becoming increasingly well understood. A Creative Commons license, for example, is recognised as giving the holder of rights an ability to prospectively grant certain permissions rather than limit use of their work by expecting all comers to request these permissions, again and again. Those rights are not cast aside, removing all opportunities to protect your work, your name, or your potential revenue stream. Rather, you are provided with a means to explicitly declare that your work may be used and reused by others in certain ways without their needing to request permission. Any other use is not forbidden; those uses must simply be negotiated in the ‘normal’ way… a normal way that also applied to those uses covered by Creative Commons licenses before the advent of those licenses. Creative Commons licenses are an extension of copyright law, as enshrined in the legal frameworks of various jurisdictions internationally.
As such, it doesn’t really work terribly well for a lot of (scientific, business, whatever) data… but the absence of anything better has led people to try slapping Creative Commons licenses of various types on data that they wish to share. It will be interesting to see what happens, the first time one of those licenses needs to be upheld via a court! At Talis, we have an interest in seeing large bodies of structured data available for use. Through the Talis Platform, we offer one means whereby such data may be stored, used, aggregated and mined, although we clearly recognise that similar data may very well also be required in similar contexts. Recognising that contributors of such data need to be reassured as to the uses to which we – and others – may put their hard work, we spent some time a couple of years ago drafting something then called the Talis Community Licence. This draft licence is based upon protections enshrined in European Law, and has been used ‘in anger’ for a while to cover contributions of millions of records to one particular application on the Talis Platform. There has been plenty of talk around ‘open data‘ here on Nodalities, and on our sister blog Panlibus. See, for example, this recent post from Rob Styles. There were also fascinating discussions at the WWW2007 conference earlier this year. Despite interest in open (or ‘linked‘) data, licenses to provide protection (and, of course, to explicitly encourage reuse) are few and far between. Amongst zealous early adopters, there does seem to be a tendency to either (mis)use a Creative Commons license, to say nothing whatsoever, or to cast their data into the public domain. None of these strategies are fit for application to business-critical data. Building upon our original work on the TCL, we recently provided funding to lawyers Jordan Hatcher and Charlotte Waelde. 
They were tasked with validating the principles behind the license, developing an effective expression of those principles that could be applied beyond the database-aware shores of Europe, and working with us to identify a suitable home in which this new licence could be hosted, nurtured, and carried forward for the benefit of stakeholders far outside Talis. Today, Jordan posted the latest draft of this license (now going by the name ‘Open Data Commons‘), some rationale, and pointers to various ways in which he – and we – are seeking input and further validation. As my colleague Rob (again!) has argued, curators of data need an option on the permissions continuum between free-for-all and locked down. The Open Data Commons, née Talis Community Licence, offers that option. Take a look. Think about how you would use it. Consider what sort of administrative framework you would want behind such a license. Join the conversation.
For example, the most important structural feature of Diazonamide is that it’s a nonribosomal peptide, which is denoted by the suffix “amide”.
PMR: It might have started as a peptide but I don’t think many people would now call it that (unless there is another Diazonamide that I don’t know of). So on to the latest synthesis (Magnus, Cheung, Goldberg, Russell, Turnbull and Lynch. JACS, 2007, ASAP. DOI: 10.1021/ja0744448.), remembering I can’t read the full text. The abstract is a superb illustration of hanging links (NullPointerExceptions in Java):
Abstract: During the course of studies on the synthesis of diazonamide A 1, an unusual O-aryl into C-aryl rearrangement was discovered that allows partial control of the absolute stereochemistry of the C-10 quaternary stereogenic center. Treatment of 30 with TBAF/THF gave the O-tyrosine ethers 31 and 32 (1:1), which on heating each separately in chloroform at reflux rearranged to 33 and 34 in ratios of 84:16 and 56:44, respectively. This corresponds to a 70% yield of the correct C-10 stereoisomer 33 and a 30% yield of the wrong C-10 stereoisomer 34. Attempts to convert 34 into 33 by ipso-protonation and equilibration were unsuccessful. Confirmation of the stereochemical outcome of the rearrangement was obtained by converting 33 into 37, an advanced intermediate in the first synthesis of diazonamide A by Nicolaou et al. It was also found that the success of the above rearrangement is sensitive to the protecting group on both the tryptophan nitrogen atom and the tyrosine nitrogen atom.
PMR: What a splendid piece of non-communication! [My comments could apply to many publishers, not just ACS]. Without the full text (which, after considerable perusal, will tell us what 1, 30, 31, 32, 33, 34 and 37 are) it’s almost meaningless. I am reminded of Alice’s comment on Jabberwocky:
“Somehow it seems to fill my head with ideas – only I don’t exactly know what they are! However, SOMEBODY killed SOMETHING: that’s clear, at any rate”
PMR: and the authors made something from something else… So off to Pubchem. Many compounds made by synthetic chemists are not in Pubchem because they are of no interest, but Diazonamide is. It has a structural diagram and an InChI:
InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6 -10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24 -12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-1 6,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27-,30-,39-,4 0u/m0/s1/f/h44-45H
The problem is that this is not pretty for blogs as it runs over the line ends and spaces are a problem. So IUPAC are working out new approaches and some of these are discussed by the Blue Obelisk. There is also a SMILES:
CC(C)C1C2=NC3=C(O2)C45C(NC6=C(C=CC=C64)C7=C8C(=CC=C7)NC(=C8C9=C(N=C3O9)C l)Cl)OC2=C5C=C(CC(C(=O)N1)NC(=O)C(C(C)C)O)C=C2
which is a linear way of encoding the structure. Let’s go to the Daylight site (they invented SMILES) to see what it looks like: I think it’s correct, and it’s certainly a lot better than the Pubchem offering but it’s not beauty – except for Shrek. Let’s try Chemical Abstracts. It’s got every compound ever made. Maybe they will let me have a free go… (STNEasy) I find: A free demo! Just what I wanted…
PMR: This is fine, and it points to the same abstract, but I can’t get at the structure. Let’s try CAS-Number lookup – it will tell me the number and the structure… and there is a free demo as well: Oh dear… Yes, a free demo, but only if you are looking for caffeine. I can get all I want about caffeine from Wikipedia without paying 6.20 USD. Ah well. So, off to chemspider, which is free. The search for diazonamide A reveals: 10472888 is shown at full size. (There are two more structures but both are equally unreadable).
Note that the atom counts of the structures are inconsistent – the actual composition – I think – is that of 4591072. I try to zoom the formula and get a featureless gray square on both IE and Firefox. So I try Jmol (shown right). Now the molecules are three-dimensional but the coordinates in chemspider are those of the 2-D diagram. Personally I regard this as extremely misleading and would NEVER use Jmol for 2D diagrams, but I shan’t pursue this here. So I still don’t know what the molecule is. Where else? Perhaps I can use some more abstracts… And the fourth one on Pubmed hits gold. It’s from PNAS: and it’s FREE!!!!! so we find the structure: Truth at last. (For non-chemists the exact width of the lines matters, and the pixellation makes it very difficult to be sure. But I’m sure it’s correct.) And now what you have been waiting for – Totally Synthetic’s structure: I think you’ll agree that the blogosphere is starting to emerge as a serious place to look for chemistry.
pasted directly from the Pubchem site, suggesting we can create an image library for chemical structures
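The two practical complaints above (an InChI that picks up spaces when it wraps, and structures whose atom counts disagree) can both be checked mechanically. What follows is only a minimal sketch using the Python standard library, not anything IUPAC, Pubchem or chemspider provide: the function names `unwrap`, `formula_layer` and `atom_counts` are made up for illustration, and the parser assumes a single-component formula layer with no `.` separators or charges.

```python
import re
from typing import Dict

# The InChI exactly as quoted in the post, including the spaces left by line wrapping.
WRAPPED_INCHI = (
    "InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6 "
    "-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24 "
    "-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-1 "
    "6,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27-,30-,39-,4 "
    "0u/m0/s1/f/h44-45H"
)


def unwrap(identifier: str) -> str:
    """Remove all whitespace introduced by wrapping (hypothetical helper)."""
    return "".join(identifier.split())


def formula_layer(inchi: str) -> str:
    """Return the molecular-formula layer, i.e. the text between the first two '/'."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi.split("/")
    if len(parts) < 2:
        raise ValueError("InChI has no formula layer")
    return parts[1]


def atom_counts(formula: str) -> Dict[str, int]:
    """Count atoms per element in a simple formula such as C40H34Cl2N6O6."""
    counts: Dict[str, int] = {}
    # An element symbol is one capital letter plus an optional lowercase letter;
    # a missing number means a count of one.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


if __name__ == "__main__":
    inchi = unwrap(WRAPPED_INCHI)
    counts = atom_counts(formula_layer(inchi))
    print(counts)                # {'C': 40, 'H': 34, 'Cl': 2, 'N': 6, 'O': 6}
    print(sum(counts.values()))  # 88
```

Running the same `atom_counts` over the formula reported for each candidate record would give a quick way to spot which entries disagree on composition, without needing to read the diagrams at all.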
PMR: Thank you Dave (Dave – as I have already mentioned – has been very supportive of new approaches to chemical informatics). AuthorChoice is a “hybrid Open Access” product produced by the ACS. “Hybrid” only applies to publishers (and sometimes specific journals) that are primarily closed (Toll Access, pay-to-read) but where authors may purchase “Open Access” for their specific article. (Many OA publishers require all authors to pay to publish). Every publisher has a different name for their hybrid products and almost all of them offer different rights and restrictions. As I have said before, the quality of delivery of hybrid Open Access (and related products) is often poor. They are not well labelled, the navigation is poor, and the rights – if any – are often vague and contradictory. Hybrid offerings (as with the ACS) often still require the author to transfer copyright and do not allow full re-use of the article. I am not (here) criticizing hybrid OA per se (though personally I think it is a distraction and is likely to be ineffective in every way). Nor am I concerned (here) with the price level, though I personally would not believe that I get good value from many publishers (as I require full permissions, including author retention of copyright). What concerned me here was that the reader (and thereby the author) was not getting what they were entitled to. It is very clear that the OA community MUST insist on clear labelling and must police the practice. Many “OA” publishers are creating unacceptable offerings – either deliberately or probably through laziness and lack of commitment (I call this systemic failure of the industry). I had not intended to embark on any campaign and I am glad to see that others at Berlin5 are interested in putting in place more formal mechanisms. For example we need a system of labels – but that’s not my story to tell. I don’t actually like attacking people (institutions are slightly different).
Sometimes my role appears to be that of a gadfly. I didn’t know why people use this particular analogy, so I looked it up in WP and found Gadfly:
Thanks for pointing out the problem in accessing ACS AuthorChoice articles. This was a technical glitch which is in the process of being fixed. Please be assured that it is our intention that AuthorChoice material is available without charge from the time it is posted on the web. We believe the solutions we’re putting into place will prevent this access problem from happening again.
********************************* David Martinsen American Chemical Society 1155 16th St. NW Washington, DC 20036 d_martinsen AT work-it-out
“Gadfly” is a term for people who upset the status quo by posing upsetting or novel questions, or attempt to stimulate innovation by proving an irritant. The term “gadfly” was used by Plato to describe Socrates’ relationship of uncomfortable goad to the Athenian political scene, which he compared to a slow and dimwitted horse. It was used earlier by the prophet Jeremiah in chapter 46 of his book. The term has been used to describe many politicians and social commentators. During his defense when on trial for his life, Socrates, according to Plato’s writings, pointed out that dissent, like the tiny (relative to the size of a horse) gadfly, was easy to swat, but the cost to society of silencing individuals who were irritating could be very high. “If you kill a man like me, you will injure yourselves more than you will injure me,” because his role was that of a gadfly, “to sting people and whip them into a fury, all in the service of truth.”
PMR: I’m delighted to know the etymology (or rather the usage). And perhaps that is sometimes why I like the Socratic approach – posing questions which require definite answers rather than generalities. But, ahem, although it grows here I really don’t like hemlock.
Jakob Says: You wrote: “More, because I have added this link to my blog, Jakoblog will get notified.” This is true and it may happen that the author will come and see what you have written and even leave a comment – how often do you experience this with publications on paper? Conventional scholarly publications are so old-fashioned, slow, impractical, and inefficient. If you do your research for the progress of knowledge (and not only for your career) then you should better tag your notes at a social tagging/bookmarking service, write your thoughts in your blog, archive your summary-paper at a publication server, provide your data and sourcecode in data and software libraries, discuss your opinion in mailing lists, compile your research into other people’s work in wikis etc…. this is science in the 21st century!
… and …
From the librarian’s point of view I can tell you that archiving data is probably even more complex than it seems to be. From the computer scientist’s point of view I can tell you that Semantic Web will enlight us easily. From the Open Content movement’s point of view I can tell you that you should just license the data and make it available and usable for anyone – like you said: first make sure THAT the data CAN be used.
PMR: Thanks Jakob. There is a growing number of people like you – we need to link them to generate critical mass. In chemistry we have created the Blue Obelisk community and we have pooled our resources and efforts. This could be done for content systems – informally as well as through institutions – an example is our collaboration with Peter Sefton on authoring tools.