petermr's blog

A Scientist and the Web

 

Scholarly HTML: hackfest and visit of Peter Sefton and Martin Fenner

#scholarlyhtml @ptsefton

We’re gearing up for our Scholarly HTML hackfest (March 12-13). For details see http://www-pmr.ch.cam.ac.uk/wiki/Scholarly_HTML which will be updated and which includes a registration process. We ask you to register because the hackfest runs over a weekend and we need to know who is in the department (for safety, etc.). This all worked fine in our first hackfest.

As it’s a hackfest the details are fluid but the known facts are:

Peter Sefton is here from midweek next week (9th March) to about 20th March
Martin Fenner is here over March 12/13 weekend

The general plan is to CREATE something during the time that PT is here. PT runs a world-class team at the University of Southern Queensland which has created a proven Open toolset based on WordPress for high-quality scholarly documents (e.g. course materials, papers, theses). Martin has likewise pioneered many plugins for WordPress.

We shall invite Peter and Martin to give presentations (but these will need to be on a weekday).

The theme is Scholarly HTML with particular emphasis on data publication. The aim is to give authors the freedom to author as they wish, not as they are constrained by the recipient. A consequence is that all data should be semantic (i.e. understandable by machine). This means that bitmaps such as PNG should be replaced or augmented by – say – SVG or HTML5. Much of the impetus for the meeting came from “Beyond the PDF” run by Phil Bourne and Anita de Waard.

In general we would like to be able to publish:

  • Semantic (mainly rectangular) tables where columns have defined semantics
  • Semantic graphs where axes are semantic and points, lines, bars etc are first-class objects
  • Maths (MathML)
  • Semantic bibliography (technically solved, but we’d like to include online OPEN resources, e.g. from Open Bibliography)
  • Scalable diagrams (probably SVG)
  • Chemistry/crystallography as CML
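To make "points are first-class objects" concrete, here is a minimal sketch in Python of a scatter plot written as SVG in which every point carries its original data values, so a machine can read the numbers back out (which it cannot do from a PNG). The `data-x`, `data-y`, `data-label` and `data-x-axis`/`data-y-axis` attribute names are assumptions made up for illustration; they are not an agreed Scholarly HTML convention, and the hackfest may well settle on something different.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def semantic_scatter(points, x_label, y_label, size=200, margin=20):
    """Return SVG text for a scatter plot whose points keep their data.

    points is a list of (x, y, label) tuples. The data-* attribute names
    are hypothetical, chosen only for this illustration.
    """
    ET.register_namespace("", SVG_NS)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def scale(value, lo, hi):
        # Map a data value onto the drawable area; avoid dividing by zero
        span = (hi - lo) or 1.0
        return margin + (value - lo) / span * (size - 2 * margin)

    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {
            "width": str(size),
            "height": str(size),
            "data-x-axis": x_label,  # axis meaning travels with the plot
            "data-y-axis": y_label,
        },
    )
    for x, y, label in points:
        circle = ET.SubElement(
            svg,
            f"{{{SVG_NS}}}circle",
            {
                "cx": f"{scale(x, min(xs), max(xs)):.1f}",
                # SVG y grows downwards, so flip the vertical coordinate
                "cy": f"{size - scale(y, min(ys), max(ys)):.1f}",
                "r": "3",
                "data-x": str(x),
                "data-y": str(y),
                "data-label": label,
            },
        )
        # A <title> child gives human readers a tooltip in most browsers
        ET.SubElement(circle, f"{{{SVG_NS}}}title").text = label
    return ET.tostring(svg, encoding="unicode")


def read_points(svg_text):
    """Recover the (x, y, label) tuples from the SVG produced above."""
    root = ET.fromstring(svg_text)
    return [
        (float(c.get("data-x")), float(c.get("data-y")), c.get("data-label"))
        for c in root.iter(f"{{{SVG_NS}}}circle")
    ]
```

The point of the round trip through `read_points` is that the published diagram is also the data: a reader's program can recover the values without asking the author for a spreadsheet.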

There will be many ideas but as a focus we have come up with a unifying project. After discussion with Simon Hodson (JISC) and Brian McMahon (IUCr) we plan to implement the following idea in our JISCXYZ project and to start this during the hackfest. (Simon and Brian hope to be present for some of the time).

A data-journal for crystallography

Every week Crystaleye aggregates (automatically) a few hundred structures and creates fully semantic CML. These are currently published as HTML pages with embedded CML and PNGs (http://wwmm.ch.cam.ac.uk/crystaleye). A typical page (there are ca 250,000) is http://wwmm.ch.cam.ac.uk/crystaleye/summary/acta/c/2008/01-00/data/av3113/av3113sup1_I/av3113sup1_I.cif.summary.html (you can twiddle the molecule and create the unit cell by clicking). We wish to create a “data publication” from this material.

The proposed data journal will automatically select ca 10 interesting structures per week and publish these as a Scholarly HTML blog. The hackfest will teach us the best ways of representing these as Scholarly HTML and of presenting them. Because we shall be using a blog, readers can comment on these structures using the blog mechanism and also add their own ideas about interesting structures that we have not included. In this way we hope to build up a sense of publication and comment.

There is also the possibility for readers to submit their own structures which will be automatically validated during the submission process. We’ll work very closely with the IUCr during this. We can add to the interest by having ranking tables for authors or contributors and having various “records” such as largest structure.
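As a rough sketch of how ranking tables and “records” could be computed, the Python below counts structures per author and picks out the largest structure. The record fields (`id`, `authors`, `atom_count`) are hypothetical and invented for illustration; they are not Crystaleye's actual data model, and "largest" is assumed here to mean the highest atom count.

```python
from collections import Counter


def author_ranking(structures):
    """Return (author, count) pairs, most prolific first.

    Ties are broken alphabetically so the table is reproducible.
    """
    counts = Counter(
        author for s in structures for author in s["authors"]
    )
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def largest_structure(structures):
    """Return the record with the most atoms (assumed meaning of 'largest')."""
    return max(structures, key=lambda s: s["atom_count"])


# Made-up example records, purely to show the shape of the input
example = [
    {"id": "s1", "authors": ["A. Author", "B. Writer"], "atom_count": 40},
    {"id": "s2", "authors": ["A. Author"], "atom_count": 120},
    {"id": "s3", "authors": ["C. Contributor"], "atom_count": 75},
]

ranking = author_ranking(example)
record = largest_structure(example)
```

Other “records” (heaviest atom, longest bond and so on) would follow the same pattern with a different key function.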

Assuming that the data journal works technically, we will work with BrianM and colleagues to see if the format has value for IUCr.

So – if you are interested, register. We can’t pay travel or accommodation but will provide geek food during the weekend. If you can only come during the week let us know. I will be at JISC on the 16th.

 
