Chemical MIME and the role of the IETF

I’ve just described Chemical MIME – not in great detail, more to illustrate a highly virulent meme. Chemical MIME is what Fowler would call a “sturdy indefensible”. It breaks the rules, but it is used, it works and it upsets few except pedants.

Until now. Read on, even if you are not a chemist, because it’s a general problem in modern informatics. And we need your help. I don’t know how to solve it.

Egon has pointed out a problem. It’s hit the KDE bug list (https://bugs.kde.org/show_bug.cgi?id=235563). It’s a software bug, not a chemical bug:

 
 


Bug 235563 – invalid MIME type in /usr/share/applications/kde4/kalzium.desktop

Summary: invalid MIME type in /usr/share/applications/kde4/kalzium.desktop
Product: kalzium
Component: general
Status: RESOLVED
Resolution: FIXED
Target:
Version: unspecified
Priority: NOR
Severity: normal
Votes: 20
Version Fixed In:

Description From Laurent Bonnaud 2010-04-27 19:19:39

Version: 2.3.80 (using 4.4.2 (KDE 4.4.2), Kubuntu packages)

Compiler: cc

OS: Linux (i686) release 2.6.32-21-generic-pae

 

Here is the problem:

 

# update-desktop-database
[…]
Error in file “/usr/share/applications/kde4/kalzium.desktop”: “chemical/x-cml” is an invalid MIME type (“chemical” is an unregistered media type)

 

What does this mean?

 

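It means that something has labelled content with the MIME type (Content-Type) chemical/x-cml. Here the label sits in Kalzium’s .desktop file rather than on a server, but the principle is the same.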

 

And that the application software has said that’s invalid.

 

And the application software is pedantically right.

 

So, best beloved…

 

In the early days of the Internet when ordinary people hacked servers and small furry penguins were small furry penguins, there was a brilliant idea to label content with its type. It was a brilliant idea and it still is a brilliant idea. It means that anyone in the world, on whatever platform, getting documents from whatever server could determine their type. All you had to do was add a simple text-string and the machines would recognise it.

 

So if you were transmitting a piece of text, you could label it “text/plain”. And an image might be labelled “image/png”. If you didn’t do this then you couldn’t know whether the bit stream was meant to be displayed as text (e.g. in the body of a mail message) or as an image accompanying the mail.
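
To make the labelling concrete, here is a minimal sketch in Python using the standard library's email package. The message text, the file name penguin.png and the stand-in image bytes are invented for illustration; only the labels matter.

```python
from email.message import EmailMessage

# Stand-in bytes: just the 8-byte PNG signature, not a real image
# (an assumption for illustration only).
FAKE_PNG = b"\x89PNG\r\n\x1a\n"


def build_labelled_message() -> EmailMessage:
    """Build a mail message with a text body and an image attachment."""
    msg = EmailMessage()
    msg["Subject"] = "A penguin"
    # A str body is labelled text/plain by default.
    msg.set_content("Here is a picture of a penguin.")
    # Binary data has to be labelled explicitly with a type and a subtype.
    msg.add_attachment(
        FAKE_PNG, maintype="image", subtype="png", filename="penguin.png"
    )
    return msg


def content_types(msg: EmailMessage) -> list:
    """Return the Content-Type label of every part, outermost first."""
    return [part.get_content_type() for part in msg.walk()]


if __name__ == "__main__":
    for label in content_types(build_labelled_message()):
        print(label)
```

The receiving software never has to guess: it reads the label on each part and decides whether to show text or draw a picture.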

 

Mail? I thought we were on browsers?

 

No. This far predates the browser. Read http://en.wikipedia.org/wiki/MIME. This will give you an idea of the enormous contribution made to the Internet and the modern world by the great body of those dedicated to interoperability. The Internet is based on RFCs.

 

RFCs? Read http://en.wikipedia.org/wiki/Request_for_Comments. Without RFCs there would be no HTML. There would be no Google. No Facebook. No HTTP. No Wikipedia. No online pornography. There would be a bickering mass of companies fighting in a sludge of non-interoperability. Everyone would have their own server spec. Everyone would have their own client spec. I remember that time. It was awful. A Holy Roman Empire of isolated barons.

 

One of the greatest achievements of the twentieth century was the Internet. And it succeeded because of the IETF. The IETF? Read http://en.wikipedia.org/wiki/IETF.

 

Their goal: “The goal of the IETF is to make the Internet work better.”

 

Their motto: “Rough consensus and running code”. This is a great step towards the democratisation of the world through technology. It has led not only to a working system of physics and software but has also served as a touchstone for this century’s democracy. It’s exemplified in Wikipedia. It means listening to the other person’s point of view. And agreeing to come away with something that works.

 

In the IETF system, anyone can put forward a proposal. It’s called a draft. Here is ours (https://datatracker.ietf.org/doc/draft-rzepa-chemical-mime-type/):

Document type: Old Internet-Draft (Individual document)
Last updated: 1995-03-21
State: Expired
Intended status:
Submission: Individual
Responsible AD:

Document history

Date        Version  By        Text
1995-11-13           (System)  Draft expired
1995-03-21  01       (System)  New version available: draft-rzepa-chemical-mime-type-01 (diff from -00)

This Internet-Draft is no longer active. Unofficial copies of old Internet-Drafts can be found here:
http://tools.ietf.org/id/draft-rzepa-chemical-mime-type.

Abstract:
The purpose of this Internet Draft is to propose an update to Internet RFC 1521 to include a new primary content-type to be known as chemical. RFC 1521[1] describes mechanisms for specifying and describing the format of Internet Message Bodies via content-type/subtype pairs. We believe that chemical defines a fundamental type of content with unique presentational and processing aspects. We outline the typical expected uses of such a content type and propose a number of chemical sub-types. This document updates IETF Internet Draft draft-rzepa-chemical-mime-type-00.txt in which this specific proposal was made, incorporates suggestions received during the initial discussion period and indicates scientific support for and uptake of this proposal[2-7].

Authors:
Henry Rzepa <rzepa@ic.ac.uk>

P. Murray-Rust <pmr1716@ggr.co.uk>

B. J. Whitaker <benw@chemistry.leeds.ac.uk>

(Note: The e-mail addresses provided for the authors of this Internet-Draft may no longer be valid)

We put the idea into the IETF framework. It was a Draft, not an RFC. We had 6 months to convince the IETF. Henry went to a meeting. There was lots of discussion. One suggestion was that it could be used to send recreational drugs over the network. (Since I was working for Glaxo I wasn’t wild about being associated with this idea and it was not pursued in the body of the draft!).

 

The draft had a lot of supporters but it failed to get critical mass. It lapsed. Not enough rough consensus.

 

MIME is an excellent idea but its implementation does not allow easy extensibility. There’s a hardcoded set of types of the form foo/bar – seven major types and many secondary ones. Everyone knows that hierarchical classification systems break down sooner or sooner.

 

MIME’s extensibility was through “x-”. So suppose you had a new image format called penguin (designed to transmit pictures of penguins), you might write “image/x-penguin”. At some stage in the future it might become accepted as a standard part of MIME.

 

So we started creating chemical/x-pdb, chemical/x-cml, etc. They are listed at http://www.ch.ic.ac.uk/chemime/. The idea took off. There are probably hundreds of millions of documents labelled with chemical MIME. OK, the IETF didn’t want to know about them but they worked. The MIME system did not appear to require known MIME types.
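
Why did they work? Because most software simply stores the label as a string and never asks where it came from. Here is a small sketch with Python's standard mimetypes module. The two type names are the ones mentioned above; the pairing with the file extensions .pdb and .cml, and the example file names, are my assumptions for illustration.

```python
import mimetypes

# Chemical types named in the text, paired with file extensions.
# The extension mapping is an assumption made for this sketch.
CHEMICAL_TYPES = {
    ".pdb": "chemical/x-pdb",
    ".cml": "chemical/x-cml",
}


def chemical_db() -> mimetypes.MimeTypes:
    """Return a private type database with the chemical types added."""
    db = mimetypes.MimeTypes()
    for ext, mime in CHEMICAL_TYPES.items():
        # add_type just records the string; nothing here checks whether
        # "chemical" is a registered top-level type.
        db.add_type(mime, ext)
    return db


def label_for(filename: str):
    """Guess the MIME label for a file name, or None if unknown."""
    mime, _encoding = chemical_db().guess_type(filename)
    return mime


if __name__ == "__main__":
    print(label_for("aspirin.cml"))
    print(label_for("1crn.pdb"))
```

A private MimeTypes instance is used so that the sketch does not alter the process-wide table.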

 

And they have worked for 15 years.

 

Until, apparently, now. The software above checks primary MIME types. “chemical” isn’t one of them. So it throws an exception. It’s “right”.

 

But it’s not helpful.
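
The kind of check involved can be sketched in a few lines of Python. This is not the actual code of update-desktop-database, which I have not reproduced; the function name check_mime_type and the strict flag are hypothetical, and the set contains only the seven top-level types of RFC 1521, leaving out anything registered later.

```python
# The seven top-level types defined in RFC 1521. Later registrations are
# deliberately left out of this sketch.
RFC1521_TOP_LEVEL = {
    "text", "image", "audio", "video",
    "application", "multipart", "message",
}


def check_mime_type(mime: str, strict: bool = True):
    """Return (ok, reason) for a type/subtype string.

    strict=True  rejects any top-level type not in RFC1521_TOP_LEVEL.
    strict=False only checks that the string has the form type/subtype.
    """
    top, sep, sub = mime.partition("/")
    if not sep or not top or not sub:
        return False, f'"{mime}" is not of the form type/subtype'
    if strict and top.lower() not in RFC1521_TOP_LEVEL:
        return False, f'"{top}" is an unregistered media type'
    return True, "ok"


if __name__ == "__main__":
    for candidate in ("image/x-penguin", "chemical/x-cml"):
        print(candidate, check_mime_type(candidate))
    print("chemical/x-cml", check_mime_type("chemical/x-cml", strict=False))
```

Note that the “x-” convention only rescues the subtype. image/x-penguin passes the strict check because “image” is registered; chemical/x-cml fails it because “chemical” is not, however many documents carry the label.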

 

What to do? I really don’t know. I can think of the following:

  • Go back to the IETF. Chance of success? 0.00000001
  • Get the chemical world to change to another MIME type (it’s possible that “x-chemical/pdb” would be allowed. But it might not). It would destroy hundreds of millions of working documents.
  • Fix the behaviour of KDE. Chance of success 0.00001
  • Ignore the problem
  • Try some awful kludgy workaround

 

How important is this problem? I don’t know. Is MIME becoming stricter? I doubt it. Are more systems validating it? ??

 

In so far as it is a problem, it reflects the lack of a community approach in chemistry. The chemical software industry is based largely on non-interoperability and lock-in. All the approaches – and there aren’t many – come from outside either the software vendors or the pharma industry. Chemical MIME; CML; The Blue Obelisk; InChI. None of these have been industry-led. They succeed to the extent that they fill an essential need. Pharma ought to care – it doesn’t publicly show it. The software industry ought to care. It doesn’t until it’s forced to. I am not surprised by this – standards come when the industry is in a mess and they are essential, and we are at that stage now.

 

There will be a considerable number of new MIME types registered as a result of the Quixote project. We need to know the precise types of computational input and output. We do this without the active help of the companies producing the tools that create these files.

 

For Chemical MIME we will keep buggering on.

 

 

 

 

UPDATE:

Read the comments…


One Response to Chemical MIME and the role of the IETF

  1. nina says:

    Most probably it is because my major was Computer Science, but I think IANA was right back in 1995 to reject top level “chemical/*”.
    Why should there be a top level “chemical/*” and not a top level “elementary_particle/*” or “gene_sequence/*” or “electrical_circuit/*” or anything else from any other domain?
    It seems physicists did better. “model/*” is an accepted top level MIME type: http://www.rfc-editor.org/rfc/rfc2077.txt .
    However, given the wide use of chemical/* for the last 15 years, how can one estimate the chance of success as 0.00000001?
