We shall lose the general election.

This blog is dictated

On Thursday I shall vote in the general election. I have voted in every election that I have been able to and I think this is the most important of all. Because the future of democracy is in the balance. It is easy to show that whichever party “wins” the election it will have a small minority of the popular vote. It is because all aspects of our life, including science, in this country are affected by this election that I think it is a suitable topic for the blog.

Science has already been seriously affected by the misconduct and the miscalculation of those that we have elected to run the country. The scandal of the MPs’ expenses has been the most obvious but more seriously I hold them guilty of not having taken the basic steps necessary to run any organisation let alone the country. It is clear to anyone educated in science and indeed in any practicalities in the real world that the attempts to create wealth out of nothing could not possibly continue. The only reason, I think, that people did not raise an alarm is that they trusted the government to be sensible. We must recall that the country is not just governed by politicians but by a very large number of professional civil servants whose job it is to foresee the immediate future and to advise politicians. Likewise it is the responsibility of politicians to heed the warnings from their civil servants.

There were voices, such as Will Hutton and satirists, who pointed out that the situation could not last. The politicians did not listen and our country has suffered a severe economic setback that we’re only just starting to appreciate. It is not the politicians who suffer, it is the ordinary people including those who had hoped to be in positions of service to the country and now find that funding is reduced, pensions are cut and the value of savings has drastically fallen. And yet the absurdity will continue; the bankers will still move money around in ways that violate all normal conservation laws. It seems clear that the main way that they generate money is by removing it from us. Wealth generation is through making things and providing useful services, and this is one of the primary effects of investing in science. Yet even before we have had the election the government has savagely slashed the education and research projects in this country.

As is clear to everyone all parties vie for the central ground. There is nothing to choose between them. None have produced a radical programme of reform. So what will my vote show?

It is highly likely that the result of the election will be to show that no party has a mandate and that the country has no confidence in the political process. The test for the country will then be how this problem is resolved. The absurdity of a party having the lowest percentage of the vote and the largest number of seats can surely not be allowed to continue. The British people are not highly demonstrative but I suspect that the ill feeling with an unrepresentative government will be sufficient to make the country ungovernable by the elected representatives.

This is a highly dangerous situation. The country has no written constitution and problems such as this are typically resolved in the Clubs of London and other institutions of the establishment. It’s the sort of situation where dictators with charisma and a message can rise and although this country has not had a typical dictator for several hundred years there is nothing to say it cannot happen again. Conventional parliament has shown itself to be essentially useless and toothless in the democratic process. It happens that one of our candidates worked in our group in the Unilever centre, and I am sure that he will make a good attempt to be a committed and responsible MP. But the Blair government has shown that a presidential style can overrule the democratic process. A single person with a sufficiently monomaniac vision can take the country to war, despite two million demonstrators on the streets. I see no sign that this will change and I fear for our democracy.

The only ray of hope has been in the development of greater communication to and from the people through the electronic medium. Examples of this include the democratic tools developed by My Society including They Work For You, What Do They Know and similar sites. Across the world web democracy continues to promote the voice of the people and to give almost instant messages to the politicians. Europe is particularly concerned about the neutrality of the net and there are groups in all countries which actively lobby for the freedom of expression as a universal human right. And yet our elected representatives do not listen.

How will this be resolved? My worry is that it will not and that we will slide into a quasi-totalitarian government which makes its own decisions without effective responsibility. During this process there will be outrage in disconnected outbreaks, but so insufficiently coordinated that they will not change this apathetic and spineless system.

How am I going to vote? All I know at the moment is that I will dress up for the occasion as I always do. If there were a candidate for the Official Monster Raving Loony Party I would seriously consider giving them my vote. Since I believe it is irresponsible to vote for any of the major parties (since there is no possibility that they will correct the process of government) I think it is more productive to show that we actively must change the system rather than to vote for a mainstream party. There will be a mess anyway after the election and we might as well show that we have deliberately voted for a mess rather than to create one by inaction. How the mess will be sorted out I wish I knew and I seriously fear for the process.

This blog carries little or no weight in the world in general. However I have raised both the values of web democracy and the problems of the current political process before. I have campaigned for net neutrality and I have promoted organs of democracy. I shall continue to do so. I hope that there is sufficient message from the British blogosphere that it becomes clear that there must be political action for change. We need a visionary, but not a dictator, to outline what this could be and lead us through the process. I doubt it will happen.

And the British people will have lost the general election.

The dictation is still not perfect and it’s harder to spot “typos” [what’s the name for a spoken typo?]

Posted in Uncategorized | 2 Comments

Time flies like an arrow; fruit flies like a banana. Or do they?

In my postprandial presentation at Churchill College I stressed the importance of natural language in Artificial Intelligence – and the more I think of this, the more important I believe natural language will be in communicating with computers. I am probably a weak adopter of Linguistic Relativity (http://en.wikipedia.org/wiki/Linguistic_relativity ) – I recently reread 1984 and its plausible ending of dumbing down human thought by imposed restricted language. Our current attempts to communicate chemistry to and from machines will certainly be very limited and even less expressive than Newspeak.
But any expressive language is ambiguous. The English language is particularly prone to ambiguity since most of its words are not inflected and in many cases there is no lexical clue as to the part of speech. Many common verbs and nouns share the same or similar lexical forms, sometimes being homophones or homonyms and sometimes being capable of use in either nounal or verbal forms.
A particularly amusing example of this (http://en.wikipedia.org/wiki/Time_flies_like_an_arrow;_fruit_flies_like_a_banana ) is frequently quoted in introductions to linguistics and I used it to show that whatever claims I made of computers understanding chemistry there will be many cases where the spoken or written words will be ambiguous to humans. I will use this to introduce part of speech tagging (POS).

NN Time VBZ flies PRP like DT an NN arrow;

NN fruit NN flies VB like DT a NN banana

Here NN=noun; VB=verb (VBZ=verb, third person singular); PRP=preposition; DT=determiner (article). It should be noted that there are many schemes of POS-tagging and we generally use the Brown tagger. The main point of this article is to try to uncover the source of these quotations, but first I will comment on how the tagging takes place.
The Brown tagger attempts to put a unique tag on each token (a token is normally a white space separated word). Words such as “the” and “a” are normally unambiguous and in this case have been tagged as DT. But the words “like” and “flies” are homonyms (http://en.wikipedia.org/wiki/Homonym). Taken in isolation it is impossible to give them a POS tag. Because of this, full natural language parsing can be extremely expensive computationally and often results in a large number of different possible parses. These can be collected into a tree bank and different interpretations can be selected on the basis of probability and usage. Here the problem arises in that we have to understand the meaning of “time” and “fruit fly”. If we did not know that there were such things as fruit flies the second sentence might be parsed in the same way as the first. Similarly if we believe there are creatures called time flies which eat arrows then the first sentence can be interpreted with a similar meaning to the second.
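To make the combinatorial problem concrete, here is a minimal sketch in Python. It is not the Brown tagger: the tiny lexicon is my own assumption for illustration, using the tag names from this post, and it simply enumerates every tag sequence that the lexicon allows before any knowledge of meaning is applied.

```python
from itertools import product

# Toy lexicon (an assumption for illustration, not the Brown tagger's data):
# each word maps to the POS tags it could plausibly take in isolation.
LEXICON = {
    "time": ["NN", "VB"],
    "fruit": ["NN"],
    "flies": ["NN", "VBZ"],
    "like": ["PRP", "VB"],
    "an": ["DT"],
    "a": ["DT"],
    "arrow": ["NN"],
    "banana": ["NN"],
}


def candidate_taggings(sentence):
    """Return every tag sequence the lexicon allows for a sentence."""
    tokens = sentence.lower().split()  # whitespace-separated tokens
    options = [LEXICON[token] for token in tokens]
    # The Cartesian product gives one candidate tagging per combination.
    return [list(zip(tokens, tags)) for tags in product(*options)]


first = candidate_taggings("Time flies like an arrow")
second = candidate_taggings("Fruit flies like a banana")
print(len(first), "candidate taggings for the first sentence")
print(len(second), "candidate taggings for the second sentence")
```

Even with eight words in the lexicon the candidates multiply quickly, which is why a real tagger has to rank them by probability and usage rather than list them all.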
I believe I first heard this pair of sentences in about 1969 at the University of Stirling where Christopher Longuet-Higgins gave a public lecture on artificial intelligence. However when I wished to cite it in my presentation I turned to the net and found it was apparently a quotation from Groucho Marx and labelled it as such. After the presentation one of my colleagues, Ray Abrahams (http://www.chu.cam.ac.uk/~RGA1000/), thought it might have come from Noam Chomsky. We turned to the net and found that neither of us was completely sure. Today Ray has written an email with some extra information which I reproduce in full in case it helps to clear up the origin.
Ray writes:
I suspect we are both wrong about the ‘quotation’. It does not appear to be Chomsky’s (though it seems to have emerged under the influence of his work) and there is a statement on Wiki[pedia] as follows:

“Time flies like an arrow. Fruit flies like a banana ”

“No known citation to Marx. First appears unattributed in mid-1960s logic/computing texts as an example of the difficulty of machine parsing of ambiguous statements. Google Books. The Yale Book of Quotations dates the attribution to Marx to a 9 July 1982 net.jokes post on Usenet.” (http://en.wikiquote.org/wiki/Groucho_Marx )

The main 60s text is […] by Anthony Oettinger on ‘The Uses of Computers in Science’. It was published in a special issue of Scientific American in 1966 (Vol 215 No 3 p 161-72) on computers. […] It is also in a later bound volume of papers from Scientific American

Oettinger seems to have worked with Susumu Kuno on this and other language problems (above 166-7), and it seems to be their results, offering several different computer readings of the line, that are quoted by Steven Pinker in The Language Instinct (p.209), though he mentions neither of them.

Google books under ‘fruit flies like a banana’ gives several texts where Oettinger’s paper is included or quoted.

I came across Oettinger’s stuff originally through our retired fellow, John Barnes who cited it in an article called Time Flies Like an Arrow in the journal Man, New Series, Vol.6, No.4 p.537-52. There is a copy of this in the college library and it is on Jstor. It contains a nice verse apparently written by one Edison B. Schroeder. I could not find the text of it where Barnes references it (i.e. in the same 1966 SA volume).

Now, thin fruit flies like thunderstorms
And thin farm boys like farm girls narrow;
And tax firm men like fat tax forms –
But time flies like an arrow.

When tax forms tax all firm men’s souls,
while farm girls slim their boyfriends’ flanks;
That’s when the murd’rous thunder rolls –
and thins the fruit flies ranks.

Like tossed bananas in the skies,
The thin fruit flies like common yarrow;
Then’s the time to time the time flies –
Like the time flies like an arrow.

[PMR] My simplistic citing is an example of how attributions can multiply without formal checking; however Wikipedia always progresses forward (i.e. improvement) and will gradually become the best source. So perhaps this episode may help the maintainers of this page.
And many thanks to Ray.


The tragedy of Macie SVG – the deliberate indifference of Microsoft and Adobe to web graphics

A major part of my scientific work appears to have been condemned to rot in the dungeons of the Web because of the failure of large vested interests to try to work together.

5 years ago I created a tool for animating enzyme reactions (a computer-generated movie of how molecules are transformed in an active site), with co-workers Janet Thornton, John Mitchell, Gemma Holliday and others (http://www.ebi.ac.uk/thornton-srv/databases/MACiE/).

It was well ahead of its time. For 1000 reactions it generated the coordinates and made a smooth transition between them. I believe that if it had been taken up by the biological community we would now have a better understanding of this subdiscipline. I had hoped to show it at a talk I’m giving later this month in Biochemistry in Cambridge. So I went to the EBI site and found:

The Animations [in MACiE]

The animations are automatically generated from the raw CML as scalable vector graphics (SVG). Unfortunately, due to circumstances beyond our control, these currently only work well on Internet Explorer and Avant with the Adobe SVG plugin, which is available from adobe.com. Whilst Firefox and some other browsers do support SVGs, they do not seem to currently support the animation elements. This seems to be a case of Adobe removing support for browsers other than Internet Explorer from their software. We will continue to try to resolve this issue.

We are currently aware of problems with the following browsers:

  • Firefox 1.0.7 and 1.5
  • Opera 8.53
  • Netscape 8.1
  • Konqueror 3.1.3
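For readers who have not met the SVG animation elements mentioned in that notice, here is a minimal sketch in Python. It is emphatically not the MACiE code: the function name, the single red "atom" and its coordinates are all hypothetical. It only shows the general idea of giving a shape a start and end position and letting the viewer interpolate smoothly between them with the standard SVG `animate` element.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def animated_atom_svg(start, end, seconds=2.0, radius=8):
    """Build an SVG document in which one 'atom' glides from start to end.

    All names and values here are illustrative assumptions, not MACiE's.
    """
    ET.register_namespace("", SVG_NS)  # serialise without an ns0: prefix
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "200", "height": "200"})
    atom = ET.SubElement(svg, f"{{{SVG_NS}}}circle", {
        "cx": str(start[0]), "cy": str(start[1]),
        "r": str(radius), "fill": "red",
    })
    # One <animate> child per coordinate; the viewer does the interpolation.
    for attribute, a, b in (("cx", start[0], end[0]), ("cy", start[1], end[1])):
        ET.SubElement(atom, f"{{{SVG_NS}}}animate", {
            "attributeName": attribute,
            "from": str(a), "to": str(b),
            "dur": f"{seconds}s", "repeatCount": "indefinite",
        })
    return ET.tostring(svg, encoding="unicode")


svg_text = animated_atom_svg((40, 40), (160, 120))
print(svg_text)
```

Saving the printed text as a `.svg` file and opening it in a viewer that implements the animation elements should show the circle moving; a viewer without that support will show only the static starting position, which is exactly the failure described above.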

My work can no longer be shown to the scientific community because Microsoft and Adobe are not prepared to adopt W3C recommendations. I hold them largely responsible for holding back community science based on graphics. Perhaps by as much as 5 years or more.

Because there is no simple enduring way of creating semantic graphics that we can rely on. And non-semantic graphics (mindlessly adopted by publishers) destroy information.

I have been through many 5-year cycles of graphics which look like the following:

  1. Stop whatever you are doing
  2. Pick a new emerging graphics technology
  3. Try to learn it
  4. Try to install something that works on your machine
  5. Try to create things that work on other people’s machines
  6. Start telling the world how wonderful it is
  7. Find that the technology disintegrates or is torpedoed.
  8. Go to 1.

For the record this includes Calcomp80, Tek4100, GKS, Phigs, GL, Tk, SVG, WPF, etc. Although some of these are still around you have to be a geek to get them installed and used on a given machine. (Some might work on a particular platform but try, say, getting WPF running on Linux – or in Firefox).

 

I saw SVG as the great white hope in 1997/8. I hailed it as the universal graphics platform for the web. I saw it as the first killer app for XML-over-the-web.

 

In case you don’t know, SVG is Scalable Vector Graphics. Read about it at http://en.wikipedia.org/wiki/Scalable_Vector_Graphics. And agree that it will do what the web needs for most graphics. It’s a W3C Recommendation (i.e. quasi-standard). It’s got the additional advantage that it’s one of their recommendations that:

  • Is well-thought out
  • Works
  • Can be understood by mortals
  • Is widely implemented

In the original phase it was enthusiastically supported by Adobe. They built a great viewer. A really great viewer. It’s not easy to build an SVG viewer. And they made it free. You could install it on Internet Explorer.

 

But not Open.

 

It didn’t run on Firefox. But we could hope. After all Adobe was pushing it.

 

However Microsoft wasn’t. They had other graphics languages (sic). Microsoft often has more-than-one-way-to-do-it. We have encountered this problem. In Chem4Word we had a choice of graphics. Visio, Silverlight, some-VL-that-I-can’t-remember, WPF/XAML, etc.

 

But not SVG.

 

Because SVG doesn’t work natively in IE (or Office). It is tolerated in that it can be included in (say) Powerpoint without crashing it. But it doesn’t use the semantics.

 

Why didn’t MS support SVG when it was launched in 1998? I don’t know. The marvel is perhaps that they did support XML. And we have Jean Paoli of Microsoft to thank for that. Together with Jon Bosak (Sun) he got the two companies to sponsor XML jointly. And given that this was in some of the darkest days of Microsoft hegemony and monopoly we should thank Jean for this. Think what would have happened if XML had split into XML and MSML.

 

I have called this “The tragedy of SVG” because Adobe slaughtered its child, with Microsoft callously looking on. Those two bear the public responsibility of having destroyed 10 years of progress in layout graphics. Ten years of mindless adoption of PDF and closed Word binary documents.

 

We are reduced to scraping dead PDFs to try to extract something slightly better than rubbish. The companies fully understand the issue of interoperability and they choose to compete instead.

 

Science suffers.

 

Nobody likes seeing their work destroyed. Choosing a W3C recommendation 5 years after its adoption seemed a reasonable and responsible way of going forward.

 

I shan’t give up. There is a small glimmer of hope. Not the microcheer for Microsoft announcing, 12 years after SVG’s launch, that it might include SVG at some time in a future IE, hopefully launched this decade:

On January 5, 2010, a senior manager of the Internet Explorer team at Microsoft announced on his official blog that Microsoft had just requested to join the SVG Working Group of the W3C in order to “take part in ensuring future versions of the SVG spec will meet the needs of developers and end users,” although no plans for SVG support in Internet Explorer were mentioned at that time. During Microsoft’s MIX 2010 developer conference, it was announced that IE9 would support SVG 1.1.

But there is more hope from the Open community which I’ll mention later.

 

I have copies of the material and it may even be in our Institutional repository. But if the EBI no longer expose it there is no point in preserving something that has been asphyxiated by inaction.

 

But I shan’t let myself be depressed.

 


Scientific information is beautiful

Dictated to Arcturus (my computer)

I had a wonderful comment on one of my recent posts

book publishers says:

May 2, 2010 at 3:41 am

What a beautiful map of living systems, reminds me of a maze – thanks for the links

 

I went to their webpage and found an example of a medieval illustrated manuscript. Since it is probably copyright I am including a similar example from Wikipedia.

Documents such as this are held in awe by our culture. We marvel at the effort that the monks put in. People will visit the British Library and similar museums simply to wonder. But others will shake their head and say how much more efficiently it could be done using the printing press.

What will people in a hundred years’ time think of the documents that we produce? Will they look at the typical PDFs produced by the scholarly publishers and marvel at their beauty? Or will they shake their heads at the futility of trying to continue the printed tradition in the electronic information? Will they ask “why did they not use machines to organize their information? Why did they not try to make machine-readable information?” I do not know what historians will say but I hope that some of them will point out that this is a tragic backwater where commercial and economic interests briefly held sway over the principle of making semantic information available to everyone.

Creating high quality non-textual information is not easy; the biochemical pathway that I showed in the last post will have taken many years to produce. Here’s a transcluded example (I link but do not copy – do I break copyright? Does it matter?) Biochemical pathways
http://www.flickr.com/photos/ejain/367998451/

The collection of the scientific data will have taken millions of cumulative years but laying out the information in a way that humans can understand has also taken years – probably tens of years. This is because it is not immediately clear from the science what components are related to what. Some relationships bridge areas that are formally unrelated. (In essence a feedstock created for one purpose is used in a completely different context). But the problem is that the diagram above is too complicated for humans to understand completely.

Here is an example (http://en.wikipedia.org/wiki/Biochemical_pathways ) of the same information when the components are simplified. (It’s not laid out precisely with the same coordinates, but the major topological features can be recognised even if you don’t understand biochemistry. Look for the cycles and spirals.)

In the Wikipedia article all the subsystems are hyperlinked so that the reader (human or machine) can reach them. Hyperlinking is one of the major revolutions in communication (though we are still very poor at it, especially for machines) and yet it is chained in the dungeon of publishing when it comes to freedom to access and use.

But part of the skill is to create documents with a reduced number of visual components that carry as much message as possible. That is a real skill which is desperately needed in the machine age. It’s relatively easy to create huge networks. It’s extremely difficult to show the critical essence.

Could a machine ever have the skill to extract the essence from networks and visualize it?

I think the answer is “occasionally”. Doing it consistently is the essence of the challenge of the Semantic Web.

I have tried to dictate this but things got on top of me. I and Arcturus are learning.


Churchill College in the University of Cambridge

Last night I was invited to give an after dinner talk (or postprandial) at Churchill College (http://en.wikipedia.org/wiki/Churchill_College,_Cambridge). These occasions are relaxed and individual presentations of things related to the interests of the speaker. My presentation was called “Can machines understand science” and included a number of demonstrations. I’ll talk more about the substance later.

Here I would like to thank Churchill College very much for having elected me a senior research fellow and given me the opportunity to develop my ideas. Churchill was founded 50 years ago as the national monument to Sir Winston Churchill whose vision was of a 20th century institution concentrating on, but not exclusively on, science. It is impossible not to be overawed by its list of Nobel Laureates and world famous discoveries. But as importantly it has managed to meld the 800 years of tradition of the University with the need to constantly rethink its purpose in the modern world. Nowhere is this shown more clearly than in Francis Crick’s letter to Sir Winston proposing that the College should not build a chapel but rather a brothel (http://en.wikipedia.org/wiki/Churchill_College,_Cambridge#Chapel). This was serious – the Chapel was a resigning matter for Crick.

The College has a wide range of subjects other than science and the Oxbridge tradition of broadening one’s vision is as strong in Churchill as anywhere. So when I was asked (or volunteered) to present a postprandial (the Latin tradition still influences the academic vocabulary) I was nervous of the weight of both past science and scholarship in general. It’s good to be nervous – I have been plotting the presentation for some weeks. But I wasn’t worried. The fellowship at Churchill must be among the friendliest and most relaxing anywhere. The government of the college, though formally governed by statutes and making tough decisions, is skilfully guided by the Masters (I have served under 2) so that difficult problems are solved in non-confrontational manners. So I knew I could explore ideas with a friendly (but valuably critical) hearing and that I would learn from my fellows.

It was also fortunate that we were visited by Alex Wade (our collaborator from Microsoft Research) and his partner Amy Martin. My presentation developed an analogy between understanding chemistry and John Searle’s “Chinese room” metaphor for artificial intelligence. Completely by chance it turned out that Alex had studied under John Searle and so was able to bring a completely new and valuable perspective to the occasion. Afterwards several of us discussed whether Searle’s metaphor was valuable and, as might be expected, there was more than one view. However it helped me as I realized that there were extra aspects that I had not thought of. (I shall write more about the chemical Chinese room later).

This post records my gratitude to Churchill College for the time that I have spent there and the freedom to develop my ideas.


American army declares war on Microsoft PowerPoint

According to an article today the US Army has declared war on PowerPoint:

http://www.metro.co.uk/home/823839-american-army-declares-war-on-microsoft-powerpoint

The problem is, apparently, this diagram. I reproduce it without permission but since it is presumably a work of the US government, albeit highly creative, it should be in the Public Domain.

 

Now I am opposed strongly to the current use of Powerpoint and include in my slides:

“Power corrupts; Powerpoint corrupts absolutely”.

I was proud when I had invented it and then found that Edward Tufte had already pre-empted me (http://www.edwardtufte.com/tufte/powerpoint ). His reason is quite different from mine, though I also agree with him. His concern is that Powerpoint reduces the creative input in communication to a set of bullet points – it corrupts human thought and dignity.

My concern is that it corrupts information – it reduces semantic graphics (assuming they were) to non-semantic binary. However now that Powerpoint exposes its content semantically (in XML) I have less quarrel with it from that point of view but have grown to adopt Tufte’s concern even more strongly – the linear flow of information. A slide show can only be shown in one direction – my own approach is to select whatever visual is needed at any stage. It’s not easy – and it doesn’t save easily – but it allows instant reaction to the audience and their needs.

I actually have no issue with the fact that the army diagram is in Powerpoint. The question is whether the information is useful. If it is, then it’s highly complex. A graphic is far more useful than reams of dense text (I suspect the textual equivalent of this picture would be at least 100 pages). I cannot tell whether it is actually a useful analysis of the concepts – I have little faith in any military analysis benefitting the world. (Like 2 million others I demonstrated against our involvement in Iraq and Afghanistan and events have justified our view.) But if it’s a useful set of concepts and if it’s useful to the generals then I suspect the graphic is useful.

Here is a biochemical example (taken from http://abeautifulwww.com/GeneVisualizations_E01C/roche1_3.jpg without permission but with thanks)

It’s complicated because it represents a living system and living systems are complicated. We desperately need a non-reductionist approach to this – or to delegate our thinking to computers.


Teaching my computer chemistry

This is a test of dictating a blog in a very noisy coffee room. I have just bought a cost effective headset with a noise canceling microphone and so far every word that I have dictated has been faithfully rendered by the system.

This is impressive. It means that in our Amy project we can probably rely on reasonable fidelity for converting the language that chemists speak into partially semantic natural language. So far I have had to make two corrections: common homophones such as to and two or four and for caused problems. However the system can learn from corrections and I expect that it will make relatively few errors if I speak clearly. I am now very confident that it will be possible to give our fume hood simple instructions or queries that it will understand.

It is possible to introduce chemical names into the text; the system does a good job of recognizing them. Here is a list. Benzene, toluene, acetone, ethyl acetate, caffeine, testosterone, penicillin, malonic acid. It will also do functional groups. Ethyl, methyl, propyl, butyl, pentyl.

I had to correct some of those, but now they should be in the dictionary. Let’s try. Methyl, methyl, propyl, butyl, pentyl. I had to correct those. Let’s try again. Methyl, methyl, propyl, butyl, pentyl. I still had to make some corrections and I am worried that it confuses ethyl with methyl.

However with practice I expect it to learn. Dimethyl and he leaned her (should be dimethylaniline). Methyl benzyl eight (should be methyl benzoate). NA benzoate to (sodium benzoate) can it recognise sodium! Dichloromethane seen (should be dichlorobenzene). Chloro benzene. It’s got that one right. Dichloromethane. It’s got that one right. Chloro bromide on the same (Chlorobromomethane).

But it will be fun teaching it.


Test of publishing Chem4Word to blog

I have left my microphone so this is being typed.

Egon Willighagen asked whether it was possible to post documents created with Chem4Word to a blog. Having mastered the process of posting to a blog without chemistry (thanks Sam) I’m now trying chemistry.

This is (benzene)

And this is coronene

Let’s see if I can post this. The result will not be semantic, but Chem4Word allows for this as the chemistry is displayed as static images (PNG). So they won’t do anything but they are a good representation of the chemistry.

 

 

 


Examples of Scientific Semantic Web wanted

I am giving a talk on Friday where I want to show the power and wow! of the Semantic Web applied to Science through an online example. I’d be keen on a DBPedia example (along the lines of http://wiki.dbpedia.org/OnlineAccess#h28-5 with “All soccer players, who played as goalkeeper for a club that has a stadium with more than 40.000 seats and who are born in a country with more than 10 million inhabitants”) but with a scientific content. But even this no longer works.
I have a general audience so I don’t want to talk through raw SPARQL (although I’ll hack it if necessary as long as there is an answer). I need something that can be demo’ed in a minute at most. In the medium term we will be able to hack this with molecules as we will be contributing Open RDF for molecules and structures.
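As a starting point for anyone who wants to suggest something, here is the kind of minimal query I have in mind, sketched in Python. It is only a sketch under assumptions: I am assuming the public endpoint at https://dbpedia.org/sparql, that DBpedia types some resources as `dbo:ChemicalCompound`, and that the endpoint accepts `query` and `format` parameters. The code only builds the request URL and does not contact the server, so whether the query returns anything useful still has to be checked by hand.

```python
from urllib.parse import urlencode

# Assumed public DBpedia SPARQL endpoint (not verified by this script).
ENDPOINT = "https://dbpedia.org/sparql"

# A deliberately small query: ten resources typed as chemical compounds,
# with their English labels. The class name is an assumption to check.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?compound ?name WHERE {
  ?compound a dbo:ChemicalCompound ;
            rdfs:label ?name .
  FILTER (lang(?name) = "en")
}
LIMIT 10
"""


def sparql_url(query, endpoint=ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = urlencode({
        "query": query.strip(),
        "format": "application/sparql-results+json",
    })
    return f"{endpoint}?{params}"


demo_url = sparql_url(QUERY)
print(demo_url)
```

Pasting the printed URL into a browser is about as close to a one-minute demo as raw SPARQL gets; a friendlier front end over the same query would suit a general audience better.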
The advantage of DBPedia (and to a lesser extent the whole LOD cloud) is that it has not been planned and I’d be grateful for examples that reflect this.
Any help will of course be acknowledged.


More on Chem4Word and OpenOffice

I have left my microphone so this is being typed.

I had expected – and am glad – that there would be debate on the release of Chem4Word under an Open Source licence. The latest contribution is from Dr. Roy Schestowitz (http://techrights.org/2010/04/28/really-qualifying-as-foss/), which I quote in full (till the ruler):

Who can port Chem4Word to OpenOffice.org?

Summary: Chem4Word is an example of Free software which is trapped deep inside Microsoft’s proprietary cage and needs rescuing

From an academic and scientific point of view, Chem4Word’s developer does the right thing by becoming a Free software proponent and choosing the Apache licence for the project (not GPL, which would have been better). The only problem is that Chem4Word helps sell Microsoft Office, which means that any user of Chem4Word (even as Free software) will be pressured to buy a standards-hostile and closed-source office suite. Those who are close to this project are aware of the issue.

This is yet another example where Microsoft is using (as in exploiting) Free software to sell its proprietary software.

Supporting Microsoft software is bad for a variety of reasons, not just because it’s proprietary and standards-hostile. Here for example is a new explanation from Omar, who exemplifies what Microsoft is doing to developing countries where cost matters a lot.

But then the grief doesn’t end here, because the problem will seem even worse if you ponder the fact that most people, around the world, who use computers can barely afford to pay their monthly bills, and that all these people are using pirated software because:

* A) That’s the only software they’ve ever known.

And:

* B) They cannot afford to pay for the annual licensing fee of a genuine copy.

These people have been mass-hypnotized, they’ve been indoctrinated into believing that whatever MS gives them is right, and that MS software is the only software on Earth that actually works. Now, take under consideration that MS is a for-profit organization after all (Actually, MS is a for-nothing-but-profit organization, but ya know), and that sooner or later, MS will start collecting money in all ways possible.

Let us hope that Chem4Word gets extended (or forked) to support Free software further down the stack. It can support all major platforms if it gets ported to office suites such as OpenOffice.org.

“I would love to see all open source innovation happen on top of Windows.”

–Steve Ballmer, Microsoft CEO

I am not out of sympathy with much of this. I have made some of my position clear (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=2233) and now add some more thoughts. For those who don’t know me and my group, some background:

· I am a passionate and public supporter of Openness – I am on the advisory board of the Open Knowledge Foundation (http://www.okfn.org ), a prime mover in the Panton Principles for Open Data (http://www.pantonprinciples.org) and a founder of the Blue Obelisk Open software/data/standards (http://www.blueobelisk.org ) group in chemistry. I have been outspoken in this area on many occasions and have criticised certain non-Open Access publishers and opponents or obstructers of the free redistribution of scholarly data.

· My group and employer receive support from Microsoft for Chem4Word (I personally do not). I have made it clear to Microsoft that I shall speak my mind during the project and do not feel shackled. I am doing so now.

· I have been critical of Microsoft in the past (e.g. at the time of the Halloween document). I have entered this sponsorship with my eyes open.

· We spent a great deal of time and care drawing up the contract with Microsoft and this is reflected in the Open Source offering that we have now – jointly – delivered.

I am not against commercial companies. I used to work for Glaxo (now GSK) and our institute is sponsored by Unilever. I have lived through the era where IBM dominated the software/hardware market, to be replaced by Microsoft. I have seen many empires rise and fall and I am optimistic that monopolies in this area have the seeds of their own decline. Monopolies are generally bad and I worry about Google as much as Microsoft. I believe that the rise of competition, and the checks on and exposure of Microsoft’s actions, mean that there is less (apparent) monopoly. If Microsoft really were a monopoly I would probably be more concerned – it may still largely have the desktop but it doesn’t have a monopoly on the browser or the Net content.

Most software is closed – ICT is an exception – a shining and great exception, but unusual. Open software requires some form of incentive – a mixture of time and money in the first instance and largely money for sustainability. I wish it were otherwise and if this project can generate models in chemistry for sustainable F/OSS software I will be delighted. Bioinformatics is an exception (I may write elsewhere on this) but there is a great deal of public Open funding. In chemistry the normality is that software is closed, usually sub-standard (with regard to modern software engineering techniques), and diminished by needless competitive duplication. There has been virtually no innovation over the last decade (integration and widget frosting, but no new science). We wish to change this – to create an infrastructure where the community can actually do new things rather than waiting for last-century companies to make minor modifications. We are getting there.

An important part of Chem4Word was to design a new approach to chemical information – one appropriate to this century using open standards (XML, RDF, REST, etc.). That’s happened and it’s all in the Open – code, data, specifications, etc. That’s available to the community whether or not people use Chem4Word within a Word environment. And to give Microsoft at least some credit, they were early adopters and promoters of XML, and Word uses XML rather than a proprietary language.

Porting to Open Office. I would be happy for this to go ahead. It should be regarded as an extension or port rather than a fork, as forks are a last resort – this is not relevant here where the authors are supportive. It would help to reinforce the (Open) Chemical Markup Language (XML) we have developed and to develop the ideas of quality and conformance that are so badly lacking in commercial chemical software. But it needs support and it needs chemists. Open Source chemists are very rare – we struggle to overcome ideas such as “if it’s free it’s inferior”. Whereas in ICT lots of people are supported by their companies (implicitly or explicitly) to contribute to F/OSS, in chemistry no-one is. The F/OSS is largely ignored – though there are signs of this changing. The pharma companies are particularly culpable – we know of several who use F/OSS but give no acknowledgement or encouragement.

If there is to be a port to OO it has to be done by chemists and thus will be effectively within the Blue Obelisk community, as we know of relatively few other F/OSS chemists. As I’ve said, if someone can make this happen we’d be delighted to help. But the barriers are relatively high – it carries no research reward (most F/OSS chemists are in academia or public research) and so would be done in marginal time, potentially to the detriment of a career. And I cannot imagine it’s technically straightforward. There has been much in the Word work that has been very intricate and could not have happened without expert knowledge. So my main concerns are that it requires formal support and some very unusual individuals.

Posted in "virtual communities", Uncategorized | 2 Comments