Open Science, Closed Source and Microsoft

Glyn Moody and I share many of the same views, but here is an area where we differ. His position is straightforward and logically defensible – mine is less clear but – in my own mind – tenable. Currently I am funded by Microsoft – is this compatible with being an activist for Open Knowledge? I reproduce Glyn’s post in full (see also one by Bill Hooker, which shares some, but not all, of these views).

Open Science, Closed Source

One of the things that disappoints me is the lack of understanding of what’s at stake with open source among some of the other open communities. For example, some in the world of open science seem to think it’s OK to work with Microsoft, provided it furthers their own specific agenda [PMR emphasis]. Here’s a case in point:

John Wilbanks, VP of Science for Creative Commons, gave O’Reilly Media an exclusive sneak preview of a joint announcement that they will be making with Microsoft later today at the O’Reilly Emerging Technology Conference.
According to John, who talked to us shortly after getting off a plane from Brazil, Microsoft will be releasing, under an open source license, Word plugins that will allow scientists to mark up their papers with scientific entities directly.
“The scientific culture is not one, traditionally, where you have hyperlinks,” Wilbanks told us. “You have citations. And you don’t want to do cross-references of hyperlinks between papers, you want to do links directly to the gene sequences in the database.”
Wilbanks says that Science Commons has been working for several years to build up a library of these scientific entities. “What Microsoft has done is to build plugins that work essentially the same way you’d use spell check, they can check for the words in their paper that have hyperlinks in our open knowledge base, and then mark them up.”

That might sound fine – after all, the plugins are open source, right? But no. Here’s the problem:

Wilbanks said that Word is, in his experience, the dominant publishing system used in the life sciences, although tools like LaTeX are popular in disciplines such as chemistry or physics. And even then, he says it’s probably the place that most people prepare drafts. “Almost everything I see when I have to peer review is in a .doc format.”

In other words, he doesn’t see any problem with perpetuating Microsoft’s stranglehold on word processing. But it has consistently abused that monopoly by using its proprietary data formats to lock out commercial rivals or free alternatives, and push through pseudo-standards like OOXML that aren’t truly open, and which have essentially destroyed ISO as a legitimate forum for open standards.
Working with Microsoft on open source plugins might seem innocent enough, but it’s really just entrenching Microsoft’s power yet further in the scientific community, weakening openness in general – which means, ultimately, undermining all the other excellent work of the Science Commons.
It would have been far better to work with OpenOffice.org to produce similar plugins, making the free office suite even more attractive, and thus giving scientists yet another reason to go truly open, with all the attendant benefits, rather than making do with a hobbled, faux-openness, as here.

PMR: First the facts. My group are funded by Microsoft to work on Chem4Word – an Add-in for chemistry in Word 2007 – and also as part of the OREChem communal project organizing semantic chemistry under ORE. I personally get no financial reward other than meetings in Seattle. The Add-in will be Open Source.

To state my position on the various potential concerns…

  • Microsoft is “evil”. I can understand this view – especially during the Halloween documents era. There are many “evil” companies – they can be found in publishing (?PRISM), pharmaceuticals (where I used to work) (Constant Gardener), petrotechnical, scientific software, etc. Large companies often (always?) adopt questionable practices. [I differentiate complete commercial sectors – such as tobacco, defence and betting – where I would have moral issues.] The difficulty here is that there is no clear line between an evil company and an acceptable one.
  • Monopolies are unacceptable. I have some sympathy with this view. It’s not quite the same as above, as the categorisation is simpler. By this measure all monopolies should be opposed, regardless of whether they are evil or not – this includes Google, for example.
  • Capitalism is evil and should be opposed. This is logically defensible but I don’t hold this view myself.
  • Microsoft has abused the standards process. I’m prepared to believe this, just as I am prepared to believe that publishers hired Dezenhall to rubbish Open Access.

So why am I working with funding from Microsoft? The project(s) are aimed at developing semantic documents – just as John Wilbanks is doing. Microsoft Research believes in semantic documents for research and it is encouraging to find a supporter when the chemistry sector is still in the dark ages. At the end of this we shall have made significant progress towards linked, semantic science.

Is this perpetuating the monopoly? The monopoly exists and nowhere more than in in/organic chemistry where nearly all chemists use Word. We have taken the view that we will work with what scientists actually use, not what we would like them to use. The only current alternative is to avoid working in this field – chemists will not use Open Office.

For the record we also work with Open Office – primarily through Peter Sefton. That at least is encouraging alternatives. But – whether you like it or not – Open Office has too many rough edges to make it easy for those who are not early adopters.

However, monopolies do not last forever. I’ve lived through “Nobody was sacked for buying IBM”, “… DEC”. Substitute “… Google, Microsoft, etc.” I’m also optimistic that the hegemony of the commercial publishers will crash. Has the pressure of anti-trust, etc. caused a re-orientation in Microsoft? Are projects such as ours helping to promote change? Or simply adding legitimacy?

What I can say is that the individual people I work with in Microsoft Research are genuinely looking to engage with the scholarly community and to help promote Openness. What I have little insight into is the soul of the large corporate itself. I’ve worked for Glaxo, and am sponsored by Unilever and Microsoft (as well as having been funded by IBM). One can never be easy with large companies, but at the moment I can draw the lines.

  1. Peter Sefton says:

    Peter, I don’t think it’s fair to say that OOo has too many rough edges for all but early adopters. There are frustrations, but I’d say they’re about equal to the frustrations with Word. We have been giving it to ordinary users for ages with no problems. Have you tried it lately?
    I will have to talk about this at greater length, but I think the issue is not working with Microsoft, it’s working in an interoperable way. The plugins coming out of MS Research now might be made by well-meaning people, but unless they encode their results in something that can interop with other word processors (the main one is OOo Writer) then the effect is to prolong the monopoly. There is a not so subtle trick going on here – MS are opening up the word processing format with one hand while, with the other, building addons like the Ontology stuff and the NLM work which depend on Word 2007 to work. I have raised this with Jim Downing and I hope you can get a real interop on Chem4Word.

  2. Glyn Moody says:

    I quite agree with Peter Sefton: if the results can be freely used by the open source software community, there’s no problem. I’m not against scientists working with Microsoft per se, just against the results of that work being locked into its products. If scientists can help Microsoft see the light from the *inside*, through collaborations, so much the better.

  3. Jim Downing says:

    Peter (S): The extent to which ‘real’ interop (in the sense you’re probably intending) can exist will depend almost entirely on whether OOo decide to keep the CML files in the package when they convert to ODF (which has a similar mechanism AFAIK). At the moment I’m aiming for the renditions to degrade gracefully and for the chemical information to persist, even if the relationships are lost. Not ideal, but not catastrophic.
    Yeah, Writer is usable, but I’ll happily buy a drink for anyone who word processes as a major part of their job and uses it for preference!

  4. pm286 says:

    1. Hi Peter – I have commented on this in a later post. I think JimD puts it fairly clearly, but it’s about the complete semantic document.
    2. Glyn. I’ve addressed this later. We are trying everything to avoid lock-in – and so are our direct collaborators and funders in MS. They’ve moved a long way toward Openness. They want us to develop this in a wider community. More ideas as we progress.

  5. Rich Apodaca says:

    Peter, IMHO being funded by Microsoft is neither inconsistent with advocating Open Data nor with advocating Open Source. Microsoft isn’t evil – it’s just increasingly irrelevant.
    The marketplace is currently dealing Microsoft what it deserves. Its customers now have choices like never before and they’re increasingly saying “no” to overdesigned products, planned obsolescence, and the general arrogance and disinterest in customers that monopoly breeds.
    One of the things Microsoft’s former customers are turning to is a Web-centric way of working. Google Docs is one example, but companies like 37signals, Firewheel Design, and a host of others are showing that the number of situations in which a desktop application is necessary is smaller than many would have predicted. Many of them charge for their services and a good number are profitable. Nothing wrong with that.
    As long as Microsoft’s money is there, I’d take it without the slightest reservation. But I’d also try to make sure that what I’m being paid to do had some relevance to people doing their work on the Web.

  6. Glyn Moody says:

    @Jim: I write everything – long features and numerous blog posts – on Writer. Unlike 1.0, which I found pretty unusable, version 2 onwards has been great: fast, stable, with all the features I need as a writer/journalist/blogger.

  7. Jim Downing says:

    @Glyn: Next time we’re in the same place, we’ll have to meet up and I’ll buy you that drink then!

