Open Source increases the quality of science

There’s a vibrant discussion on my assertion that Chemical Open Source will win; see http://www.abhishek-tiwari.com/2009/06/chemical-open-source-for-free-how-far.html

Abhishek Tiwari argues:

Academia is already enjoying everything for free, for example most of chemoinformatics toolkits (JChem, OEChem), workflow solutions (PipeLine Pilot) and many others commercial software are freely available for academic uses. However at same time finding a commercial customer for open source software service in chemistry or biology is an arduous task. Pharmaceutical companies are maintaining huge BIO-IT departments, and some of them have created a back door to exploit the cheap and free software through their academic collaborations. Peter argue that anything offered to academia should be free and industry should be charged for that, does this include the academic labs with industry connections? I don’t understand why academia or anyone should be offered such a liberty especially when they are working for their industrial collaborations. Further, in my opinion Open Source and Free should not be considered as low or zero initial cost. Unlike academia hosted community project that involves cheap labor of PhD and Postdoctoral students, open source transformation for commercial vendors is never easy and they need to find a competitive way to survive. In either case it is totally justified if software producer happily charge you for their time and resources.

and there have been many comments from Open Source and Blue Obelisk supporters such as Egon Willighagen, Deepak Singh and Rajarshi Guha. I’ll add mine here:

If the only argument were how to support the business of creating standard software in chemistry, Abhishek would have a reasonable case. For example, a tool to manage the departmental payroll needs to be competent and supported, and it’s perfectly reasonable that someone should be paid for it. (Interestingly, even such tools are increasingly being built on Open Source, but I’m not using that argument.) No, the reasons are specific to science:

  • Closed source can produce Bad Science. I remember, as an undergraduate student running crystallographic calculations, finding a bug in the program. It generated the wrong numbers. If I had used the results I would have created bad science. So I had to read the source code (effectively machine code) and explain the bug to the author. This was then corrected and other users could not make the same mistake. In contrast, if you cannot read the source there can be no guarantee that the science is not corrupted. It’s worse when you are threatened with legal action for reporting bugs.

  • Closed source stifles innovation. Science builds on other science. When A reports a finding in the literature, B can build on it without permission. However, if A creates a computer program and closes it, B cannot integrate it into their software. Suppose A calculates a molecular property and B wishes to develop machine-learning software to understand its significance: B is forbidden to do so. So B either has to wait till A creates a machine-learning algorithm B’, or has to duplicate the property calculation as A’. This leads to a plethora of clones. Every software house creates equivalent components of unknown quality: unknown because they are closed, and because there is no way to assess their quality independently. Result: an anticommons where no one develops anything new.

  • Inflated claims are made about quality. The primary motive of companies is to make money. Nothing wrong with that, but when your products look the same as your competitors’ there is a natural tendency to inflate yours over theirs. “A’s fingerprints are better than B’s.” This is a meaningless scientific statement as it is untestable. Assessments are by conversations in bars, not proper metrics.

  • Funded science is stifled. If I want to test a new hypothesis in chemoinformatics (if the subject still has a standing as a science) I have to use commercial tools. I cannot do repeatable science (for the reasons above) and I cannot build on them. So the only possibility is to write your own. That is very difficult. You don’t get publications for duplicating existing software, so you have to do it on a shoestring and because you believe in the cause. You have to find bits of funding here, volunteers there. The saving grace is that the Open Source codes collaborate, not compete. For example I wrote my own molecular viewer because there wasn’t a good one in Java. I spent a lot of time with a volunteer who used Java3D (groan). Then I decided that rather than write my own I would use Jmol. I would give up the glory of writing a viewer for the chance to develop Chemical Markup Language. If I had been in a company I would have put more coders on the viewer to try to beat the competition. But, in all of this, you don’t get funding.

But now we are beating the closed source components. When that happens we can return to doing science properly. We can develop the next generation of real chemoinformatics. I want to build intelligent chemistry in silico: the Chemist’s Amanuensis, as we called Sciborg. I am nearly ready to start.

Because I know that I can leave much of the work to collaborators. And play my part in bringing the scientific method back to chemoinformatics.
