Silicos contributes Commercial Open Source – thank you

It is very uncommon for commercial organizations in chemoinformatics to make any of their material Open Source. (IT companies, by contrast, have contributed a good deal, e.g. Eclipse and NetBeans.) So I was very pleased to see an announcement of Open Source [BSD] chemoinformatics software on the Blue Obelisk list:

SiMath is Silicos’ open source library for the manipulation of data
matrices and the subsequent mathematical and statistical modeling.
SiMath tries to provide a simple C++ API that is data matrix centered
and that covers the model building procedure from data preprocessing
to training and evaluation of the model. The goal is to provide a
library that can be easily integrated into standalone applications.
The rationale of SiMath is not to invent the wheel again but to
integrate available open source packages and also newly implemented
algorithms into one comprehensive library. Several well established
libraries exist nowadays, but they all have a different interface and
work with their own matrix representation. These tools are
incorporated into SiMath and adapted such that their interface is
consistent over all tools. For instance, all clustering algorithms
are initiated by defining a set of parameters and the actual
clustering is done by calling the cluster method with the data matrix.
Currently, SiMath contains modules for PCA (or SVD), matrix
discretisation, SVM training and evaluation, several clustering
algorithms, self-organizing maps and several general mathematical
utilities.
More information about SiMath and how to download the source code can
Silicos is a chemoinformatics-based biotechnology company empowering
virtual screening technologies for the discovery of novel compounds
in a variety of disease areas.

This makes sense. The technology here is common to many applications and, as Hans De Winter says, it is foolish to reinvent the wheel. These are exactly the sort of components we need in the discipline. Because they are in C++ and many of us use Java, it may make sense to expose them as Web services (REST), as the message overhead is likely to be smaller than the computational cost.
The Blue Obelisk, whose mantra is Open Data, Open Source, Open Standards, welcomes contributions in any of these areas.
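
To make the interface convention in the announcement concrete (define a set of parameters, then call a cluster method with the data matrix), here is a minimal C++ sketch. It is not SiMath's actual API: the names Matrix, KMeansParameters and KMeans are hypothetical, and the choice of k-means with the first k rows as starting centroids is my assumption, made only to keep the example small and deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical types for illustration only; NOT SiMath's real classes.
using Matrix = std::vector<std::vector<double>>;  // rows = samples, columns = features

struct KMeansParameters {
    std::size_t nClusters = 2;       // number of clusters to form
    std::size_t maxIterations = 20;  // upper bound on refinement passes
};

class KMeans {
public:
    // Step 1 of the convention: the algorithm is set up from its parameters.
    explicit KMeans(const KMeansParameters& p) : params_(p) {}

    // Step 2: clustering is done by passing the data matrix.
    // Returns one cluster label per row of `data`.
    std::vector<std::size_t> cluster(const Matrix& data) const {
        const std::size_t k = params_.nClusters;
        assert(k > 0 && data.size() >= k);

        // Deterministic start: the first k rows serve as initial centroids.
        Matrix centroids(data.begin(),
                         data.begin() + static_cast<std::ptrdiff_t>(k));
        std::vector<std::size_t> labels(data.size(), 0);

        for (std::size_t iter = 0; iter < params_.maxIterations; ++iter) {
            // Assignment step: attach each row to its nearest centroid.
            bool changed = false;
            for (std::size_t i = 0; i < data.size(); ++i) {
                std::size_t best = 0;
                double bestDist = std::numeric_limits<double>::max();
                for (std::size_t c = 0; c < k; ++c) {
                    const double d = squaredDistance(data[i], centroids[c]);
                    if (d < bestDist) { bestDist = d; best = c; }
                }
                if (labels[i] != best) { labels[i] = best; changed = true; }
            }
            if (!changed && iter > 0) break;  // assignments are stable

            // Update step: move each centroid to the mean of its members.
            Matrix sums(k, std::vector<double>(data[0].size(), 0.0));
            std::vector<std::size_t> counts(k, 0);
            for (std::size_t i = 0; i < data.size(); ++i) {
                ++counts[labels[i]];
                for (std::size_t j = 0; j < data[i].size(); ++j)
                    sums[labels[i]][j] += data[i][j];
            }
            for (std::size_t c = 0; c < k; ++c) {
                if (counts[c] == 0) continue;  // keep an empty cluster's centroid
                for (std::size_t j = 0; j < sums[c].size(); ++j)
                    centroids[c][j] = sums[c][j] / static_cast<double>(counts[c]);
            }
        }
        return labels;
    }

private:
    static double squaredDistance(const std::vector<double>& a,
                                  const std::vector<double>& b) {
        double s = 0.0;
        for (std::size_t j = 0; j < a.size(); ++j) {
            const double diff = a[j] - b[j];
            s += diff * diff;
        }
        return s;
    }

    KMeansParameters params_;
};
```

A caller would fill in a KMeansParameters, construct a KMeans from it, and call cluster(data) on a matrix of samples. If every algorithm follows the same two steps, one algorithm can be swapped for another without touching the code that prepares the matrix, which is the kind of consistency the announcement describes.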

This entry was posted in "virtual communities", chemistry, open issues, programming for scientists.

One Response to Silicos contributes Commercial Open Source – thank you

  1. Rajarshi says:

    Having computational backends in the form of WS’s is certainly handy. Something I recently set up at IU was a variety of WS’s using R as the underlying computational engine: regression, classification, clustering, plots etc. The available services are listed at
    http://www.chembiogrid.org/wiki/index.php/Web_Service_Infrastructure
    (somewhere in the middle of the page)
