- The relationship between human-readable material (“full text”) and scientific data. Henry Rzepa and I have coined the term datument for the synthesis of these, especially using XML technology. The scientific publication in its current form is inspired by 19th-century printing technology, and “electronic publications” merely encourage outdated ways of communication. Web-inspired technologies should revolutionize scientific communication. A particular interest is the development of the “robotic amanuensis” for scientists – personal software which can help individuals read and publish information effectively.
- Open data, open source, open access, open knowledge. Unless we have free access to the primary outputs of science we are denied the opportunity to develop new ideas in informatics-driven science. I have argued publicly that primary scientific data belong to the scientific commons and that they must be free. A corollary is that the output of funded science is not just full text but the complete supporting information environment of the experiments.
- “Programming for scientists”. Modern scientists are enhanced by “information prosthesis” – the ability to receive and repurpose information. If they are able to “program”, they have greater expressive power. Many of the future skills will lie not in conventional programming languages but in the tools emerging from the explosion of social and technical operations on today’s web. I’ll be learning from my colleagues and trying to give readers and contributors a flavour of what is now possible.
- Markup languages in (physical) science. These are the handmaidens of the goals above. Currently there are a few main approaches for content: MathML, GML (geography), Scalable Vector Graphics, Chemical Markup Language, AnIML (analytical chemistry), ThermoML (thermochemistry). There are many obvious gaps and I’ll suggest guidelines for any person or group interested in building a language.
- Creation and management of virtual communities. I’ve been involved with creating and nurturing communities for the last 15 years, including BioMOO, the Virtual School of Natural Sciences, XML-DEV, and now the Blue Obelisk. I also believe strongly in Wikipedia and related efforts. I’ll review the features of successful communities and the guidelines for growth.
Welcome!
Welcome to the petermr blog! This is one of a series of blogs
from scientists in the Unilever Centre for Molecular Informatics at
Cambridge. I’ll indicate some of the others on my blogroll. For
now, just note that there is another blog specifically dedicated to
Chemical Markup Language (CML) and I’ll be contributing a lot to that as
well.
This blog will cover a wide range of topics that are mushrooming
on today’s web and which will change the practice of science. Areas
on which I expect to blog frequently are:
Good stuff. Glad to see you’re blogging and I’m sure I’ll have some more comments on your posts. As good science is replicable, it’s interesting that in cheminformatics the data are controlled so that it is impossible to replicate the experiment, if I understand correctly.
Absolutely right, Christine. Chemoinformatics is now de facto irreproducible. It used not to be so, as Rich Apodaca has blogged. I shall write more of this in the next two days.