Data-driven science

I’ll be writing more about this later. This post was catalysed by an email from Douglas Kell at Manchester. We share the same problem: data-driven science is treated as a second-class activity. He picked up a paper on data-driven science and laments:

the obsession of biologists, and especially molecular cell biologists, with hypothesis-DEPENDENT science. At least we are hoping to fix some of this in Systems Biology.
It is still hard for them to understand that it is difficult to make hypotheses about molecules you do not even know exist, and thereby do something REALLY new, and as you say they do not really recognise that building tools is VERY important.

So I have been lamenting the lack of data in chemistry, while he and Stephen Oliver lament the culture (BioEssays 26:99–105, 2003 Wiley Periodicals, Inc.; I expect it’s inaccessible to half the readers of this blog):

It is considered in some quarters that hypothesis-driven methods are the only valuable, reliable or significant means of scientific advance. Data-driven or ‘inductive’ advances in scientific knowledge are then seen as marginal, irrelevant, insecure or wrong-headed, while the development of technology—which is not of itself ‘hypothesis-led’ (beyond the recognition that such tools might be of value)—must be seen as equally irrelevant to the hypothetico-deductive scientific agenda. We argue here that data- and technology-driven programmes are not alternatives to hypothesis-led studies in scientific knowledge discovery but are complementary and iterative partners with them. Many fields are data-rich but hypothesis-poor. Here, computational methods of data analysis, which may be automated, provide the means of generating novel hypotheses, especially in

I hadn’t realised it was so bad elsewhere. Now I realise it is. No wonder we struggle in cyberscholarship: we have no data, and even if we have it, it’s not “proper” research.
More later.
[I do not normally blog private emails but this is an obvious exception.]

