petermr's blog

A Scientist and the Web

 

Open(?) Notebook NMR – is it really Open Notebook?

  1. Jean-Claude Bradley Says:
    October 25th, 2007 at 2:15 pm
    Concerning your comment:
    We have so far shared every piece of data and metadata that we feel is fit to publish. Open does not mean “immediate”.

    True that “open” does not mean “immediate” but the term Open Notebook Science does imply that, following the principle of “no insider information”:
    http://drexel-coas-elearning.blogspot.com/2006/09/open-notebook-science.html

    and a recent rant here:
    http://usefulchem.blogspot.com/2007/10/science-is-about-mistrust.html

    In other words, if you and your student selectively publish results so that there is a public notebook and a private one, that does not fit with ONS.

    Definitions are a hassle sometimes. But as you have shown with the term “Open Access” we have to keep discussing these issues to make sure all assumptions are explicit.

PMR: This is a very important point and I put my hand up… We’ll need to think about it. It may be a matter of timescale – we are moving to make our results available within days, not weeks. But it is also true that we do not, currently, expose enough for any reader in the world to be able to do exactly the same as us at any given time.

However it is very difficult not to have insider information in any project. In our case we do not share our directories with the world. But also J-C does not share his physical samples with the world. For example he would be able to get a crystal structure or spectrum performed before anyone else. He would know the results of this minutes or hours before he told the world. He would notice colour changes in a reaction as it happened and before the rest of the world knew about it. He would know from his colleagues that the reagents used in Drexel had been found to be suspect.

In our case everything we do is, in principle, repeatable. We are going through the process of cleaning the data set. That is the primary scientific operation. And we are asking the world to help. And thanks to those who have done so.

So I will replace the title by “Open Computational NMR”. It’s time for a change anyway.

No Responses to “Open(?) Notebook NMR – is it really Open Notebook?”

  1. Migrating the concept of the “Open Notebook” from wet labwork to computer work can be tricky. In the lab there is a document that chemists use to record their experiments. In my lab there is only one notebook for my group – the same one that the public can see on the wiki. I don’t get the information any quicker than anybody else. So if we find out that a reagent is bad, then that information is updated on the wiki when we find out about it. That experiment and its whole history remains available, even if it is ultimately aborted.

    I don’t think that we can properly understand how science actually works without recording all the mistakes and ambiguous outcomes. And that is one of the objectives of ONS.

    I don’t think that everybody should do this and I think anything more than traditional closed science is positive – it is just a question about definitions so we are all clear about our assumptions.

    Also concerning physical samples, my lab has had a “compound copylefting” policy for a while now:
    http://usefulchem.blogspot.com/2006/11/copylefting-compounds.html

  2. pm286 says:

    (1) Fully agreed. I think we have to work out the synergy and difference between the real-world and total cyberscience.

    Imagine for example that Nick put some calculations on to run overnight and that they were emitted to a public wiki. He gets up in the morning only to find that someone else in – say – Australia has already read them and has created a theory from them. That is not going to happen in the same way in the lab, even if it is speeded up.

    The immediacy of information was, of course, what we were/are trying to do in the – currently dormant – Blue Obelisk Open Access metrics project. There is no reason why that shouldn’t be Open as the motivation of the participants was assumed to be pointing in the same direction. With something like the NMR calculations that is less clear – parts of the methodology could be very volatile.

    This project is, of course, publicly only about two weeks old (we spent a long time just getting the jobs to actually run). So I think we are working out the principles on the spot. One of the questions is who has the moral rights to re-use the results – maybe it’s everyone. But you would not expect many chemists to read your data and then publish their own version independently – it would be plagiarism. This is much more difficult in the electronic arena.

  3. I’ve come in a bit late on this. I am with Jean-Claude and Bill Hooker I think. I would call this as it stands an ‘Open’ or ‘Public’ experiment rather than Open Notebook Science. This is not to say it is a bad thing. And the motivation for holding back a little on the data is a very good and reasonable one. There is also a grey area that Bill noted which is that obviously data is not made immediately available but that our approach is that it should be made available as rapidly as is practicable.

    I see the slogan ‘No insider information’ as a goal to work towards rather than necessarily achievable. It is a challenging one but it is what we aim for. We are working towards getting our analytical instruments to autopost to our blog, so I can make an analogy here. If a student of mine puts on an analysis overnight, the results ideally would be published directly to the blog as they come off the instrument. It is possible that someone in Australia (or California) would see these, notice that we have discovered a new enzyme activity/new drug target inhibitor and then claim the observation.

    We explicitly take this risk. In particular for some of the large facility experiments I am planning I will put up raw and partially processed data that it will take me some months to get through the analysis of – someone else may beat me to it. But let us think this through. They could claim the discovery (and to do so would have to do it rapidly – via a blog/wiki). They would have to refer to the dataset (because they won’t have the equivalent dataset) and so they would have to make the observation public in non-peer reviewed form. For the deliberate spoiler I think you can argue that there would be a rapid and very negative public response.

    There are two cases where there is potential difficulty. The first is someone being ‘helpful’ by making an observation that I would have made (basically the obvious conventional data analysis). This means you feel obliged to give credit. I would say this is still fine to include as a student’s work in a thesis but would feel obliged to give credit (authorship) in a publication. But there is clearly a very large grey area here. We want people to find things we’ve missed – this is part of the reason we are doing this. And there are many cases where someone sees something that is obvious in hindsight but it is very difficult to pin down whether you would have seen it unless you were looking.

    The second difficult area is when you feel that data is ‘fair game’ for re-use. If I leave a piece of interesting data on the blog for six months and make no comment and publish no paper, does this mean someone else can have a go and feel free to go with it, perhaps publish independently? 12 months? 18 months? I think there is a need to develop or evolve some sort of code of good practice here. We don’t want people having to ask permission every time before playing with our data – but we want them to play nicely, giving due credit where appropriate. Perhaps we should tag datasets as ‘I’m done here – feel free to go at it’ or ‘Anybody got any ideas?’. I will try to post on this if I can find some time over the next few days.

  4. [...] if PMR’s group can adopt an Open Notebook Science approach to Wikipedia analysis as he did recently with the NMR analysis. In that way we’ll be able to jointly track our efforts as we work together to help the [...]
