Wiley: your supporting information for chemistry isn't satisfactory

It has become increasingly common for journals to offer – or require – “supporting information” (“supplemental data”, etc.) as an adjunct to the “full-text” article. This is now an essential part of many publications, and this post shows how the community is harmed when it isn’t done fully.
In the days of real paper it was difficult to publish experimental data. The publisher had a real need to keep down the number of pages, and large tables, spectra, etc. took up costly space. I don’t know the exact date (ca. 1970-5?) but I remember the crystallography community, presumably through the International Union (IUCr), starting to insist that authors and journals should capture the essential data. This was initially through real paper or faxes to the journal, which would then store them (I wonder what the faxes look like now?!), or by deposition with the crystallographic structure databases, or both. I suspect that the IUCr led the publishing field here, but others emulated this with requirements to deposit protein sequences (late 1970s?). At that stage it was often difficult to coerce authors and I remember Nature being one of the last journals to require their authors to send in such material.
With electronic publication the economics change. There is no resource limitation on what can be deposited – it is purely a balance between the interests of readers, publishers and authors. Here is what the blogosphere (commenting on TotallySynthetic) has to say about supplemental info in a Wiley journal (Angewandte Chemie == ACIEE). I have kept only the comments which refer to suppinfo (SI) – the numbering does not reflect the original.

Geigerin

geigerin.jpg
Deprés and Carret. ACIEE, 2007, EarlyView. DOI: 10.1002/anie.200702031.
[… TotSynth’s comments clipped]

41 Responses to “Geigerin”

  1. Spiro Says:
    August 7th, 2007 at 1:23 “In their previous papers, they had to use a metal to take diphosgene to dichloroketene, but in this case, a bit of ultrasound worked rather well.” Ultra sound avoids the use of activated zinc (ref 14), but you definitely need a metal.
    Maybe I am wrong but there is no supp info available, as often with Angewandte :-(
  2. aa Says:
    August 7th, 2007 at 1:58 spiro, the supporting info is available at http://www.wiley-vch.de/contents/jc_2002/2007/z702031_s.pdf and i think the procedure for the 2+2 is found in their previous methodology paper, referenced in this one.
  3. Spiro Says:
    August 7th, 2007 at 2:49 aa, thanks for your dedication, but I had read these “supporting information” before writing my discontentment.
    It is just that I do not consider this to be a decent supporting information section, even though the three procedures they show are the most important of the article.
    I do not blame the authors, just the journal. If my boss tells me to write a paper without supp info, I cheer. But this is a bad habit IMHO.
    For example, I am perplex about transformation c in scheme 3, especially when I read ref 19. One way or another there may be something which is missing in the conditions (acid?), and a written procedure could clarify things.
  4. carbazole Says:
    August 7th, 2007 at 3:09 The lack of supp info in ACIEE is really frustrating. If you’ve done a total synthesis, why can’t the supp info include any procedures for making compounds not already found in the literature? If I’m doing a lit search, and I find a reaction in Org Lett that I can use, I cheer because it will have a procedure most likely. If it’s for ACIEE, I groan, because the supp infos are so spotty.
  5. Gilgerto Says:
    August 7th, 2007 at 14:10 I totally agree with you carbazole, I cannot conceive that in 2007, a supp. info for a total synthesis includes only 2 procedures and 4 nmr. It is clearly a lack of rigour from Angew…
  6. aa Says:
    August 7th, 2007 at 15:31 spiro- right, sorry about that. yes, the lack of SI in ACIE is pretty terrible. i especially hate when their a reaction in a tot syn that you would like to try and can’t get a detailed procedure.
  7. willyoubemine Says:
    August 7th, 2007 at 15:52 Supporting Info is all that really matters in these papers anyway right? I mean its nice that someone made something, but its irrelevant if irreproducible bc of spotty SI.
  8. HPCC Says:
    August 7th, 2007 at 16:15 [previous]: The ultimate best example was last year’s synthesis by James La Clair… Deoxoudol, or the molecule-that-shall-change-name-upon-criticism-of-its-synthesis! [1]
  9. JamesB Says:
    August 7th, 2007 at 16:42 Isn’t an author obliged to provide experimental detail/spectral data on request for published reactions? My ex-boss certainly behaves like it – and believe me, the hours I put into scanning spectra and compiling supp. info means the SI isn’t “spotty” in the slightest.
  10. carbazole Says:
    August 8th, 2007 at 4:40 Sure, being required to provide spectra/procedures on request is fine, but why can’t it be included online at the time of publication? Are their servers running low on hard disk space? It was different before online publication obviously, journal pages were precious. Why should I have to email someone for something that really should be provided in the first place?
  11. Jose Says:
    August 8th, 2007 at 5:31 It makes me wonder if that might be why so often high level papers get sent to Angew over JACS by certain groups in particular….
  12. willyoubemine Says:
    August 8th, 2007 at 15:56 Jose is onto something. Anyone see baran’s SI for Chartelline in JACS. It was immaculate, the way SI should be.
  13. tom Says:
    August 8th, 2007 at 18:23 Most of the time when I email someone for supporting info I don’t get it.. I emailed one of sharpless’s underlings for SI on allyic azide precursors and got not a single response.

PMR: [1] Deoxoudol – this (or hexacyclinol) is a molecule whose structure was seriously disputed and where the use of calculation and resynthesis relied on supporting information to help decide the problem. I haven’t followed this in detail, but it would be fair to say that many in the community have serious doubts about the original publication.
PMR: There are some very clear and cogent messages from the blogosphere:

  • they take scientific procedures – especially reproducibility – very seriously. Correctness in reporting is critical. The blogosphere periodically voices the opinion that certain groups present their work in a better light than the raw facts warrant.
  • in many cases the data are almost all that matters. They are used to repeat work both for testing and because people want some of the material. If the recipe is wrong careers can be blighted. Many young workers have been required by their supervisors to repeat work that is “wrong” and have suffered as a result when they can’t get it to work.
  • publishing supplemental info is critical. It’s very tedious, but the effort is worth it to the community as a whole. And journals are expected to help enforce this policy.

So some plaudits:

  • The SI in JACS (Journal of the American Chemical Society) is very good. (I refrain from saying “excellent” only because it’s in PDF, not machine-understandable).
  • The crystal structures from IUCr and RSC (Royal Soc Chemistry) are top-class. We have processed over 50,000 with virtually no detectable errors. They are an epitome of how data should be published. We are working on the ACS ones – they are also pretty good, with a few buglets. And we have collected these in crystaleye.

The Wiley suppinfo that the blogosphere has taken issue with consists of 7 pages of which the first is:
aciee.GIF
As you can see, the information – DATA – is copyrighted by the publisher. I have mentioned this before, but I note that I lay myself open to being pursued by Wiley for showing any of this scientific information without their permission (see my post Sued for 10 Data Points for current practice in Wiley journals). So I will only post a very little bit and hope this counts as fair use:
spectrum5.GIF
(it’s only a very small amount of 1 page, promise!).
So, to add to the blogosphere’s concerns, this is an awful way of transmitting scientific information. To be fair I can find this sort of hamburger elsewhere, but they are right that it makes the data much harder to use when they are fuzzy (that’s real fuzz on the spectrum).
It’s clear that the authors have been selective in their SI. I can’t read the original paper (I could if I made the effort to get a password) but there are certainly 11 compounds and details for only 4 of them in the SI. That means, essentially, that there isn’t enough detailed information to repeat the work.
There is no TECHNICAL reason why all the information cannot be included. The spectrum was a born-digital cow with 32000 points and it’s been squashed to a messy hamburger. More than half the lab info has been held back. And the publisher makes it very difficult (copyright, passwords) to navigate all this.
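To put a rough number on "no technical reason", here is a back-of-envelope sketch in Python. The 32000 points and the 11 compounds come from this post; the 8 bytes per point (one 64-bit float) and the two spectra per compound are my assumptions for illustration only, and the function names are made up.

```python
# Back-of-envelope estimate of the storage needed to deposit raw 1D spectra.
POINTS_PER_SPECTRUM = 32_000  # figure quoted in the post
BYTES_PER_POINT = 8           # ASSUMPTION: one 64-bit float per data point


def spectrum_bytes(points=POINTS_PER_SPECTRUM, bytes_per_point=BYTES_PER_POINT):
    """Uncompressed size in bytes of one digital 1D spectrum."""
    return points * bytes_per_point


def paper_bytes(compounds, spectra_per_compound=2):
    """Size of all spectra for one paper.

    ASSUMPTION: two spectra per compound (say 1H and 13C).
    """
    return compounds * spectra_per_compound * spectrum_bytes()


if __name__ == "__main__":
    total = paper_bytes(compounds=11)  # 11 compounds, as counted in the post
    print(f"one spectrum: {spectrum_bytes() / 1e3:.0f} kB")
    print(f"whole paper:  {total / 1e6:.1f} MB")
```

Under these assumptions one spectrum is 256 kB and the whole paper is roughly 5.6 MB, which is a trivial amount for any publisher's server.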
The chemists at the bench deserve better. We know how to publish spectra and crystal structures without losing information. Let’s see some journals take a pro-active stance here!


4 Responses to Wiley: your supporting information for chemistry isn't satisfactory

  1. Liquidcarbon says:

    I agree with everything they blame ACIEE for. But about this particular spectrum? Are you disappointed that it is scanned or that there is a handwritten commentary (quite appropriate in this case)? Would you recommend sending .fid archives as NMR data?

  2. pm286 says:

    (2)
    LC: Are you disappointed that it is scanned
    PMR: (scanned = printed) Yes. It’s born digital. It’s possible to dump a megabyte of 1D data or 10 Mbyte of 2D data or FID. Even with 100 compounds per paper that’s only 1 Gbyte. The astronomers ship > 100 terabytes per day.
    LC: or that there is a handwritten commentary (quite appropriate in this case)?
    PMR: I have no problem with annotation. I encourage it. But this is not searchable. Note, also that if we had a digital spectrum we could see what the coupling of the C11 product is.
    LC: Would you recommend sending .fid archives as NMR data?
    PMR: Certainly, as long as there is free software to process it. Failing that a digital 1D spectrum is a lot better than e-paper.
    Notice that in the same SI there is a picture of the crystal structure. I do not know whether the coordinates have been sent to a database, but if they haven’t that would not be satisfactory as publication.
    If the chemical community wanted deposition of spectra it would be technically feasible and a minute fraction of the cost of publication. That’s what I am campaigning for…

  3. Liquidcarbon says:

    I think you’re a bit too optimistic about .FIDs, especially 2D. I hope you mean “send FIDs together with the processed spectra, not .FID alone. For example, I don’t know how to process 2D .FIDs, and there are people who have never processed any .FIDs because NMR service does it for them. I would say that a good solution would be an interactive program that would make possible to hit an URL “1H spectrum of compound ##” in SI and choose between “save FID”, “process automatically online” and “process manually online”. Well, maybe another 5-10 years?

  4. pm286 says:

    (3)
    LC: I hope you mean “send FIDs together with the processed spectra, not .FID alone.
    PMR: Yes. Data are cheap. As an example the crystallographers send drawings of the structure, but also coordinates for those who can use and display them, and then increasingly the actual diffraction images (many GBytes, but these are the raw data). Specialist crystallographic software is needed to process the images but it’s always possible to revisit them and possibly refine the interpretation. In the same way it can be possible to re-interpret spectra.
