Open NMR: what metadata do we want?

One of the reasons that CrystalEye works is that the metadata contributed by the authors (and required by the publishers, through IUCr) is superb. Is there general agreement about what metadata should be captured for NMR spectra or shifts? JCAMP files can potentially contain a lot, but this depends on their availability; we know of few publishers who accept, let alone publish, JCAMP files. On the assumption that this post reaches some publishers who wish to promote good scientific practice in reporting spectroscopy, what do we want?
NMRShiftDB mirrors the summary material published in the body of chemical papers (and sometimes in the supplemental material). A typical recent entry shows:

Molecule (20139616)
nmrshiftdb.cubic.uni-koeln.de_Kolshorn_2007-10-23_02:02:58_0723
Chemical name(s): (5-bromo-2-furyl)methanol
Molecular weight: 176.996
Number of all rings, size of smallest set of smallest rings: 1, 1
CAS-Number: (blank)
Molecule keywords: (blank)
Type: 13C
Assignment Method: 1D shift positions
Solvent: Chloroform-D1 (CDCl3)
Additional comments: MZ2N-107
Spectrum categories: ocmainz inhouse database

The following may be available elsewhere but (for 13C) I would like to see additionally:

  • organization/person depositing data (i.e. not just in filename)
  • date of deposition
  • whether solvent is spiked with reference material (e.g. TMS).
  • method of assignment of peaks (ideally this should be per peak as well as per experiment)
  • known hydrogen counts (this might be from in-house experiment or report in literature).
  • comments
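
One way to make the wish list above concrete is a small record type. What follows is only a minimal sketch in Python, not an NMRShiftDB or JCAMP schema: every class and field name is an assumption made for illustration. It fills the record from the entry shown earlier and reports which of the requested items cannot be populated from it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PeakAssignment:
    """One assigned peak (hypothetical structure, not a real schema)."""
    shift_ppm: float
    assignment_method: str                 # per-peak method of assignment
    hydrogen_count: Optional[int] = None   # attached H (0-3); None = not known


@dataclass
class SpectrumMetadata:
    """Per-spectrum metadata; None means 'not reported'."""
    nucleus: str
    solvent: str
    assignment_method: Optional[str] = None   # per-experiment method
    depositor: Optional[str] = None
    organization: Optional[str] = None
    deposition_date: Optional[date] = None
    reference_material: Optional[str] = None  # e.g. "TMS"
    comments: str = ""
    peaks: List[PeakAssignment] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of the wished-for fields that were not reported."""
        wanted = ["depositor", "organization", "deposition_date",
                  "reference_material", "assignment_method"]
        return [name for name in wanted if getattr(self, name) is None]


# Values copied from the entry shown above; no peak list is given there.
entry = SpectrumMetadata(
    nucleus="13C",
    solvent="Chloroform-D1 (CDCl3)",
    assignment_method="1D shift positions",
    depositor="Kolshorn",                # assumed from the filename
    deposition_date=date(2007, 10, 23),  # assumed from the filename
    comments="MZ2N-107",
)
print(entry.missing_fields())  # ['organization', 'reference_material']
```

Using `None` for "not reported" keeps an unanswered question distinct from an explicit answer such as "no reference material added", which would need its own value.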

A lot of this can be gleaned from publications. Here’s a typical rubric:

The 1H NMR (300 MHz) or 13C NMR (75 MHz) and DEPT spectra were recorded in CDCl3 using Brucker 300 MHz or JEOL 60 MHz spectrometers with TMS (0 ppm) as the internal standard.

PMR: but that’s all you get. At least we know that they used real TMS.
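
To illustrate what "gleaned" means in practice, here is a rough sketch that pulls the few recoverable fields out of the rubric above with regular expressions. It is not how OSCAR or any other tool works: the patterns are written for this one sentence, and the function name, dictionary keys and misspelling table are all assumptions made for the example. The only outside fact used is that the manufacturer's name is spelled Bruker.

```python
import re

RUBRIC = ("The 1H NMR (300 MHz) or 13C NMR (75 MHz) and DEPT spectra were "
          "recorded in CDCl3 using Brucker 300 MHz or JEOL 60 MHz "
          "spectrometers with TMS (0 ppm) as the internal standard.")

# Hypothetical lookup table for spellings seen in free text.
KNOWN_MISSPELLINGS = {"Brucker": "Bruker"}


def extract_metadata(text):
    """Pull nucleus frequencies, solvent, reference and instruments from text."""
    nuclei = re.findall(r"(\d+[A-Z][a-z]?) NMR \((\d+(?:\.\d+)?) MHz\)", text)
    solvent = re.search(r"recorded in (\w+)", text)
    reference = re.search(r"with (\w+) \(0 ppm\) as the internal standard", text)
    instruments = re.findall(r"(Bruc?ker|JEOL) (\d+) MHz", text)
    return {
        "frequencies_mhz": {nucleus: float(mhz) for nucleus, mhz in nuclei},
        "solvent": solvent.group(1) if solvent else None,
        "reference": reference.group(1) if reference else None,
        "instruments": [(KNOWN_MISSPELLINGS.get(maker, maker), int(mhz))
                        for maker, mhz in instruments],
    }


print(extract_metadata(RUBRIC))
```

Even when every pattern matches, the result cannot say which of the two instruments recorded which spectrum; that information is simply not in the sentence.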


5 Responses to Open NMR: what metadata do we want?

  1. Regarding “The following may be available elsewhere but (for 13C) I would like to see additionally”…is this a request of the person submitting data to NMRShiftDB? I am assuming so since you cannot submit to publishers today (yet).
    Your suggestions:
    * organization/person depositing data (i.e. not just in filename)
    I believe you have to be logged in to deposit. So simply make sure that the submitter’s name and organization are required at registration, then extract these directly and make them available for viewing. Don’t ask submitters to repeatedly enter information that is already available.
    * date of deposition
    Well, that’s when it is deposited. So extract the date and time of deposition and display them; they’re in the filename.
    * whether solvent is spiked with reference material (e.g. TMS).
    Yes, a valid request.
    * method of assignment of peaks (ideally this should be per peak as well as per experiment)
    I assume you mean whether it was manual interpretation of 1D 1H data only, use of 2D data (with a list of the data), or use of automated verification and assignment systems?
    * known hydrogen counts (this might be from in-house experiment or report in literature).
    What’s the definition of “known hydrogen counts”? Do you mean integrals on peaks?
    * comments
    There is a section called Additional Comments already. Do you mean something different?

  2. pm286 says:

    (1) The post was not referring to NMRShiftDB in particular but to metadata in general.
    Hydrogen counts can, I believe, be determined in 13C spectroscopy but I don’t know whether this is done routinely.

  3. Ah-ha… the number of attached protons on a carbon: quaternary, CH, CH2, CH3. This is routine to run using an APT (attached proton test) experiment or using DEPT. But how often is it run? That is lab dependent. If the info is available it should be input for each 13C shift. I agree.

  4. PMR:
    The 1H NMR (300 MHz) or 13C NMR (75 MHz) and DEPT spectra were recorded in CDCl3 using Brucker 300 MHz or JEOL 60 MHz spectrometers with TMS (0 ppm) as the internal standard
    WR:
    Always read experimental sections with a certain sense of humor; I know what I am talking about after extracting approx. 85,000 articles.
    Many times even the frequency of the measurement is not clear (“run at 90 MHz”: a 90 MHz or a 360 MHz machine? The same goes for 100 MHz), and the ‘Attached Proton Test’ (APT experiment) is sometimes referred to as the ‘ATP experiment’; maybe a checksum for acronyms could help…
    The example is very well selected: google for ‘NMR Bruker’ (1,610,000 hits), then google for ‘NMR Brucker’ (only 78,000 entries). The homepage of the NMR company can be found at ‘www.bruker.de’; ‘www.brucker.de’/‘www.brucker.com’ is something different…
    PMR: but that’s all you get. At least we know that they used real TMS
    WR:
    Be happy: sometimes you get less!
    The problem with NMR is that we have quite reasonable procedures to predict chemical shift values; it would be great to have something similar for yields in synthesis 😉 Sorry for being sarcastic already in the morning…

  5. pm286 says:

    (4) Many thanks Wolfgang…
    I agree completely. Part of my motivation is to show publishers that their material is incomplete and buggy (and that is being generous). We – and OSCAR – have also read thousands of chemical articles and in general they range from just-about-acceptable to awful. (Crystallography is different). So, as you imply, we cannot believe anything completely. The advantage of OSCAR is that it is able to learn some of the common errors – so it can “learn” that “Brucker” is associated with “NMR”.
    It’s actually absurd to write this in free text. It would be much easier to attach the JCAMP to the publications. Even a CIF-like syntax:
    _nmr_manufacturer Bruker
    _nmr_nucleus 1H
    _nmr_frequency_mhz 300
    would be better.
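
To show how little machinery the CIF-like lines in the last comment would need on the reading side, here is a minimal sketch of a parser for them. The three tag names are the ones suggested in the comment and are not taken from any published CIF dictionary; the function name is an assumption; and real CIF syntax (quoted values, loops, data blocks) is deliberately ignored.

```python
SNIPPET = """\
_nmr_manufacturer Bruker
_nmr_nucleus 1H
_nmr_frequency_mhz 300
"""


def parse_tags(text):
    """Read lines of the form '_tag value' into a dict of strings."""
    record = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("_"):
            continue  # skip blank lines and anything that is not a tag
        tag, _, value = line.partition(" ")
        record[tag] = value.strip()
    return record


record = parse_tags(SNIPPET)
frequency_mhz = float(record["_nmr_frequency_mhz"])
print(record["_nmr_manufacturer"], record["_nmr_nucleus"], frequency_mhz)
```

Compared with free text, nothing here has to be guessed: the manufacturer, nucleus and frequency each arrive under their own name, so a misspelling or an ambiguous frequency would at least be confined to a single, checkable field.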
