Compounds, substances and identifiers

There has been discussion recently (e.g. CAS Discourages Using SciFinder to Help Curate Wikipedia Structures and CAS Numbers and the Wikipedia Project: CAS Validation page) about the use of CAS identifiers and possible alternatives. One suggestion is that CAS numbers could be replaced by InChI (International Chemical Identifier – Wikipedia, the free encyclopedia) strings. This may work in some cases but will fail in many others – this post is to introduce the problem of identifying chemicals and to make it clear there is no simple solution. I’d be grateful for comments.
To start we must recognize what we are talking about. I’ll use Wikipedia definitions and explanations where possible.

It is important to realise the distinction between compounds and substances, and the variety of language used to describe them. I suspect that my own usage differs slightly. Here are some things I would regard as substances that do not have definite chemical compositions:

  • polystyrene
  • zeolite
  • rust
  • fuming sulfuric acid

This introduces the concept of a “sample”: within a given concept of a substance, the composition can vary from one sample to another. From Wikipedia (Definition of Substance):

Chemical substances (also sometimes referred to as a pure substances) are often defined as “any material with a definite chemical composition” in most introductory general chemistry textbooks.[3] According to this definition a chemical substance can either be a pure chemical element or a pure chemical compound. However, there are exceptions to this definition, a pure substance can also be defined as a form of matter that has both definite composition and distinct properties.[4] and the chemical substance index published by CAS also includes several alloys of uncertain composition.[5] Non-stoichiometric compounds are a special case (in inorganic chemistry) that violates the law of constant composition, and for them, it is sometimes difficult to draw the line between a mixture and a compound, as in the case of palladium hydride.

The most appropriate authority in chemistry is the International Union of Pure and Applied Chemistry, which spends much effort on publishing terminology and nomenclature. Its Gold Book defines some thousands of terms. There is no direct entry for “substance”, but entries that use the word include:

polymer
A substance composed of macromolecules.

product
A substance that is formed during a chemical reaction.

amount of substance, n
Base quantity in the system of quantities upon which SI is based. It is the number of elementary entities divided by the Avogadro constant.

analgesic
Substance which relieves pain, without causing loss of consciousness.

These entries imply that while “substance” often refers to materials of fixed composition, this is not always the case.
It is the uncertainty and variability in chemicals that makes it impossible to have a single system for identifying them. Much chemistry is based on the observation that many substances can be created in a pure form or purified and that this process is reproducible between laboratories. Good examples are crystalline compounds (and crystallisation is an excellent method of purification).
PubChem recognises that substances and compounds are not identical and has a different system of identifiers for each – we’ll return to this later.
The methods of identifying chemicals include:

  • names. IUPAC is the primary authority for this.
  • chemical structure. This is a representation of the types and interrelations (bonding or spatial relationship) of the atoms within the material.
  • authority-assigned identifiers (CAS, PubChem)
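As a small illustration of what an authority-assigned identifier can and cannot tell you, here is a minimal Python sketch of the CAS Registry Number check-digit rule. The function name `cas_check_digit_ok` is my own invention for this example. The check only confirms that a number is internally consistent; it says nothing about whether the number has actually been assigned, or to which substance.

```python
import re


def cas_check_digit_ok(cas: str) -> bool:
    """Return True if `cas` looks like a CAS number and its check digit matches.

    Hypothetical helper for illustration: it tests the layout
    (2-7 digits, 2 digits, 1 check digit) and the checksum only.
    """
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the body digits 1, 2, 3, ... starting from the rightmost one,
    # then compare the sum modulo 10 with the final check digit.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check


print(cas_check_digit_ok("7732-18-5"))  # water: True
print(cas_check_digit_ok("7732-18-4"))  # wrong check digit: False
```

A passing check is therefore a necessary condition for a valid CAS number, not a sufficient one.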

Within all of these there can be wide ranges of generality or specificity. It’s reasonable to talk of “zeolite” as a chemical substance, and it’s also reasonable to subclassify it into many different subtypes. In many cases it may be necessary to refer to particular samples which, unless one has access to the physical material, are often best described by their observed properties (e.g. spectra, thermodynamic properties, etc.).
In some subcategories there is a useful correspondence between chemical structures and names (e.g. in the organic chemistry of pure compounds, which is where InChI and CAS most overlap). But for solids and bulk materials we shall need alternative approaches to InChI – it is not a universal solution.
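To make the point about generality and specificity concrete, here is a rough Python sketch of a record in which every kind of identifier is optional. The class `ChemicalRecord`, its field names and the sample entry are all hypothetical, invented for this example; they are not taken from PubChem, CAS or InChI software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChemicalRecord:
    """Hypothetical record: a name plus whatever identifiers happen to apply."""
    name: str
    inchi: Optional[str] = None                                # structure-based, may be absent
    registry_ids: Dict[str, str] = field(default_factory=dict)  # e.g. {"CAS": "..."}
    broader: Optional["ChemicalRecord"] = None                 # more general concept, if any
    observed_properties: Dict[str, str] = field(default_factory=dict)

    def has_structure_identifier(self) -> bool:
        return self.inchi is not None

    def lineage(self) -> List[str]:
        """Names from this record up to its most general ancestor."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.broader
        return names


# A pure compound: name, structure and registry number all line up.
water = ChemicalRecord("water", inchi="InChI=1S/H2O/h1H2",
                       registry_ids={"CAS": "7732-18-5"})

# A general substance class with no single structure to encode.
zeolite = ChemicalRecord("zeolite")

# A made-up sample, described by what was observed rather than by a structure.
sample = ChemicalRecord("zeolite sample (hypothetical)", broader=zeolite,
                        observed_properties={"powder diffraction": "pattern recorded"})

print(water.has_structure_identifier())
print(zeolite.has_structure_identifier())
print(sample.lineage())
```

The design choice worth noticing is that no single field can serve as the key for every record, which is the difficulty this post describes.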


2 Responses to Compounds, substances and identifiers

  1. Charles says:

    Can I add something?
    A chemical compound is a substance consisting of two or more different elements chemically bonded together in a fixed proportion by mass.
    There are some exceptions to the definition above. Certain crystalline compounds may be treated as chemical compounds despite varying in composition according to the presence or otherwise of elements trapped within the crystal structure. Some compounds regarded as chemically identical may have varying amounts of heavy or light isotopes of the constituent elements, which will make the ratio of elements by mass vary slightly. A compound therefore may not be completely homogeneous, but for most purposes in chemistry it can be regarded as such.
    Not all molecules are compounds. A diatomic molecule of hydrogen, represented by H2, is homonuclear — made of atoms of only one element, so is not regarded as a compound. Compounds are pure substances that contain two or more elements combined in a definite fixed proportion.

  2. pm286 says:

    (1) Thanks – the discussion continues on this blog
