CrystalEye links in Chemspider

I have agreed to review the CrystalEye data in Chemspider. Before reading this post you should read the background carefully (CrystalEye and Chemspider). The main points are that CrystalEye was not designed to be redistributable and that the method we were asked to use (InChI/URL/SDFile) leads to massive semantic loss and possible corruption. I have also undertaken to give an objective review and to make no judgments. This post will not endorse or criticize Chemspider per se.
I do not vouch for the accuracy of information in this post. I believe that what Chemspider had access to was connectionTable-URL pairs. It is possible to use these to download more information from our site if required.
I also stress very strongly that CrystalEye is a crystallographic site and that the relation of crystallography to chemistry is non-trivial.
I shall also assume that the CrystalEye collection in Chemspider might be discovered by someone who is not familiar with the organisation and motive of the site.
So, to report. After a few minutes I found the CrystalEye collection under (this link); I hope it is the correct place to start. It links through to a page describing CrystalEye, with further links to our homepage. It describes us as:

Physical Properties (including SAR/QSAR databases)
Information Aggregators
Organizational logo, personal photo or avatar, up to 50K in size. It will be shown on publicly accessible data source web page.
Approved: Yes

I do not know what “approved” means but it does not imply endorsement or approval by ourselves. (This is a general point for all aggregators.)
The page starts with the following heading (this may cause problems for some readers’ browsers as it is wide):
=======================================================

24539 hits found in 29.86 seconds. Search terms: DATA_SOURCE in (CrystalEye) AND SingleComponent AND NonIsotopic
1 2 3 4 5 6 7 8 9 10
ID | Structure | Empirical Formula | Molecular Weight | Monoisotopic Mass, Da | LogP | ACD/LogD (pH 5.5) | ACD/LogD (pH 7.4)
116 | (links, not shown) | C4H9NO2 | 103.1198 | 103.063329 | ACD/LogP: -0.64; XLogP: -0.70; ALOGPS: -2.99 | -3.15 | -3.14

=======================================================
I interpret this to mean that there are about 25000-30000 entries from CE which have been put into Chemspider. I do not know the exact number as CS can have multiple links.
The only information in this table that came from CE is the connection table (but not the depiction) and the jmol/“cell” link (see below). The ID is a Chemspider ID, not a CE link. The other columns are presumably generated by a program. There is nothing on the page to indicate what they mean. The average crystallographer visiting the site might conclude this had nothing to do with crystallography and leave at this stage (there is no link to CrystalEye from this page). I assume the data are computed LogP values, which are outside most crystallographers’ daily experience or requirements. I make no judgment on whether they are generally useful.
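For readers unfamiliar with the two mass columns, here is a minimal sketch of how such figures can be derived from the empirical formula alone. This is not Chemspider's code and I do not know how they compute their values. The atomic masses are standard tabulated values that I am assuming, and the names `parse_formula` and `mass` are hypothetical.

```python
import re

# Assumed atomic masses (NOT taken from the Chemspider page).
# AVERAGE: standard atomic weights, weighted over natural isotopes.
# MONOISOTOPIC: mass of the most abundant isotope of each element.
AVERAGE = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994}
MONOISOTOPIC = {"C": 12.0, "H": 1.007825032, "N": 14.003074005, "O": 15.994914620}


def parse_formula(formula):
    """Turn a simple formula such as 'C4H9NO2' into {'C': 4, 'H': 9, ...}.

    Handles only element symbols followed by optional counts
    (no brackets, charges or isotope labels).
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def mass(formula, table):
    """Sum the per-element masses from `table` over the formula."""
    return sum(table[element] * n for element, n in parse_formula(formula).items())


print(round(mass("C4H9NO2", AVERAGE), 4))        # molecular (average) weight
print(round(mass("C4H9NO2", MONOISOTOPIC), 6))   # monoisotopic mass
```

With these assumed masses the two numbers come out close to the 103.1198 and 103.063329 shown in the table, which is consistent with both columns being computed from the formula rather than measured.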
The “Structure” cell contained 4 links. I cannot depict them on this blog – you will have to click. The “jmol” (sic) linked to a page with 3 links “2D” “3D” and “Cell”. Jmol (sic) is a 3D program and should not be used for 2D diagrams. The 3D link did not work in Firefox 2. In IE it appeared to create a molecule in real time, without reference to the crystallography. This is very seriously misleading as a user of a crystallographic resource would expect the molecular structure displayed to be the one in the crystal. The “cell” resource brought up Jmol and displayed the same cell information as on our site. The molecule was depicted with a spurious bond – I do not know whether this is an artefact of the crystallography or Chemspider.
The link “116” takes the reader to a page which combines all the information for this compound (GABA). In this case, but not others, there is probably agreement about its identity. I will attempt to copy salient points:
====================================

Inherent Properties, Identifiers and References
ChemSpider ID: 116
Empirical Formula: C4H9NO2
Molecular Weight: 103.1198
Nominal Mass: 103 Da
Average Mass: 103.1198 Da
Monoisotopic Mass: 103.063329 Da
Quick Links: Permalink Similar Isomers
Systematic Name: 4-aminobutanoic acid
SMILES: O=C(O)CCCN
InChI: InChI=1/C4H9NO2/c5-3-1-2-4(6)7/h1-3,5H2,(H,6,7)
InChIKey: BTCSSZJGUNDROEUHFFFAOYAC

(Details...) Original Reference(s)

Gamma-aminobutyric acid (GABA) is an amino acid and the chief inhibitory neurotransmitter in the mammalian central nervous system. As such, GABA plays an important role in regulating neuronal [snipped, PMR]
====================================
The SMILES and InChI are presumably either used as the primary link for this page or computed. They are not taken from the CrystalEye site (which provides both). I do not know whether they have been verified against our site. Lower down we find:
CrystalEye: Link to Record
which does what it says: it links to a page on the CE site. Some people will find the linkage of the textual information on GABA and the links to the crystal structure useful.
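Since I do not know whether the identifiers have been verified against our site, here is a minimal sketch of one way two InChI strings for the same entry could be compared. It is only an illustration: the function names are hypothetical, the layer handling is simplified (a layer prefix that occurs twice, as can happen after a fixed-H layer, would be overwritten), and it says nothing about how either site actually works.

```python
# Layer prefixes that carry stereochemistry in an InChI string:
# b = double bond, t = tetrahedral, m and s = stereo type flags.
STEREO_PREFIXES = {"b", "t", "m", "s"}


def inchi_layers(inchi):
    """Split an InChI into (version, formula, {prefix: layer body})."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    parts = inchi[len("InChI="):].split("/")
    if len(parts) < 2:
        raise ValueError("InChI has no formula layer: %r" % inchi)
    version, formula = parts[0], parts[1]
    layers = {part[0]: part[1:] for part in parts[2:] if part}
    return version, formula, layers


def has_stereo(inchi):
    """True if any stereochemical layer is present."""
    _, _, layers = inchi_layers(inchi)
    return any(prefix in layers for prefix in STEREO_PREFIXES)


def same_skeleton(inchi_a, inchi_b):
    """True if formula, connectivity (c) and hydrogen (h) layers agree."""
    _, formula_a, layers_a = inchi_layers(inchi_a)
    _, formula_b, layers_b = inchi_layers(inchi_b)
    return (formula_a == formula_b
            and layers_a.get("c") == layers_b.get("c")
            and layers_a.get("h") == layers_b.get("h"))


# The InChI shown on the Chemspider page for ID 116.
gaba = "InChI=1/C4H9NO2/c5-3-1-2-4(6)7/h1-3,5H2,(H,6,7)"
print(inchi_layers(gaba))
print(has_stereo(gaba))
```

Two strings for which `same_skeleton` is true but `has_stereo` differs would be a sign that stereochemical information has been lost somewhere between the two sites.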
Lower down the page we find:
====================================

Names and Synonyms

Validated by Experts, Validated by Users, Non-Validated, Removed by Users, Redirected by Users, Redirect Approved by Experts

200-258-6 [EINECS/ELINCS]
Acide amino-4- butyrique [French]
butanoic acid, 4-amino-
Butyric acid, 4-amino-
g-Aminobutyric Acid
g-Amino-n-butyric Acid
Gamma aminobutyrate
gamma Aminobutyric acid
GAMMA(AMINO)-BUTYRIC ACID
gamma-Aminobutanoic acid
GAMMA-AMINO-BUTANOIC ACID
gamma-Aminobutryic acid
gamma-Aminobuttersaeure
Gamma-aminobutyric acid [JAN]
gamma-amino-n-butyric acid
omega-Aminobutyrate
Piperidate
Piperidinate
w-Aminobutyrate
.gamma.-Aminobutanoic acid
.gamma.-Aminobutyric acid
.gamma.-Amino-N-butyric acid
4-aminobutanoate
4-Aminobutanoic acid
4-aminobutyrate
4-AMINO-BUTYRATE
4-aminobutyric acid
4-amino-n-butyric acid
56-12-2 [RN]
Aminalon
AMINOBUTYRIC ACID,-4-, ALPHA
Butanoic acid, 4-amino- (9CI)
GABA
Gaballon
Gamarex
Gamastan
gamma-aminobutyrate
gamma-Aminobutyric acid
gamma-Aminobutyric acid (JAN)
gamma-Aminobutyric acid-carboxy-14C
Gammagee
Gammalon
Gammalone
Gammar
Gammasol
Gamulin
Mielogen
Mielomade
omega-Aminobutyric acid
Piperidic acid
Piperidinic acid
Reanal
w-Aminobutyric acid
4-Aminobutylate

(Details...) Database ID(s)

Validated by Experts, Validated by Users, Non-Validated, Removed by Users, Redirected by Users, Redirect Approved by Experts

A2129_SIGMA
A5835_SIGMA
A7463_SIGMA
AI3-26812
C00334
CCRIS 3721
CHEBI:16865
D00058
DF 468
DivK1c_000616
EPA Pesticide Chemical Code 030802
EU-0100005
KBio1_000616
KBio2_000429
KBio2_002997
KBio2_005565
KBio3_002190
KBioGR_001297
KBioSS_000429
Lopac-A-2129
MLS000028505
NCGC00015043-01
NCGC00024546-01
nchembio.78-comp12
NINDS_000616
NSC 27418
NSC27418
NSC32044
NSC45460
NSC51295
SMR000058285
SPBio_000996
Spectrum_000049
Spectrum2_001208
Spectrum3_001385
Spectrum4_000809
Tocris-0344
ZINC01532620

(Details...) Predicted Properties
LogP: ACD/LogP: -0.64; XLogP: -0.70; ALOGPS: -2.99
# of Rule of 5 Violations: 0
ACD/LogD (pH 5.5): -3.15
ACD/LogD (pH 7.4): -3.14
ACD/BCF (pH 5.5): 1
ACD/BCF (pH 7.4): 1
ACD/KOC (pH 5.5): 1
ACD/KOC (pH 7.4): 1
#H bond acceptors: 3
#H bond donors: 3
#Freely Rotating Bonds: 4
Polar Surface Area: 29.54 Å²
Index of Refraction: 1.465
Molar Refractivity: 25.68 cm³
Molar Volume: 92.8 cm³
Polarizability: 10.18 × 10⁻²⁴ cm³
Surface Tension: 46.2 dyne/cm
Density: 1.11 g/cm³
Flash Point: 103.8 °C
Enthalpy of Vaporization: 53.43 kJ/mol
Boiling Point: 248 °C at 760 mmHg
Vapour Pressure: 0.00798 m

====================================
PMR: I will have more to say on names and synonyms later but for those here I have no particular comment.
Readers should make their own judgment about the value of predicted properties. They should note that GABA is a solid at room temperature and so the concept of surface tension and several other properties is irrelevant. I do not know whether the other properties refer to the solid or liquid states but personally I would not use them for anything. I also observe that a machine reading this page (and even some humans) could easily not notice that the properties were not observed.
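To make the point about machine readers concrete, here is a minimal sketch of a record in which every property carries its provenance explicitly, so that software cannot mistake a prediction for a measurement. The class and field names are hypothetical and are not taken from Chemspider or CrystalEye. The two values are copied from the page above, where they are labelled as predicted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PropertyValue:
    """One property value with an explicit statement of where it came from."""
    name: str
    value: float
    units: Optional[str]
    provenance: str                # must be "predicted" or "observed"
    method: Optional[str] = None   # e.g. the program that made a prediction

    def __post_init__(self):
        if self.provenance not in ("predicted", "observed"):
            raise ValueError("unknown provenance: %r" % self.provenance)


def observed_only(properties: List[PropertyValue]) -> List[PropertyValue]:
    """Keep only values that are marked as experimentally observed."""
    return [p for p in properties if p.provenance == "observed"]


# Values as shown on the page for GABA, all of them predictions.
record = [
    PropertyValue("LogP", -0.64, None, "predicted", method="ACD/LogP"),
    PropertyValue("Surface Tension", 46.2, "dyne/cm", "predicted"),
]

print(observed_only(record))
```

A consumer that asks only for observed values gets nothing back for this record, which is the distinction a human reader currently has to infer from the section heading.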
In general – apart from the prediction of properties – the aggregation provided for this compound is probably useful to many people though probably not for mainstream crystallographers. It does, however, require a great deal of expert judgment to determine what properties are useful and which are seriously misleading. I would not, for example, recommend its use in undergraduate teaching.
I shall comment on two other entries later and in any case it’s a good point to break as this blog struggles with cut and paste.


3 Responses to CrystalEye links in Chemspider

  1. With regard to the loss of stereochemistry, our initial investigations (to be confirmed) suggest that one of the software tools we use has generated the problem. Contrary to your assertion that Open Source software will replace commercial software (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1130) and is of higher quality (and in the future it may be!), the evidence at present is that we have a long way to go.
    We changed our processes to use Open Source software and, assuming no user error, this has resulted in the stereo issues you are commenting on. I’ll comment on this and your many other posts when I come back from traveling, which starts tomorrow. Thanks for the feedback.

  2. Just a user tip for WordPress, regarding “in any case it’s a good point to break as this blog struggles with cut and paste”. You can easily take screengrabs, upload the image and link back either to the record view or simply to a fullscreen image for the users to review. It’s a lot more attractive and gives fewer headaches than your approach, I’d suggest.

  3. pm286 says:

    (1) Thanks – we should be able to make useful mutual progress
    (2) I used to be able to do this but there are local reasons in the
    way access is set up here that prevent my uploading images
