I always enjoy having visitors to the Unilever Centre and encourage people to visit. Yesterday we had a visit from ca 16 staff and Masters students from the Pratt Institute in New York. They were here as part of a 2-week visit to Britain hosted by Anthony Watkinson of University College in London. They had a background of cultural studies and essentially no science. Nonetheless Anthony had asked me to put on something that would give them an insight into some of the practices and challenges in scientific scholarly publishing and related issues.
I hadn’t heard of the Institute, so I went to WP and found:

Charles Pratt (1830–1891) was an early pioneer of the natural oil industry in the United States. He was founder of Astral Oil Works in the Greenpoint section of Brooklyn, New York. He joined with his protégé Henry H. Rogers to form Charles Pratt and Company in 1867. Both companies became part of John D. Rockefeller’s Standard Oil in 1874.
Pratt is credited with recognizing the growing need for trained industrial workers in a changing economy. In 1886, he founded and endowed the Pratt Institute, which opened in Brooklyn in 1887.

Charles Pratt, Founder

This resonates with other visionaries of the time – I have a long association with Birkbeck College in London:

Working as a doctor in London, Birkbeck, with others, established the London Mechanics Institute in November 1823 – of which he was the first President. The Mechanics Institute concept was quickly adopted in numerous other cities and towns across the UK and overseas, but his association with the ground-breaking London institution was marked by it being renamed the Birkbeck Literary and Scientific Institution in 1866 (now, as Birkbeck College, part of the University of London).

Well, the one similarity is that we are “in a changing economy” so I hope that Charles Pratt would have looked favourably on what we did…
We are very well equipped for workshops at UCC and have 16 machines and a projector/beamer which can be booked for sessions. So rather than my pontificating I prepared some hands-on exercises. You can do this at home – it only needs a web-browser.
I started by asking them what significance they might attach to the number:
Not all of them got it immediately, so I asked them to Google for it and, of course, it’s a DOI for a scientific article. I then asked them to see what the components were – abstract, full text in HTML, full text in PDF, etc. Could they read the full text? Yes. If they went back to a hotel could they read it? They quickly realised no. Could they tell from the display that the difference was due to the university having a subscription to the journal? Yes – there was a rubric saying so, though I doubt that many undergraduates or staff in most institutions would notice it.
How did the DOI work? We found the DOI site. What did it provide? Could we have done all this through Google? etc. How did we know a DOI was unique? How do you identify a book? By ISBN… yes, but how do most of the population identify a book? By its Amazon stock number. Oh…
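For anyone trying this at home, the mechanics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it relies on the general DOI shape (a prefix beginning "10.", a slash, then a publisher-chosen suffix) and on the public resolver at https://doi.org/. The DOI in the usage note and the function names `split_doi` and `resolver_url` are made up for the example and do not refer to the article used in the workshop.

```python
from typing import Tuple
from urllib.parse import quote


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash.

    The prefix identifies the registrant (typically the publisher);
    the suffix is whatever the registrant chose for this item.
    """
    doi = doi.strip()
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the URL that asks the doi.org resolver to redirect to the article."""
    prefix, suffix = split_doi(doi)
    # Suffixes may contain characters that are not URL-safe, so percent-encode them.
    return f"https://doi.org/{prefix}/{quote(suffix, safe='/')}"
```

Calling `resolver_url("10.1234/example.5678")` on that hypothetical DOI gives `https://doi.org/10.1234/example.5678`. Pasting a URL of this form into a browser is what turns the name into an address: the resolver redirects to wherever the publisher currently hosts the article.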
That’s the sort of disruption that is changing the role of libraries. Names and addresses in TimBL’s world are conflated. The reality is the web. Current reality is an illusion unless it has a URI (==URL).
So now we know what the contents of a paper are, here were some exercises – with discussion in between. (You can use them if you want – this blog is CC-BY).
Goals. To investigate how data is published in leading journals.
Each team (2-3 people) should pick a publisher from:

  • Royal Society of Chemistry (Org. Biomol. Chem)
  • ACS (J. Org Chem)
  • Wiley (Angew. Chemie)
  • Beilstein Journal of Organic Chemistry
  • J Heterocyclic Chemistry
  • Molecules (MDPI)

And see if you can answer the questions:

  • Is the Journal Open or Closed access?
  • Can you access the fulltext? If so is it because you are on the Cambridge network?
  • Does the journal publish data embedded in full-text?
  • Does it publish data as supplemental/supporting info/data?
  • Is there a licence?
  • Can you understand it?
  • Does the author retain copyright?
  • Is the supplemental data copyrighted? By whom?

In some cases the answers are easy. In some cases we genuinely have no idea of the answers. Some publishers are very helpful… so maybe the rest could try to make their policies clearer.
Then we looked at the technology of scientific data.

  • Download OSCAR/Experimental data checker from RSC site (Google for it – I deliberately don’t give URLs any more)
  • Who wrote it? (Answer: some very bright chemistry undergraduates here)
  • What is the copyright?
  • What is the licence? (Open Source)
  • Run it. This worked a dream on Windows – clicking the jar fired up OSCAR.
  • Use it to extract data from one of the papers you have found (the paragraph needs to describe “Synthesis…” or “preparation of …”)
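
To give a flavour of what "extracting data" from an experimental paragraph means, here is a toy sketch in Python. It is emphatically not how OSCAR works; it is just a pair of regular expressions, and the sample sentence, the function name `extract_experimental_data` and the dictionary keys are all invented for illustration.

```python
import re
from typing import Dict, Optional, Tuple

# Matches melting-point ranges such as "mp 123-125 °C" or "m.p. 98.5–99 °C".
MP_PATTERN = re.compile(
    r"\bm\.?p\.?\s*(\d+(?:\.\d+)?)\s*[-–]\s*(\d+(?:\.\d+)?)\s*°C"
)
# Matches a percentage such as "85%" or "85 %"; assumed here to be the yield.
YIELD_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*%")

# A made-up sentence in the style of an experimental section.
SAMPLE = "Synthesis of compound 3: white solid (85% yield), mp 123-125 °C."


def extract_experimental_data(text: str) -> Dict[str, Optional[object]]:
    """Pull the first yield and melting-point range out of a paragraph."""
    yield_match = YIELD_PATTERN.search(text)
    mp_match = MP_PATTERN.search(text)

    yield_percent: Optional[float] = (
        float(yield_match.group(1)) if yield_match else None
    )
    mp_range: Optional[Tuple[float, float]] = (
        (float(mp_match.group(1)), float(mp_match.group(2))) if mp_match else None
    )
    return {"yield_percent": yield_percent, "mp_range_c": mp_range}
```

Running `extract_experimental_data(SAMPLE)` picks out a yield of 85 and a melting range of 123 to 125 °C. A real tool has to cope with far messier text, which is part of why data published in a structured form is so much more useful than data buried in prose.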

I indicated that text and information extraction also applied to all disciplines. Some asked why journal X did not publish XHTML. I have no idea. Inertia? Bad for business?…
Next exercise:

  • Load CrystalEye (Google)
  • Find the latest issue of your journal.
  • Is it abstracted by CrystalEye?
  • If not, why not? (Because the publisher does not allow or support the publication of crystallographic data as supplemental information)
  • Pick another journal. Find the latest issue in the TOC.
  • Pick a paper.
  • Marvel at Jmol. (Open Source molecular viewer from the Blue Obelisk)
  • Follow the DOI in CrystalEye to find the article.
  • Where in the article is the crystal structure described?
  • Where is the CIF file (crystallographic information file)?
  • What is its copyright? (Some publishers add their copyright to these files of facts. Did we agree with this? No, we didn’t.)

So now we have a clear idea of the importance of data and the role (positive and negative) of the scholarly publisher. We went into more fluid debate… Did we need fulltext in experimental reports? I showed them some of the word-free chemistry publishing we have been doing. Who is going to pay for Open Access? Who, indeed? What’s the turnover of scholarly publishing? (6 billion current units.) What’s the turnover of academia? A lot more (I’d like figures, but let’s scale by 100). Could the deans, provosts, principals, vice-chancellors, etc. start to take control of this economy? (Observation from the group: many of them are on the boards of prestigious publishers.) What will libraries do about subscriptions (TA) and funder-pays (OA) at the same time? Won’t they pay twice? We couldn’t answer that one.
But I bet that Charles Pratt or George Birkbeck would have.

