#animalgarden are very excited. They are going to Beyond-the-PDF-2 #btpdf2 to give a demo of AMI2. AMI2 is a tool to read the whole scientific literature and extract the factual data. That will be legal in the UK in October 2013, so they are preparing by reading Open Access articles. AMI2's mascot is AMI, the kangaroo. Here she talks to Chuff the @okfn_okapi (Open Knowledge expert) and Sleepless the project manager. Sleepless will contact PMR for human problems. Note that Chuff is bouncy, full of enthusiasm and wants to see the whole world Open. AMI has no emotions and doesn't understand humans. She can be taught algorithms, heuristics, semantics, logic. She never gets tired, angry, bored or makes a mistake (unless PMR has given her something incorrect).

The demo will show how Scientific Technical Medical (STM) PDFs can be converted to semantic form automatically. #ami2 has done a lot of this already. She wants PMR to work hard until #btpdf2 to give her as much power as possible. That means analysing a range of publishers. #ami2 has already done the CC-BY Open Access publishers BioMedCentral (#animalgarden likes Gulliver Turtle), eLife (no animal yet) and PeerJ (the blue-monkey-with-no-name). Now they have come to #elsevier:

S: Elsevier is not an OpenAccess publisher so almost all their content is closed and AMI2 cannot analyse it.

C: But there are some hybrid articles which are author-paid to be Open.

A: Where is the list? I can download them and analyse them.

S: There is no list.

C: Why not? If the authors have paid for them, Elsevier should list them.

S: That's what PMR thought. He asked @wisealic, the director of Universal Access (http://blogs.ch.cam.ac.uk/pmr/2012/08/05/elsevier-replies-about-hybrid-openacess-i-am-appalled-about-their-practices-breaking-licences-and-having-to-pay-to-read-open-access/), on 2012-08-05:

5. Where is the machine-readable list of all articles published under this scheme? I wish to download and analyze all of them.

At this time we do not publish a separate machine-readable list of all sponsored articles, but I will share this suggestion with appropriate colleagues involved in our various open access infrastructure projects.

C: How many articles were there? And how much had the authors paid?

S: About 2000 articles. Assuming 3000 USD each, that is SIX MILLION DOLLARS.

C: So Elsevier can afford to pay someone to make a list?

S: Yes. You can get a lot of human work for a small fraction of 6 million dollars.

C: I would have thought that Elsevier would want people to read these articles.

S: That's what PMR thought. But obviously #elsevier thinks otherwise (PMR: or doesn't think). Could AMI make a list by reading all the #elsevier splash-pages and seeing which are Open?

C: Elsevier doesn't label them consistently. Every publisher is different. So we don't know what to look for.

PMR: And why should WE do Elsevier's work for them? They take our money …

S: No rants, PMR. This is a constructive discussion. Last August @wisealic said they were working on the problem:

August 8, 2012 at 3:04 pm

Hi Peter,

We are currently investing in a major overhaul of our open access infrastructure and until that upgrade is complete do have various systems limitations in presenting open access content. To make our open access more clear and visible in the interim we've created various work-arounds. You have found two problems with these – many thanks for flagging them for our attention.

A: Six months have passed. What have they changed?

S: Here's @wisealic a few days ago:

What is Elsevier doing to ensure that OA content in hybrid journals is discoverable by institutions that do not subscribe to that title? My colleagues inform me that they are discussing the issue with an array of vendors to find a solution.

A: I understand an Array. It is a sorted list over which I can iterate. So Elsevier has


If you tell me what a vendor is and how to get the articles from it I can iterate over all the articles.

S: A vendor is somebody who sells things to libraries. I do not understand this reply. I'll ask PMR:

PMR: I do not understand it either. Elsevier does not need vendors to count 2000 articles and make a list. They simply need the will to do it and give the community what they have paid for and deserve. I do not feel Elsevier is behaving in a constructive manner. Since some of their "open access" articles appear still to be behind a paywall, I think such a list might give problems.

S: AMI, I cannot give you a milestone date for when Elsevier will release a list of its Open Access articles. Maybe @wisealic will at least give us a short list.

C: Does she read this blog?

S: I will ask PMR to mail her.

A: If you give me a list I will download the articles and iterate over them.

PMR: Let us hope @wisealic gives us some answers that we can act on.



  1. That's a great piece of software!
    I'd like to discuss this with you. Have you tried with PLoS and Frontiers papers?

    Looking forward to meeting you at #btpdf2


  2. alf says:

    It's possible to query PubMed Central to find all of Elsevier's "Sponsored Articles" (nearly 4000, so far) that have been deposited there: http://www.ncbi.nlm.nih.gov/pmc/?term=Elsevier+Sponsored+Documents%5Bfilter%5D

    I fetched all the XML into a GitHub repository last year: https://github.com/hubgit/elsevier-sponsored-documents and have added a list of the article file URLs: https://gist.github.com/hubgit/5081985 - PDFs are available for around 1000 of the articles.
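
The PubMed Central query in the comment above could probably also be run programmatically, which is the kind of list AMI asks for. The following is a minimal sketch, not anything taken from alf's repository: it assumes the NCBI E-utilities `esearch` endpoint with `db=pmc` and JSON output, and it assumes the `Elsevier Sponsored Documents[filter]` term from the URL above is still recognised. The function names (`build_search_url`, `parse_ids`, `fetch_all_ids`) and the page size are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint (assumed to be the right entry point here).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
# Search term copied from the PubMed Central URL in the comment above.
TERM = "Elsevier Sponsored Documents[filter]"


def build_search_url(term=TERM, retstart=0, retmax=500):
    """Build an esearch URL asking PMC for one page of matching record IDs."""
    params = {
        "db": "pmc",
        "term": term,
        "retmode": "json",
        "retstart": retstart,
        "retmax": retmax,
    }
    return ESEARCH + "?" + urlencode(params)


def parse_ids(payload):
    """Return (total match count, IDs on this page) from an esearch JSON reply."""
    result = json.loads(payload)["esearchresult"]
    return int(result["count"]), list(result["idlist"])


def fetch_all_ids(term=TERM, page_size=500, fetch=None):
    """Page through the search results and collect every ID.

    `fetch` is a hypothetical hook taking a URL and returning the response
    text; by default it performs a real HTTP request.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read().decode("utf-8")
    ids, start = [], 0
    while True:
        total, page = parse_ids(fetch(build_search_url(term, start, page_size)))
        ids.extend(page)
        start += page_size
        # Stop when the server has no more results to give.
        if not page or start >= total:
            return ids
```

Calling `fetch_all_ids()` would then give AMI a list she can iterate over. This only covers articles deposited in PubMed Central, so it is not a substitute for a complete list from the publisher.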

