Why is scientific data so badly communicated?

Ross Mounce and I are starting to extract content (“content-mining”) from BMC journals. [Why BMC only? Because most of the other major publishers refuse to let us do it even when we subscribe.] [Why not PLoS? For technical informatics reasons which I have communicated to PLoS and which they have taken on board.]

I am going to appeal frequently for like-minded people to form a community of Open Content Miners, so if you are interested, let us know.

Anyway we are going through BMC Evolutionary Biology and looking at data types. We are optimistic in general.

DISCLAIMER: I shall use examples from BMC because this is all I can access. I shall frequently be critical – BMC is no better or worse in most of these. My criticism of Elsevier, Wiley, Springer, Nature, RSC, ACS, etc. is an order of magnitude worse.

Anyway – here’s the first diagram I came across. I’ll say later HOW we extract, but for the moment look how badly the information is presented. That’s partly because of the slavery of the printed page (and “print” is the evil word because authors and publishers expect readers to print the page). Tell ME what you think is suboptimal about this figure (I have at least 3 complaints, some of which are very common). The diagram should scale to a size where the text is (just) readable.

UPDATE:

People have been reading this – I’d value your comments.

 

Posted in Uncategorized | 2 Comments

Let’s get rid of CC-NC and CC-ND NOW! It really matters

Many people now feel that the CC-NC and CC-ND licences are counterproductive – I estimate that in “Open Access” alone this is costing >> 1 billion USD in forbidding re-use and general paralysing FUD. Here’s a great exposition of why we must reform CC licences NOW!

Note that in science there is additional argument and evidence (http://www.pensoft.net/journal_home_page.php?journal_id=1&page=article&SESID=&type=show&article_id=2189 ) and Prof. Mike Carroll’s PLoS Biology article, *Why Full Open Access Matters*, at

> http://www.plosbiology.org/article/info:doi%2F10.1371%2Fjournal.pbio.1001210.

 

Here are Danny’s arguments – if you haven’t time just be convinced that CC-NC doesn’t work, stops the honest people working with material intended for them, is ambiguous, etc.

 

Danny Piccirillo

   

to Open

via: http://freeculture.org/blog/2012/08/27/stop-the-inclusion-of-proprietary-licenses-in-creative-commons-4-0/

 

---------- Forwarded message ----------
From: Students for Free Culture <webleader+rss-bot@freeculture.org>
Date: Mon, Aug 27, 2012 at 3:15 AM
Subject: [FC-discuss] Stop the inclusion of proprietary licenses in Creative Commons 4.0
To: discuss@freeculture.org

Over the past several years, Creative Commons has increasingly
recommended free culture licenses over non-free ones. Now that the
drafting process for version 4.0 of their license set is in full gear,
this is “[a once-in-a-decade-or-more opportunity][1]” to deprecate the
proprietary NonCommercial and NoDerivatives clauses. This is the best
chance we have to dramatically shift the direction of Creative Commons
to be fully aligned with the [definition of free cultural works][2] by
preventing the inheritance of these proprietary clauses in CC 4.0’s
final release.

The concept of free culture has its roots in the history of free
software (popularly marketed as “open source software”), and it’s an
important philosophical underpinning to the CC license set. As with free
software, the word “free” in free culture means free as in freedom, not
as in price, but Creative Commons has not [set or adhered to any
standard or promise of rights][3] or taken [any ethical position][4] in
their support of a free culture. The definition of free cultural works
describes the necessary freedoms to ensure that media monopolies cannot
form to restrict the creative and expressive freedoms of others and
outlines [which restrictions are permissible or not][5]. Although
Creative Commons provides non-free licenses, the fact that they
recognize the definition reveals a willingness and even desire to
change.

Creative Commons started off by focusing much more on flexibility for
rightsholders, but since its early days, the organization has moved away
from that position. Several projects and licenses have been retired such
as the Sampling, Founders’ Copyright, and Developing Nations License.
It’s obvious that something like Founders’ Copyright which keeps “all
rights reserved” for 14 years (before releasing into the public domain)
is not promoting free culture. Giving rightsholders more options and
easier ways to choose what rights they want to give others actually
reinforces permission culture, creates a fragmented commons, and takes
away freedom from all cultural participants.

**What’s wrong with NC and ND?**

The two proprietary clauses remaining in the CC license set are
[NonCommercial][6] (NC) and [NoDerivatives][7] (ND), and it is time
Creative Commons stopped supporting them, too. Neither of them provides
better protection against misappropriation than free culture licenses.
The ND clause survives on the idea that rightsholders would not
otherwise be able to protect their reputation or preserve the integrity of
their work, but all these [fears about allowing derivatives][8] are
either permitted by fair use anyway or already protected by free
licenses. The [NC clause is vague][9] and survives entirely on two even
more misinformed ideas. First is rightsholders’ fear of giving up their
copy monopolies on commercial use, but what would be considered
commercial use is necessarily ambiguous. Is distributing the file on a
website which profits from ads a commercial use? [Where is the line
drawn][10] between commercial and non-commercial use? In the end, it
really isn’t. It does not increase the potential profit from work and it
does not provide any better protection than Copyleft does (using
the ShareAlike clause on its own, which is a free culture license).

The second idea is the misconception that NC is anti-property or anti-
privatization. This comes from the name NonCommercial which implies a
Good Thing (non-profit), but its function is counter-intuitive and
completely antithetical to free culture (it [retains a commercial
monopoly][11] on the work). That is what it comes down to. The NC clause
is actually the closest to traditional “all rights reserved” copyright
because it treats creative and intellectual expressions as private
property. Maintaining commercial monopolies on cultural works only
enables middlemen to continue enforcing outdated business models and the
restrictions they depend on. We can only evolve beyond that if we
abandon commercial monopolies, eliminating the possibility of middlemen
amassing control over vast pools of our culture.

Most importantly, though, neither clause actually contributes to a
shared commons. They oppose it. The fact that the ND
clause [prevents cultural participants from building upon works][12]
should be a clear reason to eliminate it from the Creative Commons
license set. The ND clause is already the least popular, and
discouraging remixing is obviously contrary to a free culture. The
NonCommercial clause, on the other hand, is even more problematic
because it is not so obvious in its proprietary nature. While it has
always been a popular clause, its use has been in slow and steady
decline.

Practically, the NC clause only functions to cause problems for
collaborative and remixed projects. It prevents them from being able to
fund themselves and locks them into a proprietary license forever. For
example, if Wikipedia were under a NC license, it would be [impossible
to sell printed or CD copies of Wikipedia][13] and reach communities
without internet access because every single editor of Wikipedia would
need to give permission for their work to be sold. The project would
need to survive off of donations (which Wikipedia has proven possible),
but this is much more difficult and completely unreasonable for almost
all projects, especially for physical copies. Retaining support for NC
and ND in CC 4.0 would give them much more weight, making it extremely
difficult to retire them later, and continue to feed the fears that
nurture a permission culture.

**Why does this need to happen now?**

People have been vocal about this issue for a long time, and awareness
of the problematic nature of ND and NC has been spreading, especially in
the areas of [Open Educational Resources][14] (such as OpenCourseWare)
and [Open Access to research][15]. With the percentage of CC-licensed
works that permit remixing and commercial use having [doubled][16] since
Creative Commons’ first year, it’s clear that there is a growing
recognition that the non-free license clauses are not actually
necessary, or even good.

Both NC and ND are incompatible with free licenses and many, if not the
vast majority, of NC and ND licensed works will not be relicensed after
CC 4.0, so the longer it takes to phase out those clauses, the more
works will be locked into a proprietary license. There will never be a
better time than this. Creative Commons has been shifting away from non-
free licenses for several years, but if it does not abandon them
entirely it will fail as a commons and [divide our culture][17] into
disconnected parts, each with its own distinct licence, rights and
permissions granted by the copyright holders who ‘own’ the works.

In December of 2006, Creative Commons implemented a subtle difference
between the pages for free culture and non-free licenses: green and
yellow background graphics (compare [Attribution-ShareAlike][18] to
[Attribution-NonCommercial][19]). This was also when they began using
license buttons that include license property icons, so that there would
be an immediate visual cue as to the specific license being used before
clicking through to the deed. In February of 2008, they began using a
seal on free culture licenses that said “[Approved for Free Cultural
Works][20]”, which was another great step in the right direction. In
July of this year, Creative Commons released a [completely redesigned
license chooser][21] that explicitly says whether the configuration
being used is free culture or not. This growing acknowledgement of free
vs. non-free licenses was a crucial development, since being under a
Creative Commons license is so often equated with being a free cultural
work. Now, retiring the NC and ND clauses is a critical step in Creative
Commons’ progress towards taking a pro-freedom approach.

The NC and ND clauses not only depend on, but also feed misguided
notions about their purpose and function. With that knowledge, it would
be a mistake not to retire them. Creative Commons should not depend on
and nurture rightsholders’ fears of misappropriation to entice them into
choosing non-free CC licenses. Instead of wasting effort maintaining and
explaining a wider set of conflicting licenses, Creative Commons as an
organization should focus on providing better and more consistent
support for the licenses that really make sense. We are in the perfect
position to finally create a unified and undivided commons. Creative
Commons is at a crossroads. This decisive moment will in all likelihood
bind their direction to either being stuck serving the fears that validate
permission culture or creating a shared commons between all cultural
participants.

We don’t want the next generation of the free culture movement to be
saddled with the dichotomies of the past; we want our efforts to be
spent fighting the next battles.

**What should we do?**

There have been lots of discussions on the CC-license list about
promoting free culture licenses and discouraging proprietary ones. A
couple of proposals have been made to encourage the use of free licenses
over the non-free ones.

One is a rebranding of the non-free licenses. They could be
differentiated in a much more significant way than they currently are, such
as referring to NC and ND as the “Restricted Commons” or “Limited
Commons” or some variant thereof. License buttons could also be color
coded in the same way that license pages are (green for free culture
licenses, yellow for proprietary ones). Another proposal is to rename
NonCommercial to something more honest such as CommercialMonopoly.

While these proposals and other ideas are certainly worth supporting, we
should not lose sight of our ultimate goal: for Creative Commons to stop
supporting non-free licenses. We should not feel like this is impossible
to achieve at this point, as it will be much more difficult to do later.
More people than ever are starting to advocate against proprietary CC
licenses, and there is clear evidence and reasoning behind these
arguments. We have the power to prevent the inclusion of non-free
clauses in this upcoming version of the Creative Commons License set.

To join us in resisting the inclusion of proprietary clauses in CC 4.0,
there are a few important things you can do:

  * Send a letter to the [Creative Commons Board of Directors][22] about
your concerns.

  * Publish your letter or a blog post on the issue (and send it to the
list below)

  * Join the Creative Commons licenses development list to participate
in discussions of the 4.0 draft:
[http://lists.ibiblio.org/mailman/listinfo/cc-licenses][23]

  * Contribute to the CC 4.0 wiki pages:
[http://wiki.creativecommons.org/4.0][24]

   [1]: http://governancexborders.com/2011/09/17/cc-global-summit-2011-pt-iii-discussing-the-non-commmercial-module/

   [2]: http://freedomdefined.org/Definition

   [3]: http://mako.cc/writing/toward_a_standard_of_freedom.html

   [4]: http://mako.cc/copyrighteous/20040917-00

   [5]: http://freedomdefined.org/Permissible_restrictions

   [6]: http://freedomdefined.org/Licenses/NC

   [7]: http://robmyers.org/2010/02/21/why_nd_is_neither_necessary_nor_sufficient_to_prevent_misrepresentation/

   [8]: https://creativecommons.org/weblog/entry/26549

   [9]: http://news.cnet.com/8301-13556_3-9823336-61.html

   [10]: http://lists.ibiblio.org/pipermail/cc-licenses/2005-April/002215.html

   [11]: http://robmyers.org/2008/02/24/noncommercial-sharealike-is-not-copyleft/

   [12]: http://www.techdirt.com/articles/20110704/15235514961/shouldnt-free-mean-same-thing-whether-followed-culture-software.shtml

   [13]: https://commons.wikimedia.org/wiki/Commons:Licensing/Justifications

   [14]: http://kefletcher.blogspot.com/2011/10/why-not-nc-non-commercial.html

   [15]: http://www.plosbiology.org/article/info:doi%2F10.1371%2Fjournal.pbio.1001210

   [16]: https://creativecommons.org/weblog/entry/28041

   [17]: http://www.freesoftwaremagazine.com/articles/commons_without_commonality

   [18]: https://creativecommons.org/licenses/by-sa/3.0/

   [19]: https://creativecommons.org/licenses/by-nc/3.0/

   [20]: https://creativecommons.org/weblog/entry/8051

   [21]: https://creativecommons.org/weblog/entry/33430

   [22]: mailto:Hal%20Abelson%20%3Chal%40mit.edu%3E%2C%20Glenn%20Otis%20Brown%20%3Cgotisbrown%40gmail.com%3E%2C%20Michael%20Carroll%20%3Cmcarroll%40wcl.american.edu%3E%2C%20Catherine%20Casserly%20%3Ccathy%40creativecommons.org%3E%2C%20Caterina%20Fake%20%3Ccaterina%40caterina.net%3E%2C%20Brian%20Fitzgerald%20%3Cbrian.fitzgerald%40acu.edu.au%3E%2C%20Davis%20Guggenheim%20%3Cakhawkins%40mac.com%3E%2C%20Joi%20Ito%20%3Cjoi%40ito.com%3E%2C%20Lawrence%20Lessig%20%3Clessig%40pobox.com%3E%2C%20Laurie%20Racine%20%3Cracine%40lulu.com%3E%2C%20Eric%20Saltzman%20%3Cesaltzman%40pobox.com%3E%2C%20Annette%20Thomas%20%3CAnnette%40macmillan.co.uk%3E%2C%20Molly%20Van%20Houweling%20%3Cmsvh%40pobox.com%3E%2C%20Jimmy%20Wales%20%3Cjwales%40wikia.com%3E%2C%20Esther%20Wojcicki%20%3Cesther%40creativecommons.org%3E%2C%20

   [23]: http://lists.ibiblio.org/mailman/listinfo/cc-licenses

   [24]: http://wiki.creativecommons.org/4.0



#vivo12 my talk “Reclaim Our Scholarship”

“Power corrupts; Powerpoint corrupts absolutely” (Tufte)

My talk is through HTML links – you need to be on the web.

Reclaim Our Scholarship

[was: Bottom-up collaborations in the Internet Age]

VIVO12, Miami, US

Peter Murray-Rust, Unilever Centre for Molecular Sciences Informatics, University of Cambridge

Themes

  • Restrictive practices in #scholpub are costing billions
  • The direct fruits of science are denied to 99% of the human race – everywhere. The “scholarly poor”
  • Most scientific data (80%+) is lost. Much of the rest is walled up by publishers
  • The only answer is REAL OPEN.
  • Individuals and small groups can change the world
  • VIVO could be a key point in this change

Text for today – from SciVal (Elsevier) flyer
… “[In VIVO] [Elsevier] combine rich Scopus(R) publication histories, your institution’s own content and individual researcher data in semantic form, and share this information as linked open data.”

PMR mission – to create robots to liberate all published factual scientific content. “liberation software”

addendum
Zookeys CC-NC


#vivo12 What I might be going to say

IMPORTANT – INCLUDES INSTRUCTIONS FOR #VIVO12 DELEGATES

I am at the #vivo12 conference in Miami – VIVO (http://vivoweb.org/ Connect/Share/Discover) and will give the plenary lecture tomorrow. I have been given free rein and originally called this “Bottom-up collaborations in the Internet Age” or something. I’ll review some of the things that make collaboration work and the converse (there is of course no absolute recipe for success).

I haven’t prepared anything and don’t know what I shall say. This is deliberate.

I want to get a feel for the delegates and also for the potential for future action. The delegates seem to be (roughly in order):

  • University librarians
  • University techies
  • University managers
  • Commercial vendors into universities

VIVO seems to be very University-centric. I’m going to change my title to:

“Reclaim our scholarship” – and I’ll blog more on that later.

MY PRESENTATION – PLEASE READ

I have requested a second screen. It will be smaller and dedicated to one task – a twitter stream or twitterwall. This is so everyone can see what others are saying about the issues I raise and also me. The tweets represent the collective electronic consciousness of the delegates AND also those “listening” from outside. It can be extremely effective. In this way we get our message out, get feedback, talk to ourselves, etc.

If you’ve never used twitter (and some delegates haven’t – no shame in that) go to https://twitter.com/ and get yourself a username and password. If you don’t want to sign up to yet-another-social-networking-site there is no shame. Be aware that the whole world can read what you write. So be careful – people have been prosecuted for libel or harassment. But most people use it every day without problems. Use the string “#vivo12” in your posts as then everyone here will get it in the stream. If you want to try, connect today and give it a try. [It also interacts with other social media such as googleplus and facebook].

The tweets (only those with #vivo12) will be exposed as they come in. This gives a way for:

  • Making comments on what I present
  • Broadcasting what I present to the outside world
  • Broadcasting your comments to the outside world
  • Getting comments from the outside world. People outside might wish to raise issues for me/us to comment on (we have 15 mins discussion).

Twitter is ephemeral (though National libraries, I think, archive tweets). There is a tool http://en.wikipedia.org/wiki/Storify – it would be great to have a volunteer storify the session since there will be no video recording (I might record the audio myself). Storify would then allow an editor to build their own account of what I said, how you responded, where we ended up. You do not have to agree with me (many already don’t!).

 

Please comment on this – you may have ethical views on Twitter/Storify. The technical issues – we hope – will be small although the bandwidth is somewhat variable. Please try not to download movies during my presentation.

More later I hope.


Skolnik Symposium ACS 2012 #skolnik2012

Henry Rzepa and I are running the ACS Herman Skolnik award symposium tomorrow in the Philadelphia Convention Centre. We intend that this is inclusive and so will be running a twitterfall or similar so we can keep in touch with each other and also with the outside world. We also shall have some external presentations (at least 2 and maybe 4 – we shan’t know till tomorrow). We also have 2 demo sessions.

Therefore the primary coordination will be twitter on #skolnik2012. All internal audience will have wi-fi access (we are assured).

The plan is the following:

  • There is unlikely to be time for questions within the speaker’s 15 minutes (Henry and I get a huge 20 mins!). Handovers will be rapid and slick.
  • If you are in the internal or external audience and have a question or discussion point, tweet it during or after the talk.
  • We (or the speaker) will try to announce each speaker on the Tweet stream
  • If you are a speaker, use Twitter to answer your questions after your presentation.
  • Speakers may wish to post URLs on Twitter rather than expecting people to copy them.
  • There won’t be a video or audio stream so audience please comment on what is happening.
  • The chair will be RUTHLESS and switch you off when your time is up.

Speakers have been asked to keep the changeovers quick – so no need to spend time on lengthy anecdotes, etc.

[I haven’t yet decided what to say, so I’ll post that tomorrow.]


AnimalGarden present “The Chemical Chinese Room” at the American Chemical Society meeting

Henry Rzepa and I have been awarded the Herman Skolnik award of the ACS and will be running a 1-day symposium next week. In my own talk (20 mins) I’ll be looking to the future under the theme “Can we build artificially intelligent chemists?” 3 minutes of this has been hijacked by #animalgarden who have adapted John Searle’s idea of the “Chinese Room” http://en.wikipedia.org/wiki/Chinese_room to chemistry.

Here’s Frog and Zog asking Magic Chemical Panda a chemical question and getting an answer.

Who is MCP? What does he look like?

All will be revealed next Tuesday.

Meanwhile here’s a question for anyone:

“What’s the biggest current obstacle to creating artificially intelligent chemistry?”

Please make suggestions. The answer may surprise you.


Fee-free scholarly publishing

A short crowded blog post. I’m off to the ACS shortly and then to VIVO and am concentrating on presentations. Hope to blog those in the normal way.

After kicking off a discussion of publishing models on the [GOAL] mailing list with the traditional Green/Gold/Hybrid approaches I suggested that we should be looking at a “Fee-free” model. This isn’t new and it’s not my idea. Here’s Peter Suber reviewing the situation:

See William Walters and Anne Linvill (August 2010):  “While just 29 percent of OA journals charge publication fees, those journals represent 50 percent of the articles in our study.”

http://crl.acrl.org/content/early/2010/09/14/crl-132.abstract
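As a rough reading of those two percentages (a back-of-envelope sketch, not a figure taken from the Walters and Linvill paper): if 29% of journals carry 50% of the articles, the average fee-charging journal publishes about two and a half times as many articles as the average fee-free one. The function name is made up for illustration.

```python
def articles_per_journal_ratio(fee_journal_share, fee_article_share):
    """Average articles per fee-charging journal divided by
    average articles per fee-free journal.

    Both arguments are fractions between 0 and 1 (exclusive).
    """
    # Articles per journal, in units of (total articles / total journals).
    fee_rate = fee_article_share / fee_journal_share
    free_rate = (1 - fee_article_share) / (1 - fee_journal_share)
    return fee_rate / free_rate


# 29% of OA journals charge fees; they carry 50% of the articles.
ratio = articles_per_journal_ratio(0.29, 0.50)
print(round(ratio, 2))  # 2.45
```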

So there are large numbers of journals that charge nothing to authors and nothing to readers. And they want to remain that way. The problem is that the volume of articles is largely fee-supported. We thus have a strange paradox:

  • Lots of small journals prosper without charging excessive fees
  • The large journals charge more, rather than less.

The reason is that there is a vanity market. Fee-supported journals have to argue they produce a better product. And the only product that differentiates them is the market for glory. We hear mantras such as “Researchers must be able to publish where they want.”

Why? If the article is worth reading it will be found. The journal is now primarily a glory label – used not for the excellence of the contents but as an artificial market to determine career progression and funding in universities.

The economic cost of an article is about 250 USD. (Acta Crystallographica do it for 150 USD). Anything higher than that is either inefficiency or sheer profit.

[Stop ranting, PMR and get to the point…]

The point is that lots of people want to create [e-only] publications without these artificial commercial constraints. So we’ve agreed to explore how this can be done, on the OKF’s open Access list (where open Access always means BOAI-compliant). http://lists.okfn.org/pipermail/open-access/2012-August/000788.html summarises the discussion.

We want to collate information on what works at present. And summarize it so that would-be “publishers” can build on the work of others. We are envisaging a “Handbook of fee-free Open Access publishing” which helps people explore sustainable models.

One of the really valuable things about fee-free publishing is that no-one is in it for the money. So the “predatory Open Access” publishers – who publish low quality or even pirated OA material – have no place. There’s no pressure on people to find fees up front. There’s no pressure on libraries – everything is free.

If Steve Coast can get 250,000 people to build a fee-free map of the world, why can’t we do it for scholarly publishing? Why do editors have to come from academia? Why do we have to stick with the outmoded “journal” when we have all the tools to manage articles more productively? (If people want journals they can collect together the free material however they want.) If the arXiv can manage papers for (I think) 7 USD we don’t need to charge 10000 USD as Nature does. That’s for the glory.

And let’s remember that university libraries take > 10,000,000,000 USD every year from taxpayers and students to pay for journal subscriptions. And only 1% of the population can read it. If the fee-free community had 1% of this (100 million USD) and distributed it between – say – 1,000 fee-free startups, just to get them going, we would see some fantastic developments. So libraries, shouldn’t you be looking to create something new rather than simply fuelling the old, inefficient and avaricious?
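The arithmetic in that paragraph, spelled out as a quick sketch. The inputs are the round numbers quoted above (the library spend is a lower bound), and the variable names are only illustrative.

```python
# Round figures from the paragraph above; the spend is quoted as "> 10 billion USD".
library_spend_usd = 10_000_000_000
fee_free_share_percent = 1   # the 1% suggested above
startups = 1_000             # the "say, 1,000" fee-free startups

# Integer arithmetic keeps the amounts exact.
pool_usd = library_spend_usd * fee_free_share_percent // 100
per_startup_usd = pool_usd // startups

print(f"pool: {pool_usd:,} USD")                # pool: 100,000,000 USD
print(f"per startup: {per_startup_usd:,} USD")  # per startup: 100,000 USD
```

So each startup would get on the order of 100,000 USD of seed money under these assumptions.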

Anyway – please join the discussion on open-access. You won’t get shouted down with political Open Access slogans.

PROGRESS (two areas I have been urging on this blog)

  1. Glad to see OCLC releasing a part of Worldcat under a libre licence (ODC-BY)
  2. http://eu.wiley.com/WileyCDA/PressRelease/pressReleaseId-104537.html Wiley have changed their “Fully Open Access” model to one that really is BOAI-compliant (CC-BY). Well done Wiley. Other publishers, be brave – it won’t hurt and may gain you some credit.

 

 

 


Is this paper Open Access?

I have been sent a PDF of which I reproduce the front matter. Is it “Open Access”? Note that if I get the answer wrong I might get lawyers’ letters accusing me of copyright theft, breach of contract, etc. Note that I personally am unable to answer some of these questions authoritatively. To avoid typing, here is the metadata:

NeuroImage Volume 51 Issue 1 15 May 2010, Pages 91–101

 

There are some subsidiary questions:

  • Can I post it on the web? For commercial use? For any use?
  • Is it Green? Or Gold? BOAI compliant? Or something else? How did you tell?
  • Is it gratis? Is it libre? If so what permissions have been relaxed?
  • Can I send someone a copy? Anyone? Or just a non-commercial?
  • Does its location affect whether it is Open Access?
  • Has someone paid for Open Access? Would their funders be satisfied?

 


Elsevier replies about hybrid #openacess; I am appalled about their practices. Breaking licences and having to pay to read “Open Access”

[This is a long post – in summary Elsevier breach licences and charge readers for Open Access. I also ask the world to verify and amplify my conclusions].

A week ago I wrote to Elsevier’s Division of Universal Access about their “hybrid Open Access”. Put simply, this is where authors pay Author Processing Charges (perhaps 3000–5000 USD) to have their article made “Open Access”. Because “Open Access” is a poorly defined term I asked Elsevier for information about their “hybrid Open Access”.

My motivation is that I wish to examine compliance from the side of (a) the publisher (b) the authors, to see whether funder mandates are complied with. There is now increasing emphasis on policing this area, and this needs to be done with robotic tools. Hence I wish to examine:

  1. Whether a paper published as Open Access complies with reasonable expectations from the authors and funders
  2. Whether funders are getting compliance from their authors/grantees.

This is not easy and depends on having a machine-readable list of all or subsets of the published literature (perhaps 40 million documents in STM) and also knowing precisely what conditions apply to the article. Since the article can exist without the context of the publisher’s web-site (or an institutional repository site) it is essential that the article carry all information sufficient to determine whether it can be redistributed, redisplayed or re-used. (Many institutional repositories are seriously broken when it comes to the rights attaching to an article).
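To make “the article must carry its own rights information” concrete, here is a minimal sketch of the kind of robotic check I have in mind. It is only an illustration: the patterns, the function names and the sample strings are all assumptions, real articles word their rights statements in many more ways, and this is not the tool we actually use.

```python
import re

# Hypothetical patterns; a real checker would need many more variants.
PATTERNS = {
    "CC-BY-NC": re.compile(
        r"creative\s+commons\s+attribution[\s-]+non-?commercial"
        r"|\bCC[\s-]BY[\s-]NC\b",
        re.IGNORECASE,
    ),
    "CC-BY": re.compile(
        r"creative\s+commons\s+attribution(?![\s-]+non-?commercial)"
        r"|\bCC[\s-]BY\b(?![\s-]NC)",
        re.IGNORECASE,
    ),
    "all-rights-reserved": re.compile(r"all\s+rights\s+reserved", re.IGNORECASE),
}


def find_rights_statements(text):
    """Return the sorted names of every rights statement found in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def needs_human_review(text):
    """True when the article carries no rights statement, or conflicting ones."""
    return len(find_rights_statements(text)) != 1


# Made-up sample texts, not quotations from any real article.
clear = "This article is distributed under the Creative Commons Attribution-NonCommercial licence."
mixed = "All rights reserved. Distributed under the terms of the Creative Commons Attribution License."

print(find_rights_statements(clear), needs_human_review(clear))
print(find_rights_statements(mixed), needs_human_review(mixed))
```

The point of the second function is that a robot cannot act on an article that says nothing, or says two contradictory things, about what a reader may do with it.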

I therefore wished to get a list of all articles published by Elsevier under the hybrid scheme. Since the authors have paid so much money, I would expect that their articles would be made accessible (accessible means discoverable and identifiable as hybrid OA). I would expect a competent publisher to have created a public list of such articles. I would also expect the articles to be identified *in the article* as hybrid. If I had spent 3000+ USD for making my article visible this is the absolute minimum I would expect.

Most publishers’ websites are extremely bad at giving contact details for formal requests. Since I happen to know that Elsevier has a “Division of Universal Access” and I happen to know its director’s name I wrote to it. Note that I regard this purely as a business transaction, similar to asking my bank for details of its policy on selling Payment Protection Insurance. Indeed, I give Elsevier the same respect and trust as I give my bank – they are providers of paid-for services, not partners or collaborators.

The director wrote back by mail and on my blog. (I use the blog as a non-repudiation mechanism). /pmr/2012/07/27/i-ask-elsevier-for-their-list-of-articles-published-as-hybrid-open-access/#comment-113189. I’ll reproduce it in full, but highlight the essential parts and comment (PMR). Please read to the end as it gets worse as you go through.

“Hi Peter, and thanks for your message. Here are the answers to your 5 questions”.

1. What, if any, is Elsevier’s precise name for this scheme and where is it described?

Our hybrid open access publishing scheme is not currently a branded programme. We refer to these as ‘sponsored articles’. You can read more about these here (http://www.elsevier.com/wps/find/intro.cws_home/open_access) and here (http://www.elsevier.com/wps/find/authorsview.authors/sponsoredarticles).

PMR: Almost every publisher has meaningless and ambiguous phrases describing “open access”. “Sponsored access” is used by other publishers to mean “free to authors and readers sponsored by the journal”. It is impossible to determine from these words what the state of the article is.

2. How many articles in total have been published under this scheme?

2010 sponsored article numbers are available here (http://www.elsevier.com/framework_authors/Sponsoredarticles/pdfs/sponsoredarticlesNEW.pdf) and the 2011 sponsored article numbers are available here (http://www.elsevier.com/framework_authors/Sponsoredarticles/pdfs/Sponsored_Articles_2011.pdf).

PMR: A total of about 2000 articles. Note that “hybrid Gold” is normally the necessary mechanism to satisfy funders unless there are “Gold” OA journals available.

3. What explicit licence, if any, is used on the articles?

The majority use a bespoke license which is like a CC-NC license and described here (http://www.elsevier.com/wps/find/authors.authors/sponsoredarticles_user).

PMR. I agree that this is close to a CC-NC licence, but it is not technically a licence. I am unable to verify whether this information appears in the paper itself; see the much more serious concern in #4.

Some sponsored articles in physics only come with a CC-BY license and in these cases the CC-BY license is indicated in the body of the article at the end just before the reference, see this example (http://www.sciencedirect.com/science/article/pii/S0168900210008910).

PMR. I find:

The “OA” logo appears to be specific to Elsevier. On following the link to “Open Access policy” I find:

THIS IS COMPLETELY UNACCEPTABLE. The authors have purchased a CC-BY licence. Elsevier have broken this by refusing commercial re-use and distribution.

Now the body of the article:

There is no mention of Open Access in the front matter:

“All rights reserved”. Anyone reading this would expect that the reader has no rights.

By this stage most people would assume this was closed access of some sort. However I have been told to look at the last page of the main text:

This is near illiterate. “It is distributed under the terms of the Creative”. ? But – assuming that the key phrase is the next – CC-BY 3.0 FORBIDS Elsevier to apply restrictive clauses. If you were the authors, would you feel that Elsevier had given you service for your very large APC?

 

4. How are the articles labelled in the Elsevier journal (i.e. how is the licence and the Open Access information made apparent)?

There is an open access symbol and link in the top right of the article. For an example see here (http://www.sciencedirect.com/science/article/pii/S0168900207017020).

PMR: THIS IS APPALLING. This is asserted to be an Open Access article by the Director of Universal Access. And I have to pay 36 USD to read it. (I obviously cannot verify the logo as I have to pay for Open Access).

5. Where is the machine-readable list of all articles published under this scheme? I wish to download and analyze all of them.

ELS: Through your affiliation with Cambridge University you are able to text mine all our content, not only the open access articles.

PMR: This is nothing to do with my question. I wish a list of all the “hybrid Open Access” articles so I can determine how many – if any – I am allowed to read without paying. Without such a list it is impossible to discover Elsevier’s hybrid articles. Whether this is deliberate, or simply don’t-care, Elsevier are taking tens of billions out of the academic system and giving readers little or nothing in return.

At this time we do not publish a separate machine-readable list of all sponsored articles, but I will share this suggestion with appropriate colleagues involved in our various open access infrastructure projects.

If you would like to conclude our earlier discussion about a bulk data download to facilitate your text mining, please do let me know.

PMR: You can conclude it – I cannot. I asked for permission and you failed to give it to me. My librarians have no interest in or resources to “negotiate” with you.

ANALYSIS

Elsevier’s appalling practice speaks for itself. There are only the following explanations:

  • Elsevier break licences knowingly and deliberately charge for “open access”. [Readers will remember that Elsevier also created fake journals].
  • Elsevier are incompetent or uninterested in running Open Access properly.

I predict that Universal Access will plead “this was an isolated mistake; forgive us and we’ll correct it”. Rubbish. It is not acceptable for a publisher to charge people for things it has no right to charge for. It is unacceptable to break licences. Whatever the motives it shows that at best they don’t care. It’s morally the same as “sorry I knocked you down because my brakes didn’t work.”

READERS, CAN YOU HELP?

  • Do YOU know of any Elsevier hybrid articles?
  • Do they have the Open Access logo mentioned above? Yes, it seems they do.
  • Do non-subscribers have to pay for them? Only some of them – we don’t yet know how many (2 yes, 1 no)

UPDATE:

  1. See Steve Pettifer’s reply below (his article wasn’t paywalled)
  2. From Twittersphere:
    @andrewjpage had similar problem (charging for sponsored article), complained, now open access: http://ow.ly/cKtZ0

So there are at least two examples of Elsevier charging for Open Access – and I’d be surprised if there weren’t more.


Content Mining – 1 : The tree

Ross Mounce and I are now geared up for content mining of the bioscience literature, and I’ll be giving you some idea of the technology. There are many publishers who either won’t let us content mine (ACS, RSC) or who are “very helpful” but either don’t reply or mumble (e.g. Universal Access). So we are starting with good honest CC-BY as promoted by Gulliver Turtle and friends at BMC. So all our illustrations will come from BMC. (Why not PLoS? The restrictions are technical, not political, and we’ll analyse them later. We’d like to do PLoS).

So I shall give simple discussions on archetypal content mining and today we will start with the tree (or Larch, if you follow the Pythons).

Here’s a tree: from

And here’s the bottom bit magnified:

 

Questions:

  1. What makes a tree?
  2. Can machines determine whether it’s a tree (a) from the tree (b) from the caption (which we have as running UTF-8 text)?
  3. Why is this particular example not well suited for content mining?
  4. Are there other disciplines than bioscience where trees (in the abstract) occur?
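
As a starting point for question 1, here is one possible formal answer, not necessarily the one we will end up using for mining: in graph terms, a tree is a connected set of nodes joined by edges with no cycles. The sketch below is a minimal illustration in Python. The function name `is_tree` and the node labels are hypothetical, and it assumes that nodes and edges have already been extracted from the figure, which is the hard part.

```python
from collections import defaultdict, deque


def is_tree(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is a tree.

    Assumption: nodes is an iterable of hashable labels and edges is a
    list of (a, b) pairs already recovered from the diagram.
    """
    nodes = set(nodes)
    if not nodes:
        return False  # convention chosen here: an empty graph is not a tree
    # A tree on n nodes has exactly n - 1 edges.
    if len(edges) != len(nodes) - 1:
        return False
    adjacency = defaultdict(set)
    for a, b in edges:
        if a not in nodes or b not in nodes:
            return False  # edge refers to a node we do not know about
        adjacency[a].add(b)
        adjacency[b].add(a)
    # With n - 1 edges, the graph has no cycles exactly when it is
    # connected, so a breadth-first search from any node settles it.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == nodes


# Hypothetical example: three tips (A, B, C) and two internal nodes.
tree_nodes = ["root", "x", "A", "B", "C"]
tree_edges = [("root", "x"), ("root", "C"), ("x", "A"), ("x", "B")]
print(is_tree(tree_nodes, tree_edges))
```

A check like this only addresses the abstract structure. It says nothing about branch lengths, labels or support values, all of which a phylogenetic tree in a paper usually carries.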