Textmining: NaCTeM and Elsevier team up; I am worried

A bit over two weeks ago the following appeared on DCC-associates: http://www.mail-archive.com/dcc-associates@lists.ed.ac.uk/msg00618.html

Mon, 07 Nov 2011 09:16:34 -0800

This press release may be of interest to list members.

University enters collaboration to develop text mining applications

07 Nov 2011

http://www.manchester.ac.uk/aboutus/news/display/?id=7627

The University of Manchester has joined forces with Elsevier, a leading provider of scientific, technical and medical information products and services, to develop new applications for text mining, a crucial research tool.

The primary goal of text mining is to extract new information such as named entities, relations hidden in text and to enable scientists to systematically and efficiently discover, collect, interpret and curate knowledge required for research.

The collaborative team will develop applications for SciVerse Applications, which provides opportunities for researchers to collaborate with developers in creating and promoting new applications that improve research workflows.

The University's National Centre for Text Mining (NaCTeM), the first publicly-funded text mining centre in the world, will work with Elsevier's Application Marketplace and Developer Network team on the project.

Text mining extracts semantic metadata such as terms, relationships and events, which enable more pertinent search. NaCTeM provides a number of text mining services, tools and resources for leading corporations and government agencies that enhance search and discovery.

Sophia Ananiadou, Professor in the University's School of Computer Science and Director of the National Centre for Text Mining, said: "Text mining supports new knowledge discovery and hypothesis generation.

"Elsevier's SciVerse platform will enable access to sophisticated text mining techniques and content that can deliver more pertinent, focused search results."

"NaCTeM has developed a number of innovative, semantic-based and time-saving text mining tools for various organizations," said Rafael Sidi, Vice President Product Management, Applications Marketplace and Developer Network, Elsevier.

"We are excited to work with the NaCTeM team to bring this expertise to the research community."

Now I have worked with NaCTeM, and actually held a JISC grant (ChETA) in which NaCTeM were collaborators and which resulted in useful work, published articles and Open Source software. The immediate response to the news was from Simon Fenton-Jones:

Let me see if I got this right.

"Elsevier, a leading provider of scientific, technical and medical information products and services", at a cost which increases much faster than inflation, to libraries who can't organize their researchers to back up a copy of their journal articles so they can be aggregated, is to have their platform, Sciverse, made more attractive, by the public purse by a simple text mining tool which they could build on a shoestring.

Sciverse Applications, in return, will take advantage of this public largesse to charge more for the journals which should/could have been compiled by public digital curators in the first instance.

Hmmm. So this is progress.

Hey. It's not my money!

[PMR: I think it's "not his money" because he writes from Australia, but he will still suffer]

PMR: I agree with this analysis. I posted an initial response (http://www.mail-archive.com/dcc-associates@lists.ed.ac.uk/msg00621.html):

No – it’s worse. I have been expressly and consistently asking Elsevier for permission to text-mine factual data from their (sorry OUR) papers. They have prevaricated and fudged and the current situation is:

“you can sign a text-mining licence which forbids you to publish any results and hands over all results to Elsevier”

I shall not let this drop – I am very happy to collect allies. Basically I am forbidden to deploy my text-mining tools on Elsevier content.

P.

 

I shall elaborate on this. I was about to write more, because I completely agree about the use of public money and the lack of benefit to the community. However I have been making enquiries and it appears that public funding for NaCTeM is being run down – effectively they are becoming a “normal” department of the university – with less (or no) “national” role.

However the implications of this deal are deeply worrying – because it further impoverishes our rights in the public arena and I will explain further later. I’d like to know exactly what NaCTeM and the University of Manchester are giving to Elsevier and what they are getting out of it.

This post will give them a public chance – in the comments section, please – to make their position clear.

 

10 thoughts on “Textmining: NaCTeM and Elsevier team up; I am worried”

  1. Casey Bergman

    Dear Peter -

    As an OA advocate, I share many of your sentiments in this post, though I would like to make a few points of clarification for you and others regarding this post.

    1. The deal you write about is NOT at the University level but restricted to Nactem (as far as I am aware). It should not be assumed by you or your readers that all Text Mining researchers at the University of Manchester endorse or participate in this partnership with Elsevier.

    2. Similar to you, I have tried in the past to convince Elsevier to allow us to mine their (sorry, our) content and redistribute the data mined from it, with no luck. This was a very labor-intensive and ultimately unproductive interaction. In retrospect, I probably should not have gone down this road, but at the time (~2010) it seemed like Elsevier was making noises to open up their content and I thought that a push from users might help the process along. Personally, I won’t attempt to use Elsevier content in projects run solely from my lab in the future, though I am not opposed in principle to others doing so.

    3. I am aware of others who have been able to arrange deals to re-distribute facts mined from Elsevier content (e.g. the developers of the BioNot system, http://www.biomedcentral.com/1471-2105/12/420). So there may be some light at the end of the tunnel in terms of Elsevier opening their content to mining and redistribution, so it is worth continuing to bang the drum on this issue!

    Best regards,
    Casey

    1. pm286 Post author

      “1. The deal you write about is NOT at the University level but restricted to Nactem (as far as I am aware). It should not be assumed by you or your readers that all Text Mining researchers at the University of Manchester endorse or participate in this partnership with Elsevier”

      Sorry – I don’t agree: From the press release (I have highlighted the first two paras and it is clear that the University is claiming this as its own initiative):

      University enters collaboration to develop text mining applications 07 Nov 2011 http://www.manchester.ac.uk/aboutus/news/display/?id=7627

      The University of Manchester has joined forces with Elsevier, a leading provider of scientific, technical and medical information products and services, to develop new applications for text mining, a crucial research tool.

      In any case it is the University of Manchester to whom I will be making FOI requests. They are the legal body.

      “2. Similar to you, I have tried in the past to convince Elsevier to allow us to mine their (sorry, our) content and redistribute the data mined from it, with no luck. This was a very labor-intensive and ultimately unproductive interaction. In retrospect, I probably should not have gone down this road, but at the time (~2010) it seemed like Elsevier was making noises to open up their content and I thought that a push from users might help the process along. Personally, I won’t attempt to use Elsevier content in projects run solely from my lab in the future, though I am not opposed in principle to others doing so.”

      That’s fine as long as you are happy with a very small part of the literature – you may be. I want to publish *all* chemical reactions.

      “3. I am aware of others who have been able to arrange deals to re-distribute facts mined from Elsevier content (e.g. the developers of the BioNot system, http://www.biomedcentral.com/1471-2105/12/420). So there may be some light at the end of the tunnel in terms of Elsevier opening their content to mining and redistribution, so it is worth continuing to bang the drum on this issue!”

      This is very interesting. There is no mention in the article of having either obtained permission from Elsevier or having entered into a collaboration. If you have extra information that there was explicit collaboration, please let us know, because at present this looks like a simple breach of contract (unless their university has a different contract). Unfortunately, since I am championing rights rather than simply unchallenged breaches, I cannot adopt this approach.

      1. Casey Bergman

        1. Yes, obviously the University acts as the legal representative of any group that signs an NDA with an external body – groups cannot sign contracts themselves. However, that does not mean the signed agreement applies to all members of the University. This is my point – clear and simple. So please don’t tar all members of the University of Manchester with the same brush. Moreover, if you are so interested in championing rights, then please protect my right (like many researchers at the University who are not involved in this deal) not to be accused of something I had nothing to do with. I’m sure I could find dirt on some ethically dubious agreement that a group at the U. of Cambridge has made – but I would not hold the entire University of Cambridge or you personally accountable for this. Lastly, I would expect you to be more discerning than to take an academic press release as actually representing the (full) truth of any matter.

        2. There are many means to an end.

        3. I cannot speak for others, so I suggest you contact these authors directly for more information. From the high quality of work from these scientists, I suspect that they are clever enough not to have made any egregious legal blunder.

        1. pm286 Post author

          There is no need to get upset. I am not impugning you or probably most employees of the University of Manchester. The only information I have comes from this press release – you suggest it may not “actually represent the (full) truth of [the] matter”. You are welcome to publish additional information on this blog (as long as it is legally acceptable).

          The University of Manchester is the legal entity responsible for this. It appears from the press release that it welcomes this contract and wishes to promote its value. I am not “tarring” any individuals – I have never suggested that you have anything to do with this contract. When I said I did not agree with you, it was with your assertion that the agreement was restricted to NaCTeM – which it formally is not. For example, I assume that the University, not NaCTeM, sets the financial constraints on the contract (e.g. how much public money, if any, is involved). Anyway we shall soon know, I hope.

          1. Casey Bergman

            Not upset at all, just trying to prevent readers of this blog from getting the wrong impression that there is an alliance with Elsevier that is promoted universally at the University of Manchester. Also, thanks for clarifying that your issue is with the abstract legal entity of the University of Manchester and not any of the individuals who work there.

            I did not suggest that the press release in question contained inaccuracies. I simply stated that I was surprised that you give it much credence, since surely you must be aware that academic press releases are essentially advertising that contain a mixture of fact and marketing. In this case, it appears that the advertising may be having a more negative effect than anticipated.

            Finally, to the crux of the matter, whether there is any financial arrangement that should make you worried: from my first-hand experience of considering NDAs (circa 2010) to access Elsevier Sciverse APIs and content at the University of Manchester (independently of Nactem), I can assure you that in our case there was no payment requested by Elsevier and none made by my group, Faculty or the University. I would not have considered going past stage 1 if it had required payment. Therefore, I assume this is the nature of the basic arrangement made by Elsevier with all academic groups, and in the absence of other information what I would assume is referred to in the press release.

          2. pm286 Post author

            >>>I did not suggest that the press release in question contained inaccuracies. I simply stated that I was surprised that you give it much credence, since surely you must be aware that academic press releases are essentially advertising that contain a mixture of fact and marketing. In this case, it appears that the advertising may be having a more negative effect than anticipated.

            You probably know better than me what is going on. I am fully familiar with press releases that contain only fuzz – and you suggest this may be one. However it reads like a formal agreement. I shall know soon enough!

            >>Finally, to the crux of the matter, whether there is any financial arrangement that should make you worried: from my first-hand experience of considering NDAs (circa 2010) to access Elsevier Sciverse APIs and content at the University of Manchester (independently of Nactem), I can assure you that in our case there was no payment requested by Elsevier and none made by my group, Faculty or the University. I would not have considered going past stage 1 if it had required payment. Therefore, I assume this is the nature of the basic arrangement made by Elsevier with all academic groups, and in the absence of other information what I would assume is referred to in the press release.

            I assume your agreement was similar to that outlined in my previous blog – that all derivative works belong to Elsevier. I have no particular problem with your position – you are probably doing useful science and do not need to make the material public. I don’t accept it myself. I do require to make all results public.

  2. Pingback: Unilever Centre for Molecular Informatics, Cambridge - What is the basis of the NaCTeM-Elsevier agreement? FOI should give the answer « petermr's blog

    1. Casey Bergman

      >> I assume your agreement was similar to that outlined in my previous blog – that all derivative works belong to Elsevier. I have no particular problem with your position – you are probably doing useful science and do not need to make the material public. I don’t accept it myself. I do require to make all results public

      The assumption that we use Elsevier content is incorrect. We did sign the first (of 2) NDAs to see if Elsevier were serious in allowing us to mine and release data, but we *did not* sign the second developer NDA for a number of reasons (many similar to what you outline in your next post) and returned all content/APIs/documentation. To be clear, in my group, we do not use Elsevier content or APIs and make all of the results from our text mining research public.

  3. Sophia Ananiadou

    Dear Peter Murray-Rust,

    Let me clarify the facts regarding the collaboration between Elsevier and the University of Manchester.

    1. We do not have privileged access to Elsevier content. NaCTeM has no access to full content from Elsevier for text mining purposes.

    2. Public money has not been used to provide services exclusively for SciVerse. The two services we currently provide to SciVerse are based on MEDLINE abstracts and are freely available to the international academic community via our web site. These are Acromine and KLEIO and have been available for several years.

    Regards,
    Prof. Sophia Ananiadou
    National Centre for Text Mining, Director
    University of Manchester
