I am honoured to have been invited to speak at CopyCamp2017, “The Internet of Copyrighted Things”. I’ve not been to CopyCamp before, but I’ve been to similar events and I’m delighted to see it is sponsored by organisations, some of which I belong to, that are fighting for digital freedom. In these posts I’ll show why copyright has failed science; this post shows why knowledge is valuable and must be free.
I’m giving a workshop on Thursday and talking on Friday (after scares from Ryanair) and I’m blogging (as I often do) to clear my thoughts and help add to the static slides. This is the latest in a 40-year journey of hope, which is increasingly destroyed by copyright maximalism. I am being turned from an innovative scientist who had a dream of building something excitingly new to an angry activist who is fighting for everyone’s rights. I can accept when science doesn’t work because it often just doesn’t; I get angry when mega-capitalists are using science as a way to generate money and in their wake destroying something potentially wonderful.
Here’s the story. 45 years ago I had my first scientific insight – working with Jack Dunitz in Zurich – that by collecting many seemingly unrelated observations (in this case crystal structures) I could find new science by looking at the patterns between them (“reaction pathways”). This is knowledge-driven research, where a scientist takes the results of others and interprets them in different ways. It’s as old as science itself, exemplified in chemistry by Mendeleev’s collection of the properties of compounds and analysis in the Periodic Table of the Elements. Mendeleev didn’t measure all those properties – many will have been reported in the scientific literature – his genius was to make sense out of seemingly unrelated properties.
40 years ago chemists started to use computers to carry out simple chemical artificial intelligence – analysis of spectra and chemical synthesis. I was entranced by the prospect, but realised it relied on large amounts of knowledge to take it further. I was transformed by TimBL’s vision of the Semantic Web – where knowledge could be computed. I moved to Cambridge in 1999 with the long-term aim to create “chemical AI”. I created a dream – the WorldWide Molecular Matrix – where knowledge would be constantly captured and formalized, and where logic or knowledge engines would extract, or even create, new chemical insights.
To do this we’d need automatic extraction of information using machines – thousands of articles or even more. In 2005-2010 I was funded (with others) by EPSRC and JISC to develop tools to extract chemical knowledge from the scientific literature. It’s hard and horrible because scientific papers are not authored to be read by machines. I have spent years writing code to do this and now have a toolset which can read tens of thousands of papers a day (or more if we pay for clouds) and extract high quality chemistry. This chemistry is novel because it’s too expensive and boring to extract by hand and would be an important addition to what we have. As an example Nick Day in my group built CrystalEye which extracted 250,000 crystal structures, improved them and published them under an Open Licence – we’ve now joined forces with the wonderful Crystallography Open Database http://www.crystallography.net/cod/ . Later Peter Corbett, Daniel Lowe, and Lezan Hawizy built novel, Open, software for extracting chemistry from the text of papers.
So now I have everything I want – thousands of scientific articles every day, maybe 10-15% containing some chemistry, and a set of Open tools that anyone can use and improve. I’m ready to try the impossible dream – of building a chemical AI…
What will it find?
NOTHING. Because if I or anyone else uses it without the PUBLISHER’s permission, the University will be immediately cut off by the publisher because …
… because it might upset their market. Or their perceived dominance over researchers. This isn’t a scare or over-reaction – there are enough stories of scientists of many disciplines being cut off arbitrarily to show it’s standard. One day 2 years ago the American Chemical Society’s automatic triggers cut off 200 universities. Publishers send bullying mails “you have been illegally downloading content” (totally untrue), or “stealing” (also untrue).
This is now so common that many researchers and even more librarians are scared of publishers. This blog has outlined much of this in the past and it’s not getting better. My dream has been destroyed by avarice, fear and conservatism. I’ll outline the symptoms and what needs to be done, and urge citizens to own this problem and assert that they have a fundamental right to open scientific knowledge.
My slides at CopyCamp: https://www.slideshare.net/petermurrayrust/contentmining-and-copyright-at-copycamp2017 provide additional material.