I am speaking tomorrow in Lille to a group of Digital Humanists:
DRTD-SHS Seminar
“Research data in the digital humanities”
Session of 21 April 2015: “Mastering technologies to add value to data”
Venue: MESHS (room 2), 2 rue des Canonniers, 59000 Lille
=====
I always wait to meet the audience before deciding precisely what to say, but here are some themes:
- Most research and scholarship is badly published, not reaching the people who need it and not creating a dialogue or community of practice. This is a moral crime and leads to impoverishment of the human spirit and the health of the citizens of the world.
- The paradox of this century is that we have the potential for a new Digital Enlightenment, but in the Universities we are collaborating with those who, for their own personal gain, wish to restrict the distribution of knowledge. The large publishing corporations, taking support from media corporations, are building an infrastructure which they monitor and control.
- We have the technical means to break out of this. At contentmine.org we can scrape the whole of the literature published every day, create a semantic index for searching, and extract facts in far greater numbers than humans ever could.
- We are held back by the lack of vision, and our solution lies not in science, but in humanities. We lack a communal goal, communal values.
How can we harness the vision of Diderot and the Enlightenment and the radicalism of Mai 1968? How can we create the true culture of the digital century?
I shall show some of the tools we have developed at contentmine.org which can scrape and “understand” the whole of scholarly publication. In the UK, after an intense battle against the mainstream publishing community, we have won the right for machines to read and analyze electronic documents without fear of copyright claims. I express this as:
“THE RIGHT TO READ IS THE RIGHT TO MINE”
We need this in the rest of Europe – Julia Reda MEP has recently proposed this (and much more). There is again an intense backlash – so we need philosophers, political scientists, historians, literary scholars and economists to show why this freedom has to triumph.
All our tools are Open (Apache2, CC BY, CC0) and we have shown that “anyone” can learn to use them within a morning. They are part of the technical weaponry of digital liberation.
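To give a flavour of what “index” and “extract facts” mean in practice, here is a toy sketch in plain Python. It is not the ContentMine code and uses none of its APIs: the two sample sentences, the unit list and the function names are all invented for illustration. It builds a simple word index over a few documents and pulls out number-plus-unit measurements with a regular expression.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins for scraped documents (invented text, not real papers).
DOCUMENTS = {
    "doc1": "Samples of 5 mg were heated for 20 min before analysis.",
    "doc2": "The mixture of 12 mg was stored for 3 h.",
}

# Assumed pattern: a number followed by one of a short list of units.
MEASUREMENT = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|min|h)(?!\w)")


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index


def extract_measurements(documents):
    """Return (doc_id, value, unit) tuples for every measurement found."""
    facts = []
    for doc_id, text in documents.items():
        for value, unit in MEASUREMENT.findall(text):
            facts.append((doc_id, float(value), unit))
    return facts


if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    print(sorted(index["heated"]))
    for fact in extract_measurements(DOCUMENTS):
        print(fact)
```

A real pipeline has to cope with PDFs, publisher HTML and far messier language, but the shape is the same: machines do the exhaustive reading, and people decide which facts are worth asking for.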
Theses are the major resource over which publishers have no control. Much of our scholarship is published in theses as well as in journals; and much is only published in theses. My single laptop can process 5000 theses per day – or 1 million per year – which should suffice.
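The arithmetic behind those figures can be checked in a few lines. The 5000 per day comes from the text; the 200-day working year is my own assumption, chosen because it reproduces the round figure of 1 million.

```python
THESES_PER_DAY = 5000  # figure quoted in the text


def theses_per_year(per_day, days_per_year):
    """Total theses processed in a year at a constant daily rate."""
    return per_day * days_per_year


if __name__ == "__main__":
    # Assumption: the laptop runs on about 200 days of the year.
    print(theses_per_year(THESES_PER_DAY, 200))
    # Running every single day would give a higher total.
    print(theses_per_year(THESES_PER_DAY, 365))
```

So 1 million per year is a cautious estimate: it leaves about 165 days spare for downtime, and running continuously would give roughly 1.8 million.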
The solution will come through human-machine symbionts – communities of practice who understand what machines can and cannot do.