This is the summary of a presentation I am giving tomorrow at ETD2007 (run by the Networked Digital Library of Theses and Dissertations). I’m blogging this as the simplest way of (a) reminding me what I am going to say and (b) acting as a very rough record of what I might have presented. (My talks are chosen from a menu of 500+ possible slides and demos, and I don’t know at the start of the presentation which ones I will use, so it’s very difficult to keep a historical record. The blog carries the main arguments.)
Main themes (many of which have been blogged recently):
- the thesis need not be a dull record of a final result but can be a creative work which lives and evolves until and beyond the “final submission”
- theses should be semantic and interactive, supported by ontologies and go beyond “hamburger PDF”. Theses are computable.
- We must develop communal semantic authoring/creation environments and processes.
- the process should move rapidly towards embracing open philosophies and methodology. Metadata and ontologies should be open.
- young people should be actively involved in all parts of managing the thesis process.
(Harvard Free Culture)
- “Web 2.0” will transform society and therefore the academic process. We must be prepared for this.
- It is not clear that current approaches to “repositories” will help rather than hinder innovation and dissemination of eTheses. They will only be useful for preservation if they are semantic.
In detail scientific theses need support for authoring and validating:
- thesis structure (templating) – e.g. USQ’s Integrated Content Environment (ICE) system, which supports XML/“Word”
- SVG (graphics)
- CML (Chemistry)
- GML (maps)
- Numeric data (various, including CML)
- graphs (various, including CML)
- tables (various, including CML)
- scientific units (various, including CML)
- ontologies and dictionaries (various, including CML)
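As a rough illustration of what "validating" numeric data and scientific units could mean in practice, here is a minimal Python sketch. It is not OSCAR or any existing validator: the element and attribute names are only loosely modelled on CML and are not schema-valid CML, and the unit dictionary `KNOWN_UNITS` and the function `validate_scalars` are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical unit dictionary; a real system would load a shared, open one.
KNOWN_UNITS = {"units:celsius", "units:kelvin", "units:g", "units:ml"}

# Illustrative fragment only: loosely modelled on CML, not schema-valid CML.
FRAGMENT = """
<thesisData>
  <scalar id="mp1" title="melting point" units="units:celsius">156.5</scalar>
  <scalar id="vol1" title="volume">25.0</scalar>
  <scalar id="mass1" title="mass" units="units:furlong">abc</scalar>
</thesisData>
"""


def validate_scalars(xml_text, known_units=KNOWN_UNITS):
    """Return a list of (id, problem) pairs for the scalar elements found."""
    problems = []
    root = ET.fromstring(xml_text)
    for scalar in root.iter("scalar"):
        sid = scalar.get("id", "?")
        units = scalar.get("units")
        # Every numeric value should carry units from the agreed dictionary.
        if units is None:
            problems.append((sid, "missing units"))
        elif units not in known_units:
            problems.append((sid, f"unknown units: {units}"))
        # The element content itself should parse as a number.
        try:
            float((scalar.text or "").strip())
        except ValueError:
            problems.append((sid, "value is not numeric"))
    return problems


if __name__ == "__main__":
    for sid, problem in validate_scalars(FRAGMENT):
        print(f"{sid}: {problem}")
```

The point of the sketch is only that once data is marked up rather than locked in a PDF, checks like these become a few lines of code that an author or examiner could run.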
Some exciting thesis projects:
Disruptive Technology Mathias Klang – this is the first PhD thesis in Sweden to be licensed under a Creative Commons license
Edinburgh Research Archive : Item 1842/433 Magnus Hagdorn open thesis in geosciences
usefulchem » Alicia Holsey One of the first chemistry theses created on a public Wiki
Why PDF is so awful: Organic Theses: Hamburger or Cow?
Subversion (CML project)
Wikipedia – caffeine – (info boxes)
GoogleInChI – semantic chemical search without Google knowing
The power of the semantic Web – dbpedia.org – Using Wikipedia as a Web Database.
Chemical blogspace – overview of exciting developments in chemistry
Local demos including analysis of theses:
- OSCAR1 chemical (thesis) validator, written by undergraduates
- Oscar3 – WWMM chemical linguistics including Named Entity recognition, links to
- The PubChem Project (a free database of chemical structures of small organic molecules and information on their biological activities), etc., and
- Chemical Entities of Biological Interest (ChEBI) a freely available dictionary of molecular entities focused on ‘small’ chemical compounds.
- Bioclipse (including display of molecules from dissertation on 2-Pyridon-katalysierte Esteraminolyse)
- The Blue Obelisk – Bowiki and their Greasemonkey. The Blue Obelisk Data Repository
- The Worldwide molecular matrix (WWMM) and CrystalEye (typical page).
- SPARQL Query Language for RDF on chemistry theses from St Andrews.
- MACiE Homepage – what chemical reactions SHOULD look like (CML) example
What should institutions and NDLTD do to promote this vision?
- involve young people in all parts of the process – understand Web 2.0 culture and democracy. Be brave
- help promote their vision against the conservatism of institutions, learned societies and commercial interests
- promote thesis creation as a complete part of the research process. Start on day 0 with tools, encouragement. Get students from year+1 to explain the vision
- Harness the power of social computing (Google, Flickr, Wikipedia, etc.). You will have to anyway. Give credit for innovation in this area
- Co-develop semantic authoring tools, including scientific languages. Use rich clients for display.
- Promote the use of ontologies and similar resources as integral parts of the scholarly process. Insist on marked up information and entities
- Use software to validate data in theses. Give these tools to examiners.
- insist that data belongs to the scientific community. Use Creative Commons licenses from day 0.
and overall… Use the power of the scholarly community to show that they can communicate science far better than the absurd e-paper, unacceptable business models, and repression of innovation that is forced on us by the commercial and pseudo-commercial publishers. Destroy the pernicious pseudo-science of citation metrics. Reclaim our scholarship.