lemon8-XML and theses

Via Peter Suber. Although the full post is important for Open Access news, I concentrate on an XML tool I hadn’t heard of:
15:13 13/08/2007, Peter Suber, Open Access News
Dean Giustini, UBC’s John Willinsky – Stanford Takes Him (For Now), Open Medicine blog, August 12, 2007. Excerpt:

UBC’s Dr. John Willinsky is no stranger to open access advocates. His book The Access Principle is ‘required reading’ for all those who believe in the connection between access to information and the economic and social well-being of knowledge-based societies. Recently, John accepted an appointment at Stanford University….
[…]
As for what’s next for PKP, we will be releasing the next version of OJS, in a few months time, in association with our parallel release of Lemon8-XML, developed by MJ Suhonos, which will automate XML conversion from Word and ODT documents.

PMR: So I looked it up:

Lemon8-XML

Lemon8-XML is a web-based service designed to make it easier to convert academic papers from typical word-processor editing formats such as MS-Word .DOC and OpenOffice .ODT, to publishing layout formats such as XML. It provides the ability to edit document metadata such as the list of authors, as well as robust citation editing, checking and correction.
Lemon8-XML is a project developed by the Public Knowledge Project, as a demonstration of technology that can help significantly decrease the cost and effort of scholarly publishing. Although it is a standalone service, Lemon8 works well with journals published using Open Journal Systems.
Much of the work involved in Lemon8 has been developed from years of journal publishing experience, and continues to take advantage of the newest web-based technology as it becomes available.
We will soon be creating a mailing list for interested developers and beta-testers, along with some documentation, an FAQ, and a PKP discussion forum for Lemon8-XML.
If you’d like to be kept up-to-date on Lemon8-XML developments, please let us know.

PMR: This is very exciting for our SPECTRa-T (Submission, Preservation and Exposure of Chemistry) project, where we are capturing metadata from academic theses. Although the preferred method of presentation is PDF, these theses are originally born-digital as Word or LaTeX. But these versions are often hidden away and not reposited. The PDF looks so wonderful, doesn’t it? Surely no-one wants that ugly Word doc? But for use it’s a hundred times better. And if Lemon8-XML can capture authors and other metadata, that’s a really important advance.
The more structured the document is, the better we can analyze it. For example, it’s not a good idea to look for chemical names in author lists. (Murray-“Rust” could be indexed as Fe3O4 and PMR as proton magnetic resonance.) But normal Word documents just contain a sequence of paragraphs, usually with no sections. Text in 12-point bold is not obviously a chapter heading, an author, or a citation.
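To make the point about structure concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the element names (`thesis`, `author`, `chapter`, `para`, `citation`) are a made-up schema, not the actual Lemon8-XML output, and the two-entry lexicon simply mirrors the examples above. It compares a naive scan of the flattened text with a scan that skips author and citation elements.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical structured thesis. The element names are illustrative only
# and are NOT the real Lemon8-XML output format.
DOC = """<thesis>
  <author>P. Murray-Rust</author>
  <chapter>
    <title>Corrosion</title>
    <para>The rust layer was characterised by PMR spectroscopy.</para>
  </chapter>
  <citation>Murray-Rust, P. (2007). PMR blog.</citation>
</thesis>"""

# Toy lexicon mirroring the examples in the text (an assumption, not a real dictionary).
LEXICON = {"rust": "Fe3O4", "pmr": "proton magnetic resonance"}
TERM_RE = re.compile(r"\b(" + "|".join(LEXICON) + r")\b", re.IGNORECASE)

# Elements whose text should not be searched for chemical terms.
SKIP_TAGS = {"author", "citation"}


def scan(text, tag):
    """Return (tag, matched word, lexicon meaning) for each term found in text."""
    if not text:
        return []
    return [(tag, m.group(0), LEXICON[m.group(0).lower()])
            for m in TERM_RE.finditer(text)]


def find_terms(elem):
    """Walk the tree, ignoring everything inside the skipped elements."""
    if elem.tag in SKIP_TAGS:
        return []
    hits = scan(elem.text, elem.tag)
    for child in elem:
        hits += find_terms(child)
        # Text following a child still belongs to the current element.
        hits += scan(child.tail, elem.tag)
    return hits


root = ET.fromstring(DOC)
flat_hits = scan("".join(root.itertext()), "flat")  # no structure: scan everything
structured_hits = find_terms(root)                  # structure-aware scan

print(len(flat_hits), "hits in flat text")
print(structured_hits)
```

With this toy input the flat scan also picks up "Rust" and "PMR" from the author and citation lines, while the structure-aware scan reports only the two terms in the body paragraph. A plain Word document with no sections gives us only the first option.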
I couldn’t find a download button. (I am assuming that it is Open Source, given that it comes from the home of Open Journals. No logical connection, of course, but…)
NOTE ADDED LATER:
There is a forum http://pkp.sfu.ca/support/forum/ for lemon8-xml and some slides from a meeting: Lemon8-PKP-Conference.pdf
The slides have a bit more information suggesting this is an early adopter tool at present. I have written asking for more info and will post when it appears. Since they have other Open Source software on their site it should be a good bet that lemon8-xml is Open.

