JailBreaking the PDF

The Scholarly Revolution #scholrev is forging ahead. Alexander Garcia Castro is running a fantastic hackathon in Montpellier immediately after the SePublica Polemics workshop.

 

Join us in Montpellier for a one-day event to hack on scholarly PDFs!

Do you have tools that may help us to extract information from PDFs?
Send us an email so that we can include them in the hackathon.

Would you like to extract citations from existing PDFs?

Wouldn’t it be cool if we, scholars, did not have to pay for citation
data? What about author disambiguation?

Are you interested in identifying and extracting meaningful parts from PDFs?

Would you like to have XML/RDF for scholarly PDFs? What if you could
have access to the actual content of the PDF for supporting the Web of
Data?

We are interested in all of these issues. Send us your tools, ideas,
comments and join us in Montpellier. We are also supporting remote
participation in the hackathon via Hangout and WebEx.

Visit us at http://scholrev.org/hackathon/

casey.mclaughlin@cci.fsu.edu
alexgarciac@gmail.com


Alexander Garcia
http://www.alexandergarcia.name/
http://www.usefilm.com/photographer/75943.html
http://www.linkedin.com/in/alexgarciac


One of the important aspects of a revolution is having the right tools, and this hackathon will collect what we’ve got and work out how to deploy them. “Jailbreaking” PDFs is not easy. It’s complex and it’s messy. But we are getting to the stage where we have the tools to:

  • Download PDFs from the open web.
  • Turn them into semantic form.
  • Filter the semantics and repurpose them – everything from metadata to citations to chemistry to phylogenetic trees.
  • Build a community.

And since we work with open source, everything we do is a step forward. Once we have solved a problem it can’t be unsolved (unlike commercial closed tools, which are often withdrawn or locked). There’s a great deal we can do with collaborative action (each person can add a stone to the building).

All we have to do is care enough.
