Heather Morrison shows how “locked” PDFs disadvantage the print disabled.
Via Peter Suber: Heather Morrison, Unlock the PDFs, for the print disabled (and open access, too), a posting to SOAF and other lists, November 6, 2007. Excerpt:
For the print disabled, the difference between a PDF that is locked down and one that is not is the difference between a work that is accessible and one that is not.
A locked PDF is an image file, with inaccessible text. An unlocked PDF has text that is accessible, that can be manipulated by screen readers designed for the print disabled. Even without special equipment, it is easy to see how an unlocked PDF can very easily be transformed into large print, or read aloud.
Publishers, please unlock your PDFs! Librarians, please ask about unlocked PDFs when you purchase.
The Budapest Open Access Initiative did not aim to meet the needs of the print disabled. This is just another side-benefit of open access.
PS: Comment. Exactly. If publishers insist on using PDFs at all, then at least they should unlock them. To facilitate re-use even further, they should offer HTML or XML editions alongside the PDFs.
PMR: It isn’t just the print-disabled. After working with PDFs for several years I am now brain-disabled. And they are destroying our productivity. I’m serious. Our SPECTRa-T (Submission, Preservation and Exposure of Chemistry in Theses) project has been looking at how to extract chemistry from PDF theses. It’s worse than I ever thought. Some theses emit non-printing characters. I can’t show them in this post because they are non-printing. But they break XML files. Just one of many PDF bugs that have slowed us down.
siht ekil tuoemac eno rehtonA
It really did.
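For anyone hitting the same non-printing-character problem, here is a minimal sketch of one possible workaround, not the code SPECTRa-T actually uses. It assumes the text has already been extracted from the PDF into a Python string, and it strips every character that XML 1.0 does not allow before the text goes anywhere near an XML file. The function names `strip_invalid_xml_chars` and `parses_as_xml` and the sample string are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# XML 1.0 permits tab, line feed, carriage return and the ranges below.
# Anything else (most control characters, for example) makes a parser give up.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = "") -> str:
    """Return text with every character that XML 1.0 forbids replaced."""
    return _INVALID_XML_CHARS.sub(replacement, text)


def parses_as_xml(text: str) -> bool:
    """Wrap text in a hypothetical <para> element and check that it round-trips."""
    element = ET.Element("para")
    element.text = text
    try:
        ET.fromstring(ET.tostring(element, encoding="unicode"))
        return True
    except ET.ParseError:
        return False


# Hypothetical extracted text containing a form feed and a control character.
extracted = "melting point\x0c 142\x01 C"
cleaned = strip_invalid_xml_chars(extracted)

print(parses_as_xml(extracted))  # the raw text breaks the parser
print(parses_as_xml(cleaned))    # the cleaned text does not
```

This only deals with illegal characters. It does nothing for text that comes out in the wrong order, like the line above.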
None of these theses was originally written in PDF. They were written in TeX, or DOC or… They’ve been turned into PDF because people think it looks nice.
It might on the surface. Underneath it is one of the most prurulent and corruptive systems ever disguised. Don’t use it if you can help it. And if you can’t help it, do what Peter suggests – accompany it with the XML or HTML. They may be not quite so nice on the surface but underneath they’re lovely.
This is a quote from a previous comment:
“It might on the surface. Underneath it is one of the most prurulent and corruptive systems ever disguised. Don’t use it if you can help it. And if you can’t help it, do what Peter suggests – accompany it with the XML or HTML. They may be not quite so nice on the surface but underneath they’re lovely.”
It is a horribly sad commentary, not on PDF but on the ignorance that is so rampant in the document management industry. We will ignore the misspellings and just speak to the ridiculous statement itself. There is nothing corrupt about PDF. In fact, when created correctly, it is one of the most well-ordered and informative formats available today. The reason poor, uninformed people like this sad, sad man experience such problems is quite simple. The PDF document was created using cheap, substandard software, and most likely this gentleman, when trying to extract information from the PDF, is also using some cheap sweatshop software built by self-taught hackers in the back room of a Moscow ghetto.
If, when creating a PDF, the user actually takes the time to create it correctly and uses quality applications to do so, and the next user who extracts data from the PDF also uses quality software and business practices, then there will be no problems. There is a reason that the rest of the world has made PDF the industry and governmental standard document format, and it is not because it is such a horrible choice. No, on the contrary, it is because it is a well-designed, ordered, and open format that, when used correctly, creates a document that is not only attractive but informative.
(1) Thanks – I agree that I have taken a stand against PDFs. I think we will agree to differ. In practice in our area it is extremely difficult to extract useful information from them in a reliable manner and to use them to communicate scientific data.