- role of librarians
- beyond the full-text
- legal and contractual stuff
Archive for the ‘publishing’ Category
APE2008 more thoughts
Wednesday, January 30th, 2008
APE2008 – Heuer, CERN
Friday, January 25th, 2008
APE 2008
Sunday, January 20th, 2008
- What do we really know about publishing?
- Is ‘Open Access’ a never ending story?
- Will there be a battle between for-profit and non-for-profit publishing and who will be the survivors?
- Which is the best peer review system in the public interest?
- What does impact mean in times of the Internet?
- What are the plans of the European Commission for digital libraries, access and dissemination of information?
- Will libraries become university presses or repositories?
- How efficient is ‘OA’ in terms of information delivery?
- What are the full costs of information?
- Business models versus subsidies?
- What is the future role of books and reference works?
- How important are local languages?
- Which kind of search engines do we all need?
- What about non-text and multi media publications?
- Which models for bundling and pricing will be accepted?
- What makes publications so different?
- Why are some journals in a defined subject field much more successful than other journals?
- How important is the role of editors and editorial boards?
- What education and training is required?
- What skills are needed?
- Barrier-free information: do we provide sufficient access for the visually impaired?
Friday, January 18th, 2008
From Peter Suber More on the NIH OA mandate.
Many points but I pick one:
Jocelyn Kaiser, Uncle Sam’s Biomedical Archive Wants Your Papers, Science Magazine, January 18, 2008 (accessible only to subscribers). Excerpt:
If you have a grant from the U.S. National Institutes of Health (NIH), you will soon be required to take some steps to make the results public. Last week, NIH informed its grantees that, to comply with a new law, they must begin sending copies of their accepted, peer-reviewed manuscripts to NIH for posting in a free online archive. Failure to do so could delay a grant or jeopardize current research funding, NIH warns…. [...] Scientists who have been sending their papers to PMC say the process is relatively easy, but keeping track of each journal’s copyright policy is not….
PMR: Exactly. It should be trivial to find out what a journal’s policy is. As easy as reading an Open Source licence. An enormous amount of human effort is wasted by authors and repositarians repeatedly trying (and often failing) to get this conceptually simple information.
I’ve been doing articles and interviews on OA and Open Data recently, and one thing that becomes ever clearer is that we need licences or other tools. Labeling with “open access” doesn’t work.
Science 2.0
Friday, January 18th, 2008
PMR: It’s a reasonably balanced article, touching on many of the efforts mentioned in this blog. It’s under no illusions that this will be easy. I’ve just finished an interview where, at the end, I was asked what we would be like in 5 years’ time, and I was rather pessimistic that the current metrics-based dystopia would persist and even get worse (the UK has increased its efforts on metrics-based assessment, in which case almost any innovation, almost by definition, is discouraged). But on the other hand I think the vitality of “2.0” in so many areas may provide unstoppable disruption.
I’m way behind on this, but anyway: a while back, writer Mitch Waldrop interviewed me and a whole bunch of other people interested in (what I usually call) Open Science, for an upcoming article in Scientific American. A draft of the article is now available for reading, but even better, in a wholly subject matter appropriate twist, it’s also available for input from readers. Quoth Mitch:
Welcome to a Scientific American experiment in “networked journalism,” in which readers – you – get to collaborate with the author to give a story its final form.
The article, below, is a particularly apt candidate for such an experiment: it’s my feature story on “Science 2.0,” which describes how researchers are beginning to harness wikis, blogs and other Web 2.0 technologies as a potentially transformative way of doing science. The draft article appears here, several months in advance of its print publication, and we are inviting you to comment on it. Your inputs will influence the article’s content, reporting, perhaps even its point of view.
Open Data in Science
Sunday, January 6th, 2008
Open Data (OD) is an emerging term in the process of defining how scientific data may be published and re-used without price or permission barriers. Scientists generally see published data as belonging to the scientific community, but many publishers claim copyright over data and will not allow its re-use without permission. This is a major impediment to the progress of scholarship in the digital age. This article reviews the need for Open Data, shows examples of why Open Data are valuable and summarises some early initiatives in formalising the right of access to and re-use of scientific data.
PMR: The article tries not to be too polemical and to review objectively the area of Open Data (in scientific scholarship), in the style that I have used for Wikipedia. The next section shows Open Data in action, both on individual articles and when aggregating large numbers (> 100,000) of articles. Although the illustrations are from chemistry and crystallography, the message should transcend the details. Finally I try to review the various initiatives that have happened very recently, and I would welcome comments and corrections. I think I understand the issues raised in the last month but they will take time to sink in. So, for example, in the last section I describe and pay tribute to the Open Knowledge Foundation, Talis and colleagues, and Science/Creative Commons. I will blog this later, but there is now a formal apparatus for managing Open Data (unlike Open Access, where the lack of this causes serious problems for science data). In summary, we now have:
- Community Norms (“this is how the community expects A and B and C to behave – the norms have no legal force but if you don’t work with them you might be ostracized, get no grants, etc.”)
- Protocols. These are high-level declarations which allow licences to be constructed. Both Science Commons and The Open Knowledge Foundation have such instruments. They describe the principles which conformant licences must honour. I use the term meta-licence (analogous to XML, a meta-markup language for creating markup languages).
- Licences. These include PDDL and CC0 which conform to the protocol.
Open Data in science is now recognised as a critically important area which needs much careful and coordinated work if it is to develop successfully. Much of this requires advocacy and it is likely that when scientists are made aware of the value of labeling their work the movement will grow rapidly. Besides the licences and buttons there are other tools which can make it easier to create Open Data (for example modifying software so that it can mark the work and also to add hash codes to protect the digital integrity).
Creative Commons is well known outside Open Access and has a large following. Outside of software, it is seen by many as the default way of protecting their work while making it available in the way they wish. CC has the resources, the community respect and the commitment to continue to develop appropriate tools and strategies. But there is much more that needs to be done.
Full Open Access is the simplest solution but if we have to coexist with closed full-text the problem of embedded data must be addressed, by recognising the right to extract and index data. And in any case conventional publication discourages the full publication of the scientific record. The adoption of Open Notebook Science in parallel with the formal publications of the work can do much to liberate the data. Although data quality and formats are not strictly part of Open Data, their adoption will have marked improvements. The general realisation of the value of reuse will create strong pressure for more and better data.
If publishers do not gladly accept this challenge, then scientists will rapidly find other ways of publishing data, probably through institutional, departmental, national or international subject repositories. In any case the community will rapidly move to Open Data and publishers resisting this will be seen as a problem to be circumvented.
Why publishers’ technology is obsolete – I
Sunday, January 6th, 2008
CITATIONS Citations should be double-spaced at the end of the text, with the notes numbered sequentially without superscript format. Authors are responsible for accuracy of references in all aspects. Please verify quotations and page numbers before submitting. Superscript numerals should be placed at the end of the quotation or of the materials in which the source is mentioned. The numeral should be placed after all punctuation. SR follows the latest edition of the Chicago Manual of Style, published by the University of Chicago Press. Examples of the correct format for most often used references are the following: Article from a journal: Paul Metz, “Thirteen Steps to Avoiding Bad Luck in a Serials Cancellation Project,” Journal of Academic Librarianship 18 (May 1992): 76-82. [Note: when each issue is paged separately, include the issue number after the volume number: 18, no. 3(May 1992): 76-82. Do not abbreviate months. When citing page numbers, omit the digits that remain the same in both the beginning and ending numbers, e.g., 111-13.
PMR: It’s the author who has to do all this. In a different journal it would be a different style – maybe Harvard, or Oxford, or goodness knows what. Each with its own bizarre, pointless microsyntax. As we know, there is a simple, effective way of identifying a citation in a journal – the Digital Object Identifier (Wikipedia). It’s a unique identifier, managed by each publisher, and there is a resolution service. OK, not all back journals are in the system, and OK, it doesn’t do non-journal articles, but why not use it for the citations it can support? In many science disciplines almost all modern citations would have DOIs. Not only would it speed up the process but it would save errors. Authors tend to write abbreviations (J. Acad. Lib), mangle the volumes and pages, get the fields in the wrong places. They hate it, and I suspect so do the technical editors when they have to correct the errors.
I can’t actually believe the authors save the technical editors any time – I suspect it costs time. You may argue that the publisher still has to type out the citation from the DOI. Not at all. This is all in standard form. Completely automatic.
Why also can’t publishers emit their bibliographic metadata in standard XML on their web pages? It’s a solved problem. It would mean that anyone getting a citation would get it right. (I assume that garbled citations don’t get counted in the holy numbers game, so it pays to have your metadata scraped correctly.) And XML is the simple, correct way to do that. It’s not as if the publishers don’t have an XML Schema (or rather DTD). They do. It’s called PRISM. Honest. Publishing Requirements for Industry Standard Metadata – a worthy, if probably overengineered, approach. But maybe the name has got confused. Of course the NIH/PubMed has got this problem solved. Because they are the scientific information providers of the future.
Why not borrow their solution?
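To make the point concrete, here is a minimal sketch (not any publisher's actual pipeline) of how one structured record could yield a resolver URL, an XML metadata fragment and the house-style citation string, all without retyping. The bibliographic values are the Metz example quoted above; the DOI is a hypothetical placeholder, the function names are invented for illustration, and the choice of Dublin Core and PRISM elements is my assumption about a reasonable subset, not a complete or validated PRISM record.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for Dublin Core and PRISM basic 2.0 elements.
DC = "http://purl.org/dc/elements/1.1/"
PRISM = "http://prismstandard.org/namespaces/basic/2.0/"

# Bibliographic values taken from the Metz example in the author guidelines.
# The DOI is a HYPOTHETICAL placeholder, not the real identifier of the paper.
META = {
    "author": "Paul Metz",
    "title": "Thirteen Steps to Avoiding Bad Luck in a Serials Cancellation Project",
    "journal": "Journal of Academic Librarianship",
    "volume": "18",
    "date": "May 1992",
    "first_page": "76",
    "last_page": "82",
    "doi": "10.0000/hypothetical-example",
}


def doi_url(doi):
    """Return the URL at which the DOI resolver would redirect to the article."""
    return "https://doi.org/" + doi


def to_prism_xml(meta):
    """Serialise the record as XML using Dublin Core and PRISM element names."""
    ET.register_namespace("dc", DC)
    ET.register_namespace("prism", PRISM)
    root = ET.Element("article")
    fields = [
        (DC, "creator", meta["author"]),
        (DC, "title", meta["title"]),
        (PRISM, "publicationName", meta["journal"]),
        (PRISM, "volume", meta["volume"]),
        (PRISM, "coverDisplayDate", meta["date"]),
        (PRISM, "startingPage", meta["first_page"]),
        (PRISM, "endingPage", meta["last_page"]),
        (PRISM, "doi", meta["doi"]),
    ]
    for ns, name, value in fields:
        ET.SubElement(root, f"{{{ns}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


def chicago_citation(meta):
    """Format the record in the journal-article style shown in the guidelines."""
    pages = f"{meta['first_page']}-{meta['last_page']}"
    return (f"{meta['author']}, \u201c{meta['title']},\u201d "
            f"{meta['journal']} {meta['volume']} ({meta['date']}): {pages}.")


if __name__ == "__main__":
    print(doi_url(META["doi"]))
    print(to_prism_xml(META))
    print(chicago_citation(META))
```

The design point is that the citation string is derived from the metadata, never the other way round, so a different journal style would only need a different formatting function.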
Why PubMed is so important in the NIH mandate – cont.
Saturday, January 5th, 2008
Notice the range of topics offered. Many of these are searching collections of named scientific entities. Such as genes, proteins, molecules, diseases, etc. One really clever idea – at least two decades old – was that you search in one domain, come back with the hits, search in another domain, and so on. An early idea of mashups, for example.
You can’t do this with Google. If you search for CAT you get all sorts of things. But in PubMed you can differentiate between the animal, the 3-base codon, the tripeptide, the enzyme, the gene, the scanning technique and so on. Vastly improved accuracy. You can search for CAT scans on cats. And there are the non-textual searches. You can do homology searches for sequences. Similar molecules using connection tables. etc. etc.
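As a rough illustration of domain-scoped searching, the sketch below builds query URLs for the NCBI E-utilities `esearch` service, where the `db` parameter selects the Entrez domain and a bracketed field tag narrows the meaning of a term. It only constructs URLs and makes no network call; the helper name `esearch_url` is hypothetical, and the particular databases and field tag chosen are my assumptions about how one might separate the senses of "CAT".

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term, field=None, retmax=20):
    """Build an esearch URL for one Entrez database (hypothetical helper).

    db     -- Entrez database name, e.g. "pubmed", "gene", "protein"
    term   -- the search term
    field  -- optional field tag, e.g. "MeSH Terms", appended as term[field]
    retmax -- maximum number of record identifiers to return
    """
    query = f"{term}[{field}]" if field else term
    params = {"db": db, "term": query, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    # The same three letters, scoped to different domains.
    print(esearch_url("pubmed", "cats", field="MeSH Terms"))  # the animal, as an indexed subject
    print(esearch_url("gene", "CAT"))                         # the gene
    print(esearch_url("protein", "catalase"))                 # the enzyme
```

Each URL returns identifiers from one domain only, which is what makes it possible to take the hits from one search and feed them into a search in another domain.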
Then there is the enormous economy of scale. Let’s say I search for p450 (a liver enzyme). I get 23000+ hits. I can’t possibly read them all. But OSCAR can. OSCAR can read the abstracts anyway, but now it will be able to read many more fulltexts as well. It can pass them to chemistry engines, which pass them onto … and then onto …
You can’t do that with institutional repositories or with self-archiving. They don’t have the domain search engines, they don’t have the comprehensiveness. They don’t emit the science in standard XML.
For science it is likely that we have to have domain repositories. With domain-specific search engines, XML, RDF, ORE, the lot. It’s the natural way that scientists will work.
And PubMed – and its whole information infrastructure of MeSH, PubChem, Entrez, etc. – is so well constructed and run that it serves as an excellent example of where we should be aiming. It’s part of the future of scientific information and data-driven science.
Do the Royal Society of Chemistry and Wiley care about my moral rights?
Saturday, January 5th, 2008
Experimental data checker: better information for organic chemists
S. E. Adams, J. M. Goodman, R. J. Kidd, A. D. McNaught, P. Murray-Rust, F. R. Norton, J. A. Townsend and C. A. Waudby and you can still find it posted at: http://www.rsc.org/Publishing/Journals/OB/article.asp?doi=B411699M However when I visit the RSC page – on the RSC site – at: http://www.rsc.org/publishing/journals/OB/article.asp?DOI=B411699M&type=ForwardLink I find:
Since this is on the RSC’s own site and it says it’s not an RSC journal article, it’s clearly deliberate, not a mistake. The RSC seems to have transferred the rights of the paper to Wiley, who are reselling it under the name Cheminform. Or maybe both are selling it. Or maybe the RSC don’t know what Wiley are doing. (The best I can see is that Wiley appear to be passing off my/our paper under their name. As far as I can see they are only selling the abstract, and even then it’s the wrong one – but maybe they would also be selling the full text if they were competent enough to get the web site right. And they are asking 30 USD.)
I care very deeply about this. I used to be proud to publish in the journals of the Chemical Society (now the RSC). Can I still be proud? They have disowned my article as not one of theirs. Someone reading the Wiley page would naturally assume that I had published in a Wiley journal and not with the RSC. We’ve worked closely with the RSC – many of the ideas for Project Prospect came from our group.
A major justification for Transfer of Copyright to publishers, whether or not you believe it, is that it allows the publisher to defend the integrity of the work against copyright infringement by others. I contend that what I have depicted here is a gross violation of someone’s copyright. Probably not mine since I gave it away.
Cockup or conspiracy – I don’t know. But I certainly feel my rights have been violated.
Open Data: Datument submitted to Elsevier’s Serials Review
Saturday, January 5th, 2008
*Serials Review* Serials Review (v.30, no.4, 2004) was a focus issue on Open Access. It remains one of the most heavily downloaded issues and articles even now. Open Access remains a “hot topic” and fundamental discussion in scholarly communication. Your names were suggested by either current board members or previous contributors to the Open Access issue. At the time of that publication, editors and authors envisioned revisiting the Open Access environment a few years hence since issues, publisher responses, “experiments,” and government mandates were or are in flux.
PMR: and (b) we are all allowed to retain copyright. [I'll discuss the message later. This post is about the medium. And how today's medium doesn't carry messages very well at all.]
First, to publicly thank Connie Foster for her patience. I warned her that I would not submit a conventional manuscript because I wanted to show what Scientific Data are actually like. And you can’t do that in a PDF, can you? So I asked ahead of time if I could submit HTML. It caused the publisher (Elsevier) a lot of huffing and puffing. The answer seemed to be “yes”, but when I came to submit the manuscript the system only accepted dead documents. So I’ve ended up mailing it to Connie.
The document is a datument – a term that Henry Rzepa and I coined about 4 years ago (From Hypermedia to Datuments: Murray-Rust and Rzepa: JoDI). It emphasizes that information should be seamless – not arbitrarily split into “full-text” and “data” because it’s easier for twentieth century publishers. (I return to this in a later post.) The ideal medium for datuments is XML – for example using ICE (Integrated Content Environment) – and that’s why I’m going to visit Peter Sefton and colleagues. But the simple way to create datuments is in valid XHTML. Every editor in the world should now produce XHTML so there is no reason not to do it. It’s a standard. It’s in billions of machines over the world. It’s got everything we need.
You see hundreds of examples every day. XHTML manages:
- images (it’s done this for 15 years)
- multimedia (also for 15 years)
- hyperlinks (for 15 years)
- interactive objects (also for 15 years, though with some scratchy syntax)
- foreign namespaces – probably about 10 years
- vector graphics (SVG) nearly 10 years
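A tiny sketch of what such a document can look like: XHTML text with an SVG graphic embedded through a foreign namespace, parsed by a standard XML parser. The sample document and the helper name `namespaces_used` are invented for illustration, and the check only establishes well-formedness and namespace usage, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"

# A made-up minimal datument: prose and a vector graphic in one XML document.
DATUMENT = f"""<html xmlns="{XHTML}">
  <head><title>Minimal datument sketch</title></head>
  <body>
    <p>Text and data live in the same document.</p>
    <svg xmlns="{SVG}" width="100" height="100">
      <circle cx="50" cy="50" r="40"/>
    </svg>
  </body>
</html>"""


def namespaces_used(xml_text):
    """Parse the document and return the set of element namespace URIs found.

    ET.fromstring raises ParseError if the text is not well-formed XML,
    so a successful return also shows that a machine can read the document.
    """
    root = ET.fromstring(xml_text)
    # ElementTree stores qualified tags as "{namespace-uri}localname".
    return {el.tag[1:].split("}")[0] for el in root.iter() if el.tag.startswith("{")}


if __name__ == "__main__":
    for uri in sorted(namespaces_used(DATUMENT)):
        print(uri)
```

Because the graphic is ordinary XML inside the page, the same parser that reads the text can also reach the data, which is the property a PDF does not offer.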
