petermr's blog

A Scientist and the Web

 

Archive for February, 2008

Robert Massie on OA and PMR

Wednesday, February 27th, 2008

From Peter Suber’s Open Access Blog:

Robert Massie on OA

15:39 26/02/2008, Peter Suber, Open Access News

InfoInnovation has blogged some notes on Robert Massie’s talk at the NFAIS Annual Conference (Philadelphia, February 24-26, 2008). Massie is the president of the American Chemical Society’s Chemical Abstracts Service (CAS). Excerpt:

…It turns out that from its beginnings in the 19th century until 1966, CAS’ abstracts were written by volunteer abstractors – a robust early example of user-generated content. True, Massie noted, today new standards for chemical information exchange are developing; open access repositories are growing; collaborative websites are emerging; and political/social pressures for more free access characterize the age. “But do [these trends] have to be opposed? Or assimilated?” Massie noted in particular an article that appeared this month in Nature – “Chemistry for Everyone.” In it, noted researcher Peter Murray-Rust argues that CAS is “incompatible with the requirements of Web 2.0”; that “closed publications, binary software and toll-access databases are being swept away by emerging philosophies and approaches.” But, Massie noted, universities are the Web 2.0 homeland, and SciFinder Scholar now serves over 1500 schools. Not only that – many sites in China have sprung up to provide information on how to break into the computer systems of major US universities in order to gain access to SciFinder. So, clearly, “young people in China like SciFinder a lot.”

Massie asserted that the question of Web 2.0 vs. traditional publications is “not a binary problem.” …

Comment (PeterS). I wish I had access to the full talk in order to see two parts in full context. First, what did Massie mean by asking whether the trends toward OA (or Web 2.0?) “have to be opposed? Or assimilated?” It sounds like he thinks opposition is unnecessary and unwise. But does assimilation mean adoption? Second, I’d like to see whether he went beyond a narrow response to Peter Murray-Rust’s claim that the new models were sweeping away the old, and offered a wider response to his argument that the new models were superior.

PMR: Like PeterS I only have this snippet to comment on – perhaps Robert can make his slides available? A few points:

  • SciFinder (Scholar) is a good and valuable product. It is de rigueur in chemistry departments. However it is also expensive and many institutions cannot afford it. (I believe that some countries manage a national deal).
  • The information cannot be re-used (it is protected by copyright). This prevents mashups, compilation of secondary resources, etc. It cannot be linked to in a Web 2.0 manner, tagged, etc.

I am prepared to believe the assertion about China. There is a hunger for scholarship. I would also assert that “young people in China like Pubmed a lot” is true.

I will not comment on the ethics or politics of the alleged Chinese actions. However it seems clear that, for whatever reason, scientific information is becoming a battleground. I have already suffered from getting the University cut off by the ACS publications server (for actions that were entirely legal and where the server behaved IMO in an automatic and inappropriate manner – it thought I was stealing info – I wasn’t).

There is clearly a cost to the closed publishing community in trying to protect its content. Whatever the rights and wrongs of copyrighting scientific raw data, it is clear that the content in Chemical Abstracts is won by the sweat of many brows and is copyrightable. If the Chinese students are trying to get this by hacking into subscribers rather than providers there is a threat to academic systems in general. I’m guessing, but I would assume there will be an increasing pressure in contracts for the subscriber to have to provide mechanisms to prevent misuse of the subscribed information. This, of course, goes beyond SciFinder and may have to be seen as a major concern of academia. Do we have to police our information sources in the same way as we police access to airplanes?

But where the information is free these arguments vanish. I’m not arguing for the complete abolition of copyright but there is increasingly little value for it in the promotion of scientific activity. That is why I have urged publishers to prepare for Open Access (and Open Data) as it seems inevitable. The costs (financial and social) of controlling access to what an increasing number of scientists regard as Open information will become unacceptable.

Australian update

Tuesday, February 26th, 2008

Very brief. Had an excellent time in South Australia hosted by Philip Lock. Presented to Computer Science Department at UNISA and developed more ideas in “can machines understand chemistry”. A lot of synergy in discussion over XML, RDF, etc. Then a brief visit to the University of Adelaide to talk about eResearch. We spent time talking about what would be useful for Departmental Respositories (sic). This has helped me firm up my ideas well and I hope to post some detailed ideas soon. Shall be giving a talk next Monday (2008-03-03) at Monash on scientific data. Have been overwhelmed by hospitality and have found a lot of commonality in our discussions.

Adelaide Travel Update

Sunday, February 24th, 2008

A brief update. We’ve been magnificently entertained by Phil Lock in his beach house south of Adelaide (I’m still getting to grips with the local directions in Australian cities). Also a tour of Adelaide – highlight is the statue of Sir Donald Bradman. I’m giving a talk tomorrow (Monday) at the University of South Australia at their Mawson Lakes campus – full of bright new coloured buildings. This is to the Computer Scientists so I shall revisit my theme of “Can Machines Understand Chemistry”.

More later.

Travel update

Friday, February 22nd, 2008

So much is happening that I have little time to blog but I HAVE to say thank you to so many people. First Alison Edwards and Graham Heath for several days’ hospitality. To their colleagues at ANU and ANSTO. Many valuable discussions focussing on semantic documents and data capture/re-use (repositories). A chance to see the neutron equipment (WOMBAT, PLATYPUS, ECHIDNA – you get the theme) and to talk about the software (GUMTREE) that oversees it. There is the normal difficult choice – do you go for something that is generic and does everything – or construct code on a per-instrument or per-project basis? Both have merits and demerits – I am biased towards lightweight Web 2.0-like solutions but those don’t fit all occasions.

Then to Sydney and the hospitality of Peter Turner and colleagues. Peter oversees the crystallographic work and also the MMSN — Molecular and Materials Structure Network

The MMSN links scientists, technicians and students engaged in the determination and analysis of atomic structures of any kind; biological molecules, chemical molecules or solid state materials – and unites them with Grid computing, visualisation, database, informatics and applied mathematics researchers. The network has the following key goals:

  • Establish remote access for a network of structure determination and analysis instruments, ultimately providing the basis for developing a Grid enabled network linkable to other instrument, data and computation grids; national and international. The instruments may be at ‘conventional laboratories’ or at the new major facilities currently under construction; the Replacement Research Reactor and the Australian Synchrotron.
  • Establish a Grid enabled e-Science network for the interactive visualisation and analysis of a diverse collection of structural databases on multiple geographically distributed display devices.
  • Hold regular informational/educational meetings and workshops, hosting international experts, for structural science practitioners and students. Cross-disciplinary fertilisation and collaboration will be stimulated though techniques meetings and workshops, and summer schools will ensure that young scientists are trained at the leading edge of structure determination and analysis techniques.
  • Build direct links into the secondary school system to establish understanding and interest in the molecular and materials structure sciences.

PMR: There’s too much to put here but I now have a much clearer idea of crystallography and related subjects in Australia. Peter is also involved in the eCrystals network run from Southampton (UK) by Simon Coles, of which we are a member. These (human) networks are very important in amplifying the values of eScience/eResearch tools and ideas. I’ve found lots of people here that I can share with in both directions and we are hoping that we can have some visitors in Cambridge.

I gave a seminar in Chemistry/Library today with a mix of Chemistry/Library/IT  and this gave me another opportunity to expand on long-tailed science (Jim Downing’s term – see Big Science and Long-tail Science). I’ll post some slides soon. I think it helps clarify why nuclear reactors need one approach and department crystallography another.

Then Peter and Mat Todd (a synthetic chemist whom I met through blogs) took me to lunch on the Sydney waterfront – wow! There’s lots of scope for collaboration and Mat may be able to help with our thesis work.

We’re off to meet Phil Lock in Adelaide tomorrow and talk on Monday – probably the same sub-themes – semantics, repositories, data, etc.

Then back to Melbourne on Tuesday and a day or two to collect our senses.

Travel update

Thursday, February 21st, 2008

I’m now in ANSTO – Australia’s (deliberately) lone reactor, hosted by Alison Edwards. Alison and Graham Heath have looked after me and Judith fantastically over the last few days, at Wollongong and elsewhere. I’m talking on “The Semantic Web and Physical Science” and this is a great chance to try out ideas about the differences between “big science” and “longtailed science” (Jim Downing’s phrase).

Peter Sefton has blogged about the details of the ICE work in Toowoomba (Cyclone Peter Murray-Rust moves away from Toowoomba. Cleanup continues.)

Last week we hosted Peter Murray-Rust in Toowoomba. The ICE team have been busy getting ready for some other visitors, so I have not had time to write about the visit. Peter has blogged about his stay a few times, describing it as intensive talk and hacking. Intensive indeed.

PMR: We are now committed to trying out ICE for blogging (Using Ron’s adaptor) and moving to theses. More later.

Off to Sydney tonight. If anyone wants to get in touch for the p.m. please email or comment on the blog

Travels update

Monday, February 18th, 2008

Yesterday we were shown round Canberra including the Botanic Gardens with a splendid Eucalypt garden – the species vary enormously in texture, smooth, rough, shaggy, etc. and the splendid Scribbly gum whose scribbles are made by the Scribbly Gum Moth.

Today an invited talk at Australian National University. When preparing I researched Artificial intelligence (WP) and found:

Artificial intelligence can also be evaluated on specific problems such as small problems in chemistry, hand-writing recognition and game-playing. Such tests have been termed subject matter expert Turing tests. Smaller problems provide more achievable goals and there are an ever-increasing number of positive results.

… and …

 A subject matter expert Turing test is a variation of the Turing test where a computer system attempts to replicate an expert in a given field such as chemistry or marketing. This concept was described by Ray Kurzweil in his 2005 book The Singularity is Near, and is predicted as a consequence of Moore’s Law.

The irony is that the work in “chemistry” relates to the early – and brilliant – work done in the 1970s at Harvard, Stanford and elsewhere. Chemists have abandoned – or been actively prevented by Closed data from pursuing – the further development of the methods. Nevertheless the Semantic Web is creeping up on all sides so the techniques required are abundant and cheap.

So for my talk I felt emboldened to ask the question “Can a machine understand chemistry?” (my answer is ‘yes’ – give or take a year or two and the availability of Open Data).

In the afternoon I visited Geoscience Australia (the national geological resource) who are interested in merging GML and GIS systems with CML for chemical composition and geochronology through isotopic measurements. We made great progress in 2-3 hours and I’ll be raising some questions about isotopes on the CML Blog.

Strine travel update – future

Sunday, February 17th, 2008

This week Judith M-R and I will be visiting and giving talks. Judith’s is on structural biology, mine on the semantic web and physical science (which has the overt agenda of also meeting multidisciplinary scientists and seeing whether there is scope for collaborative eResearch infrastructure – or even science). In ANU I shall be talking with geoscientists about how Chemical Markup Language can help support isotopes in geosciences.

Strine travel update – past

Sunday, February 17th, 2008

We have had a wonderful time and been looked after fantastically by our Australian (Strine) hosts. (The pronunciation matters – when we arrived in Melbourne we needed to go to Prahran – a very lively and well-known suburb. Our English disyllabic vocalisation was completely opaque – all vowels are elided and omitted so it’s something like “Praaan”, spelt Pran.).

Past week:

I’ve been to Monash (downtown) and been hosted by Ashley Buckle (protein crystallographer) on the edges of the National Park, where we talked about how to capture image data. I’m going back to Monash later so will blog more then.

Then I went to Toowoomba to University of Southern Queensland where we had nearly three days of intensive talk and hacking with Peter Sefton and colleagues. I’ve had the chance to look closely at ICE: The Integrated Content Environment, an authoring environment for academic material. USQ eat their own dog-food and over 100 academic staff at USQ use it routinely for authoring their course material. USQ is very committed to high-quality distance education – they have an impressive enrollment from overseas and they put a lot of work into the material which supports it. So their material can be repurposed as notes, lecturer’s copies, slides, summaries, etc. All this is managed through stylesheets – which are the key to ICE. The content is written once but delivered in many ways. Because the material is in XML it is also possible to amend it with XML-aware tools or to generate new material programmatically. A key aspect is that the structure of the document(s) can be managed in XML. So I am now convinced that for academic work it is (a) fit-for-purpose (b) reconfigurable (c) powerful. It’s still “early-adopter” for theses, but as it can do so many new things I can’t see any real competition. And Peter showed us how to use it for blogs, so I’m going to integrate it into this process for XML and chemistry. (It won’t stop the typos or the rants).
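The “written once but delivered in many ways” idea can be illustrated with a minimal sketch. This is not ICE’s code or document format: the element names, the `audience` attribute, and the two render functions are all invented for illustration, and plain Python functions stand in for the stylesheets.

```python
import xml.etree.ElementTree as ET

# Hypothetical course source. The element and attribute names are invented
# for this sketch and are NOT ICE's actual document format.
SOURCE = """
<course title="Crystallography 101">
  <section heading="Diffraction">
    <para>X-rays scatter from the electrons in a crystal.</para>
    <para audience="lecturer">Allow ten minutes for the Bragg's law demo.</para>
  </section>
  <section heading="Refinement">
    <para>A model is fitted to the measured intensities.</para>
  </section>
</course>
"""

def render_notes(root, audience="student"):
    """Render full-text notes; lecturer-only paragraphs are included
    only when audience == "lecturer"."""
    lines = [root.get("title")]
    for section in root.findall("section"):
        lines.append("")
        lines.append(section.get("heading"))
        for para in section.findall("para"):
            # Paragraphs with no audience attribute are for everyone.
            if para.get("audience", "student") in ("student", audience):
                lines.append(para.text.strip())
    return "\n".join(lines)

def render_slides(root):
    """Render one slide title per section from the same source."""
    title = root.get("title")
    return [f"{title}: {s.get('heading')}" for s in root.findall("section")]

root = ET.fromstring(SOURCE.strip())
print(render_notes(root))                       # student notes
print(render_notes(root, audience="lecturer"))  # lecturer's copy
print(render_slides(root))                      # slide outline
```

The point of the sketch is only that one structured source feeds several outputs, so a correction made in the XML appears in the notes, the lecturer’s copy and the slides alike.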

Then to Brisbane and Margaret Henty’s APSR meeting. A very big public thank-you to Margaret for this – a very valuable meeting (though apparently the last). And there’s a very full record from Peta Hopkins: Open Access Collections

On Library Lovers’ Day I attended Open Access Collections at Customs House in Brisbane. This was an APSR event held in association with QULOC and the University of Queensland. Customs House, Brisbane

[...PMR - and this is the Customs House from Peta's blog...]

PMR: To reiterate I appreciated the presence and presentations from government – it seems closer to the process than in UK which is insulated by layers of research councils, etc. (I’d heard Rhys Francis talk about eResearch last year).

The main concern is a generic one which I’ve found everywhere. The libraries and the scientists don’t interact. There was a frank appraisal of why the “build it and they will come” doesn’t work. And nor does the “force them to use it”. I don’t know how to solve this problem but it is urgent and has to be addressed. My strategy is to suggest that libraries should find those disciplines which combine need, clarity of metadata, and – presumably – political weight within the institution. Embed a library person IN the department. Give them a white coat. Find a problem where they could make a contribution which would be recognised in joint authorship of a scientific paper. Then promote this model wider in the university and the discipline. You won’t be able to solve all the disciplines at one go. So I now have a heavy and exciting program of seeing how this can be done in chemistry and related disciplines. The following won’t do justice to each institution…

Ray Frost from Queensland University of Technology (QUT) spoke at the APSR meeting and invited me to visit the next day. Ray is a very prolific chemist/mineralogist/spectroscopist so this is mainstream territory for me. QUT made an outstanding contribution to Open Access, requiring that all material be Open. Period. When collaborators want Ray’s papers, he simply gives them a link to the repository. This is an excellent model – a repository for the individual and for the group – that engages the scientist. So I talked to Ray and colleagues the next day and floated the idea of a SPECTRa-like departmental repository. I was delighted by the immediate understanding and positive response. I think that an attractive model is for the Library/LIS to sponsor the installation of such a system to the extent where the department can take on the running costs. These are not as fearsome as they sound – like us QUT have an Active Directory system and this is a good start to having an easily accessible repository. Maintenance is mainly configuration (linking to other university resources, updating URLs, etc.) and as such is probably of the same order as supporting a computational chemistry program.

Then by boat to University of Queensland and Jane Hunter’s group. This is immediately driven by ORE (see OREChem in Jim Downing ..) where Jane and Kwok Cheung have been early adopters. Kwok took me through the system – which deals with Xray diffraction, so again on common ground. The system has a drag and drop approach to creating named graphs and can write out in a number of formats, including directly into Fedora. This will be very useful for OREChem where we are creating molecular repositories and some may use the traditional tools (although Jim and I are working on specifically molecular repositories).
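For readers unfamiliar with named graphs, here is a rough sketch of what one might look like when written out. It is not the output of the UQ system: the `example.org` URIs and the `nquads` helper are hypothetical, the `describes` and `aggregates` terms are taken from the ORE vocabulary as later published, and N-Quads is used only because it makes the graph name visible as a fourth term on each line.

```python
# Namespace of the ORE vocabulary (as in the published specification).
ORE = "http://www.openarchives.org/ore/terms/"

def nquads(graph_uri, triples):
    """Serialise (subject, predicate, object) URI triples as N-Quads,
    placing every statement in the single named graph graph_uri."""
    return "\n".join(
        f"<{s}> <{p}> <{o}> <{graph_uri}> ." for s, p, o in triples
    )

# Hypothetical URIs for a resource map and the aggregation it describes.
rem = "http://example.org/rem/crystal-42"
agg = "http://example.org/aggregation/crystal-42"

triples = [
    (rem, ORE + "describes", agg),
    (agg, ORE + "aggregates", "http://example.org/data/crystal-42.cif"),
    (agg, ORE + "aggregates", "http://example.org/data/crystal-42.cml"),
]

print(nquads(rem, triples))
```

Each line carries the name of the graph it belongs to, which is what lets a repository treat the whole bundle (here, a diffraction dataset in two formats) as one citable object.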

And now we are with Alison Edwards (ANSTO) and Graham Heath (ANU) – old colleague of mine in Scotland. Thought I would take today off, but as I have 3 talks next week, will have to spend some of it working on them…

Anyway thanks again for a really great time. There is a great deal of synergy in the “eScience/eResearch” area and I am hoping we can set up exchange visits, workshops, etc. This is an area where competition makes no sense and we have a lot of commonality and complementarity.

Open Definition Advisory Council

Saturday, February 16th, 2008

I’m delighted to report the launch of the Open Definition Advisory Council. When Rufus Pollock first bent my ear on Open Knowledge (perhaps 3+ years ago) I didn’t really understand it. Now its role is absolutely clear – we need a meta-licence and philosophy to cover a range of fields for which formal licences or practices can be put in place. Rufus and colleagues have spent huge amounts of time formalising the definition and their skill is that it applies to a wide range of fields without being too diffuse to be meaningful. In science, for example, it comes at the right time to interact with and complement Science Commons….

 

Open Definition Advisory Council launched

16:30 15/02/2008, Jonathan Gray, news, okf, okf projects, open data, open knowledge, open knowledge definition, open service, Open Knowledge Foundation Weblog

We are pleased to announce the launch of an Advisory Council for opendefinition.org. The Council will be formally responsible for maintaining and developing the Definitions and associated material found on the Open Definition site – including the Open Knowledge Definition and the Open Service Definition. As many of you will know, these definitions aim to provide clear and succinct sets of conditions for ‘openness’ in knowledge and services.

Jordan Hatcher of opencontentlawyer.com has kindly agreed to be Chair of the Council, which includes:

  • Paul Jacobson, iCommons
  • Paul Miller, Talis
  • Peter Murray-Rust, Cambridge University
  • Rufus Pollock, Open Knowledge Foundation & Cambridge University
  • Rob Styles, Talis
  • Peter Suber, Scholarly Publishing and Academic Resources Coalition (SPARC) & Earlham College
  • Luis Villa, Columbia Law School, GNOME Foundation & Open Source Initiative
  • Jo Walsh, Open Knowledge Foundation & Open Source Geo-Spatial Foundation
  • John Wilbanks, Science Commons

More detailed biographies are available on the Advisory Council page.

It is our intention that the overall development of the material on the site will continue in the same community based and collaborative manner. The Council’s role will be to provide oversight, guidance and input into this process, not to replace it.

This is fantastic news for the definitions projects!

APSR 2008

Thursday, February 14th, 2008

Brief update as I haven’t been able to log in…

Very good meeting today at Brisbane – in the Customs House, which belongs to the University of Brisbane, Queensland [Sorry! Got it right lower :-) ] and overlooks the river. On APSR home page (Australian Partnership for Sustainable Repositories) – run by Margaret Henty.

This is blogged after the event so only impressions. (Gentle niggle again – please can we have internet access – certainly for speakers. So I didn’t blog as I went…).

Australia has really got its act together on the e-infrastructure. It’s created repositories (not all full) and now the Australian National Data Service (couldn’t find a home page). It got pulled off track last year by trying to make the repositories an instrument of research management for the RQF (Australia’s answer to the UK Research Assessment Exercise).

Moral: If you try to use the repositories for things other than helping authors, then you will lose the authors.

This was a strong theme. If you don’t address what people need to do, then no technology or investment will save you. Australia is pragmatic and learns quickly. It means it can avoid the mistakes of putting lots of investment into cutting edge computer science and then wondering why no scientists go near it. Australia understands the need to concentrate on data. Have we learnt this in the UK? Not sure we have.

Off to Queensland University of Technology tomorrow – a shining light as early adopter of mandating Open Access deposition. To see if we have synergy in capturing chemistry data. And then the University of Queensland to talk about ORE.

More later

UPDATE… I can’t write a new post (cannot authenticate for some reason).  Just a few words to say I had a wonderful day first at QUT with Ray Frost (chemistry). Ray is a very prolific chemist – ca 1 paper /week – and we had very fruitful discussions with him and colleagues about how SPECTRa can help to capture data.

Then on to UQ by ferry to visit Jane Hunter and colleagues about ORE and their impressively wide range of informatics projects.

Have to get up early yet again – this time to Canberra. This trip is exciting but hard work.

Hopefully more later.