I realised with considerable disappointment (Can I data- and Text-mine Pubmed Central?) that I might not be able to text- and data-mine the material that the NIH mandate requires to be deposited in Pubmed Central. I have now had confirmation by email from an authoritative source (who asks not to be named in case the information is not quite precise). But in general terms the answer is simple:
NO-ONE MAY DATA- OR TEXT-MINE PUBMED CENTRAL
In short, Pubmed Central is "free access" (no price barriers), not "open access" (no permission barriers). You may not download material from it (except to expose it to your own eyeballs), and you certainly may not redistribute it. You may not data-mine it.
I am aware of the struggle that was required to get George Bush to sign the mandate, and it certainly wasn't the time to break ranks. But now that the mandate has passed (and starts tomorrow) we must press ahead immediately to campaign for full access to the text.
We have the right and the duty to submit our views to NIH. For example, Stevan Harnad has argued (recommendations to the NIH) that it is better to reposit in institutional repositories ("green"). Whether or not this is a good idea (and I personally don't think so, as it makes data-mining almost impossible), it is clearly outside the current approach of Pubmed Central. For example, I gather that the mirrors of PMC have to agree to the same absolute permission barriers that PMC imposes - it would be impossible to ensure that thousands of libraries enforced this - almost draconian - contractual system.
So we have to argue to the NIH that bioscience is desperately impoverished by the unreasonable permission barriers that are now in place. I'm not a (US) politician and I think the NIH and advocates have done well to win the first battle. But at present the policy is seriously hindering modern science.
So the whole area is incredibly complex. The goal is simple - use scientific publications to further our understanding of science and - hopefully - make progress in enhancing human health. For this we MUST have robots. We cannot do it with humans alone - every week we get thousands of new papers.
I'd be grateful to know what the position is with Wellcome. I thought they had removed permission barriers.