Scraped/typed into Arcturus
We are on the last lap of preparation for the Green Chain Reaction. We have now built a server to receive the results – the address is in the Etherpad (http://okfnpad.org/openPatents).
We are uploading the weekly indexes of all European Patents to this site. Each index contains several thousand patents and is about 1 MByte. There will be about 1500 such weekly indexes.
How does it work?
- Each volunteer downloads the patentAnalysis code and verifies they can run it
- They then take a weekly index and run the code against it. This code:
- Selects the chemical patents (by EPO code) – 10-100 per week
- Extracts the text for the experimental sections
- Parses the solvents from them
- Aggregates the result for each patent (dissolveTotal.html)
- Uploads this to the Green Chain Server (code being written)
- Repeat. It takes about 30-60 minutes to process one weekly index. You could get through 10 a day just watching the Test Match (or the rain)
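For anyone who wants a feel for what the per-week steps amount to, here is a minimal sketch in Python. It is not the patentAnalysis code: the classification prefix (IPC class C07, organic chemistry), the solvent list, and the dictionary layout of a patent record are all assumptions made for illustration. The upload step is left out because that code is still being written.

```python
import re
from collections import Counter

# Assumed for illustration only; the real selection codes and solvent
# dictionary used by patentAnalysis may differ.
CHEMISTRY_PREFIXES = ("C07",)
SOLVENTS = ("dichloromethane", "tetrahydrofuran", "ethanol",
            "methanol", "toluene", "water")

# Whole-word, case-insensitive match, so "ethanol" is not counted
# inside "methanol".
SOLVENT_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, SOLVENTS)) + r")\b", re.IGNORECASE
)


def is_chemical(patent):
    """Select a patent if any of its classification codes looks chemical."""
    return any(code.startswith(CHEMISTRY_PREFIXES) for code in patent["codes"])


def count_solvents(text):
    """Count solvent mentions in the text of an experimental section."""
    return Counter(m.group(1).lower() for m in SOLVENT_RE.finditer(text))


def analyse_week(patents):
    """Aggregate solvent counts over the chemical patents in one weekly index."""
    totals = Counter()
    for patent in patents:
        if is_chemical(patent):
            totals.update(count_solvents(patent["experimental"]))
    return totals


# A made-up "weekly index" with three tiny records (hypothetical data).
week = [
    {"codes": ["C07D 213/00"],
     "experimental": "The residue was dissolved in methanol and diluted "
                     "with water. Methanol was removed."},
    {"codes": ["H04L 9/00"],
     "experimental": "The packet is encrypted. Water cooling is used."},
    {"codes": ["A61K 31/00", "C07C 1/00"],
     "experimental": "Dissolved in toluene; washed with water."},
]

totals = analyse_week(week)
for solvent, n in totals.most_common():
    print(solvent, n)
```

The second record is skipped because none of its codes is chemical, so its mention of water is not counted. A real run would read the records from the downloaded weekly index rather than from a hand-written list.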
We shall then trawl the results from the server and present them at the meeting. Since the data are all Open this is truly Open Notebook work. Anyone with a different approach to analysis is welcome to use the data.
Indexes for 1980-1990 are now loaded, and the rest should be done in an hour.
Many thanks to our volunteers and to Sam Adams for creating the Green Chain Server system and code to access it.