#acsanaheim #opendata #crystaleye
I’ve more-or-less put my thoughts together for the session on Open Data. It seems to me that the key question is whether the price we pay for traditional closed data is worth it. Not just the monetary cost, but the opportunity cost – particularly in access by everyone and re-use. I’ve created a list of issues which I’d like you to think about – I have tried to be fair. If you feel strongly, please edit the Etherpad:
Overview
VERY SORRY!! I HAVE TO LEAVE AT END OF TALK AS I AM TALKING IN ANOTHER SESSION
Web-based science relies on Linked Open Data.
Topics
- Almost no scientific data is effectively published
- “Almost Open”, “Freely Accessible” is not good enough
- Open Knowledge Foundation – defines Open and DOES THINGS
- Individuals and small groups can change the world
- Wikipedia
- OpenStreetMap – The Ordnance Survey generates 100 M GBP per year but open maps bring 500 M to the economy
- What Do They Know? (Web democracy through FOI)
- Quixote – reclaiming computational chemistry
- Current publishing models are asymmetric; the author and reader have few rights or influence
- Software as an agent of political change
- Web democracy – cf Wikipedia
- Bottom-up Web 2.0 (The Blue Obelisk and Quixote)
- Text and data mining
- Panton Principles
- Near-zero cost of robots – crystalEye
- eTheses
Resources
- “Open Data” on Wikipedia
- “Open Data in Science” (Murray-Rust, on Nature Precedings, http://precedings.nature.com)
- Science Commons
- Open Knowledge Foundation
Recent Blogs
- /pmr/2011/03/28/open-data-what-i-shall-say-at-acs
- /pmr/2011/03/28/draft-panton-paper-on-textmining/
Some fallacies:
- “You can have SOME of the data” (ACS makes 8000 CAS numbers freely available to Wikipedia)
- “The data are free for NON-COMMERCIAL use” (see my post /pmr/2010/12/17/why-i-and-you-should-avoid-nc-licences/)
- “You can always ask permission and we’ll grant it”; PMR: doesn’t scale, doesn’t persist, can’t re-use
The key question: is the price of closed data worth it? Do the benefits outweigh the disadvantages? To help you decide:
| issue | closed data | open data |
| --- | --- | --- |
| sustainability | supported by income | few proven models |
| creation of business model | easyish | hard |
| added human value | often common | possible |
| support | usually good | depends on community |
| acceptability | well proven | often suspicious |
| cost | high; increasing? | marginal |
| innovation | central authority | fully open |
| reuse | normally NO | fully OPEN |
| speed from source | often slow | immediate |
| mashupability/LODD | very rare | almost universal |
| reaction to new tech. | often slow | very fast |
| comprehensiveness | often patchy | potentially v. high |
| global availability | often very poor | universal |
I have started an Etherpad at http://okfnpad.org/openClosedData. Please feel free to contribute.
What about data quality and timeliness? I think these are important factors, though I don’t know what to write in the other columns. Closed data isn’t necessarily of good quality, and neither is open data. It depends on the community or the provider.
But if there’s a data monopoly you may even pay a lot of money for poor data (which does happen).