#ami2 Farewell talk at CSIRO

Posted on February 17, 2013 by pm286

I’m (http://en.wikipedia.org/wiki/Peter_Murray-Rust ) sadly leaving CSIRO (http://www.csiro.au/) and http://en.wikipedia.org/wiki/Australia on Wed (but I plan to be back). This post is very brief and illustrates how far the semantic web has come in some areas. Here I have included

PLACES
DATES
PEOPLE
SPECIES
MATERIALS

On 2013-02-16 Nico Adams took me out to the http://en.wikipedia.org/wiki/Sherbrooke_Forest at http://toolserver.org/~geohack/geohack.php?pagename=Sherbrooke_Forest&params=37_53_43_S_145_21_47_E_region:AU-VIC_type:landmark and at O’Donohue Picnic Ground I saw http://en.wikipedia.org/wiki/Crimson_Rosella (Platycercus elegans)

(images from Wikipedia)

I’ll post my grotty phone photo later…

For my links, see /pmr/2013/02/03/topics-and-links-for-my-talk-on-semantic-web-for-materials/

I may mention http://rd-alliance.org/ (ANDS is a member)

I hope to show a movie of http://en.wikipedia.org/wiki/Perovskite with structure http://en.wikipedia.org/wiki/Perovskite_%28structure%29 and possibly http://en.wikipedia.org/wiki/Calcium_titanate

Posted in Uncategorized | Leave a comment

Martin Hall, VC Salford, justifies the RCUK policy and his insistence on CC-BY

Posted on February 14, 2013 by pm286

[Copied from http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal . PMR: I find this very clearly argued and very compelling]

The Finch Report says a good deal more about Green, and repositories, than your representation suggests but, of course, anyone can judge for themselves by looking at the report itself, and particularly Chapters 8 and 9. But yes, I do disagree with your view that mandated “green deposit” will achieve a situation in which all publication costs are met upfront, allowing all research results to be freely available on the principle that knowledge should be a “nonrivalrous good”. The problem with your position is that it preserves, and will prolong, a status quo in which repositories contain combinations of metadata, a limited number of full-text deposits, non-searchable PDFs and (some) published-version copies which have been migrated from publishers’ web sites. This approach also encourages for-profit partnerships between publishers and some academic societies who want to preserve high margins in order to fund other activities (of course, not all learned societies take this view – witness the excellent position being taken by the Royal Society). For example, in her presentation to the symposium organized by the Academy of Social Sciences in December, Felice Levine was quite clear that the American Educational Research Association would lobby against a system of up-front APCs because it would damage the ability of the AERA to make sufficient margins to fund their business activities. While I have nothing against the AERA and similar organizations running as businesses, their use of licencing is, in effect, a hypothecated tax on the distribution of knowledge. This seems to me inconsistent with the principle of knowledge as a nonrivalrous good.

My argument (which is not the same as the Finch Group’s position, and would probably not be shared by some on the Finch Group) is that we need to steer towards conditions in which the copy-of-record is freely and openly available because full APCs have been met upfront. This does not necessarily mean that APCs need be high and, across the full range of Open Access journals, they are already negligible. I believe that competition between for-profit publishers for “gold” will drive down APCs as long as cartels are avoided (but of course this has to be an assumption). “Double dipping” through hybrid approaches remains a risk but – as the reaction against Elsevier’s profit margins showed – academics can walk away from journals that maintain unacceptable combinations of licencing and gold options; without academic authors, publishers are emperors without clothes. In my view, the true digital revolution is one of volume and interoperability; automated data and text mining of freely searchable copy-of-record text will be essential to the “semantic intelligence” that will be the research paradigm of the near future. Institutional repositories will play a key role as digital archives. I set this argument out in my presentation to the Westminster Forum (www.salford.ac.uk/vc).

What I think is becoming clear, post-Finch and particularly in the debate about RCUK policies (which are not the same as the Finch recommendations) is that one approach does not suit all genres of research. There needs to be more work on models for the Arts, Humanities and Social Sciences, including considerations of variants to CC-BY licences (which, I’ve realized, cannot be met by using CC-BY-NC), and approaches to monograph publications. The current debates are valuable in this respect.

Martin

Martin Hall

Vice-Chancellor | Office of the Vice-Chancellor and Registrar

The Old Fire Station, The Crescent, University of Salford, Salford M5 4WT, United Kingdom

t:
+44 (0) 161 295 5050

martin.hall@salford.ac.uk |
www.salford.ac.uk

www.salford.ac.uk/vc

http://www.corporate.salford.ac.uk/leadership-management/martin-hall/blog/

Posted in Uncategorized | Leave a comment

Content-mining: #ami2 and #animalgarden continue to parse scientific PDFs into semantic form

Posted on February 14, 2013 by pm286

PMR has been hacking bugs with MJ and AMI2…

PMR: PDF2SVG should now manage Type0 fonts and we’ve fixed bugs on some character processing. Now let’s look at the Dingbats… AMI, do you have a Dingbat lookup-table?

A: No

P: OK, we’ll have to create one. I’ve tried to find a conversion table to Unicode on the web… but failed. However we’ve found:

(that’s only part because the whole picture might be copyright).

A: so what’s the Unicode for (char)51?

P: It’s a “tick”. I’ll have to look it up in the Unicode

A: I can’t understand glyphs yet. You’ll have to do it for me. There’s about 170. Is that “boring”?

P: VERY. But I will only do the ones I need and hope others will help out. It only has to be done once. I’ve found:

They don’t match up so I’ll have to do them one-by-one… AMI, please would you create a new table “dingbats.xml” in

pdf2svg1/src/main/resources/org/xmlcml/pdf2svg/codepoints/misc/dingbats.xml

A: done.

P: we’ll map char-51 to U+2713 because they seem to be the same (and char-52 to U+2714, etc.). And while I’m watching the cricket I can do some more. We now have good confidence in converting 99.99+% of the characters that are likely to occur in bioscience. There’s more work to do for maths.
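[A minimal sketch of what such a lookup table does, written in Python rather than as the dingbats.xml file itself. Only the two mappings mentioned above (51 and 52) come from this post; the dictionary and function names are hypothetical and this is not the actual pdf2svg code.]

```python
# Hypothetical in-memory stand-in for a Dingbats lookup table.
# Only the two entries mentioned in the post are filled in.
DINGBATS_TO_UNICODE = {
    51: 0x2713,  # CHECK MARK
    52: 0x2714,  # HEAVY CHECK MARK
}


def convert_dingbat(char_code):
    """Return the Unicode character for a Dingbats character code.

    Returns None for codes that nobody has mapped yet, so the caller
    can decide what to do instead of silently guessing.
    """
    codepoint = DINGBATS_TO_UNICODE.get(char_code)
    return chr(codepoint) if codepoint is not None else None


for code in (51, 52, 99):
    converted = convert_dingbat(code)
    label = f"U+{ord(converted):04X}" if converted else "unmapped"
    print(code, label)
```

[Returning None for an unmapped code is a design choice of this sketch: it keeps the boring one-by-one work visible rather than hiding it behind a default.]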

A: Yes. I have a document which fails with CambriaMath. We default to Unicode so CambriaMath-4666 defaults to ETHIOPIC SYLLABLE SHI (U+123A) which looks like:

Is that what you want?

P: No I have to translate each character into the table

A: There are only 10000 so that should take a few milliseconds…

P: SHI
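[The default that AMI describes can be reproduced with Python's standard unicodedata module. This is only an illustration of "treat the font's character code as a Unicode code point"; the function name is hypothetical and it is not how pdf2svg is written.]

```python
import unicodedata


def default_to_unicode(char_code):
    """Treat a font-specific character code as if it were a Unicode code point.

    Returns the code point in U+XXXX notation and its Unicode name,
    or "<unnamed>" when the code point has no assigned name.
    """
    name = unicodedata.name(chr(char_code), "<unnamed>")
    return f"U+{char_code:04X}", name


# CambriaMath-4666 from the dialogue above
print(default_to_unicode(4666))
```

[For 4666 this gives U+123A, ETHIOPIC SYLLABLE SHI, which is why the default is wrong for a maths font and a hand-built table is needed.]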

Posted in Uncategorized | 1 Comment

Content-mining : #animalgarden and #ami2 read Dino’s PeerJ article; is it technically OK?

Posted on February 13, 2013 by pm286

#animalgarden is mining the content in Open Access articles. [They don’t know what “Open Access” means [PMR: nor do I] so they are using CC-BY ]. They’re using Mike (Dino) Taylor’s paper as it’s about dinosaurs, giraffes and okapis (Chuff: Yeah!). They’ve agreed that the paper is legally minable. Now they are looking to see whether it’s technically minable (whether the typography is tractable). #animalgarden has found some papers “very hard work” because of the typesetting.

AMI2: I’ve managed to read and translate (to SVG) the paper without errors

Chuff: how long did it take?

A: about 30 seconds for 40 pages. The images took the most time. There were no errors.

C: so the result is correct?

A: We can’t say that. All we know is that PDF2SVG read all the characters without throwing exceptions or LOGging errors. We know that for all the known fonts we can automatically convert the characters and for the rest we assume Unicode.

C: How do we check?

A: A human has to read the PDF and compare it to the extracted SVG. If s/he finds visual differences then the characters have been wrongly converted.

C: That’s OK. PMR can do it in 30 seconds.

A: No. PMR’s a human and humans are about 0.001 times as fast as me. So it could take hours to compare the paper.

C: And they get bored and make mistakes. PMR’s always making typos.

A: what does “bored” mean?

C: It means they get slower and slower, start making mistakes, wander off, drink beer, watch cricket, and stop altogether. Maybe we can get PMR to do just a few pages. It’s called “annotating” and “creating a gold standard”.

PMR: I have read the start of the PDF

P: and I have also read AMI’s output

P: They seem to be identical

A: Those are all ANSI characters. They are likely to be correct for most fonts. Check some characters above codepoint 127. Here’s Table 1:

A: What page is it on?

C: I don’t know. There are no page numbers.

PMR: Yes there are. The page says “2/41”

A: Is that a convention I should know?

P: Yes. But I think only PeerJ does it.

A: So it is more work for me.

P: Yes. Every publisher does it differently.

C: Doesn’t that confuse people?

P: Yes. The publishers like to be different from each other. There is no technical reason. It serves no scientific purpose and makes things harder and more fragile.

A: How many page syntaxes do I have to learn?

P: Several hundred at least.

A: That will take a long time.
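[A sketch of what "learning a page syntax" might look like, assuming each publisher's convention can be captured as a regular expression. Only the “2/41” form comes from this post; the second pattern and all the names are hypothetical examples.]

```python
import re

# One pattern per page-number convention. Only "2/41" is taken from the post;
# "Page 3 of 10" is a made-up second convention to show how the list grows.
PAGE_LABEL_PATTERNS = [
    re.compile(r"^\s*(?P<page>\d+)\s*/\s*(?P<total>\d+)\s*$"),
    re.compile(r"^\s*Page\s+(?P<page>\d+)\s+of\s+(?P<total>\d+)\s*$", re.IGNORECASE),
]


def parse_page_label(text):
    """Return (page, total) if the text matches a known convention, else None."""
    for pattern in PAGE_LABEL_PATTERNS:
        match = pattern.match(text)
        if match:
            return int(match.group("page")), int(match.group("total"))
    return None


print(parse_page_label("2/41"))
print(parse_page_label("Figure 2"))
```

[Every new publisher convention means another entry in the list, which is the extra work AMI is complaining about.]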

P: And it’s boring. Anyway, what did you find, AMI2?

A: I get the following:

PMR: that looks identical! What are the Unicode characters > 127?

I have looked up 8805 (decimal) in http://fileformat.info . It has all the Unicode codepoints with typical glyphs

PMR: The AMI2-pdf2svg distrib contains over 1000 of the commonest Unicode characters and U+2265 is included.
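[The check for characters above codepoint 127 can be sketched with Python's standard library. The sample string is made up for illustration (it is not the text of Table 1) and the function name is hypothetical.]

```python
import unicodedata


def non_ascii_report(text):
    """List each distinct character above code point 127, in order of first
    appearance, as (decimal code point, U+XXXX notation, Unicode name)."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [
        (ord(ch), f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in seen
    ]


# Hypothetical sample containing code point 8805 (decimal)
sample = "mass \u2265 10 kg"
for row in non_ascii_report(sample):
    print(row)
```

[Decimal 8805 is hexadecimal 2265, so the character looked up above is U+2265, GREATER-THAN OR EQUAL TO.]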

[long pause]

P: I have been reading the paper and the table on page 30 looks strange:

P: Now I have checked with the PDF and it’s different:

P: What’s happened?

A: Those ticks are Dingbats.

C: I don’t understand. A dingbat is a stupid person.

P: It can be (http://en.wikipedia.org/wiki/Dingbat_%28disambiguation%29 ). But Wikipedia also says (http://en.wikipedia.org/wiki/Dingbat ) …

A dingbat is an ornament, character or spacer used in typesetting, sometimes more formally known as a printer’s ornament or printer’s character.^{[citation needed]} The term continues to be used in the computer industry to describe fonts that have symbols and shapes in the positions designated for alphabetical or numeric characters.^{[citation needed]}

A: Do I have to know about Dingbats?

P: Yes. http://en.wikipedia.org/wiki/Zapf_Dingbats are one of the “Standard Type 1 Fonts (Standard 14 Fonts)”:

A: So I pick Dingbat-3. Is that right?

PMR: I don’t know. It seems murky… let’s close this blog and resume later

Posted in Uncategorized | 7 Comments

#animalgarden review PeerJ articles – 1

Posted on February 13, 2013 by pm286

#animalgarden (AMI2, Sleepless and Chuff) are excited: they are meeting in Melbourne. Chuff the @okfn_okapi has told them that people are interested in biodiversity. Chuff says that’s about animals and plants. PMR tells them that there’s a new journal, PeerJ, which is Open – free as in speech. PMR thinks it’s a Good Thing. He’s asked them to review it and say what they think. This is the first part (Open) – typesetting is the second.

Chuff is the OKF Okapi and is interested in Openness. AMI2 is a kangaroo who can interpret papers to a machine. For content-mining. AMI2 doesn’t understand humans and has no emotions. Chuff will have to explain things.

C: This article has both HTML and PDF versions. The PDF says

C: so I can tell it is Open-Access

A: Can I tell it’s Open Access?

C: Can you read the words?

A: There are no words in a PDF. Only characters.

C: Can you guess the words?

A: I can read:

A: The y-coordinates mean they are on a single line. This gives two words “OPEN” “ACCESS”. Is that OK?

C: Possibly. Can you find anything with “CC”?

A: I have found:

A: I have guessed the spaces and this gives the words “Creative” “Commons” “CC-BY” and “3.0”. Is that OK?
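[One way to guess lines and words from positioned characters, as a rough sketch. It assumes each character arrives as (x, y, width, char) and that a gap wider than half a character width is a space; both the data shape and the threshold are assumptions for illustration, not AMI2's actual method.]

```python
from collections import defaultdict


def characters_to_words(chars, gap_factor=0.5):
    """Group positioned characters into lines of words.

    chars: iterable of (x, y, width, char) tuples.
    Characters sharing a y-coordinate (rounded to 0.1) form one line.
    Within a line, a new word starts when the horizontal gap to the
    previous character exceeds gap_factor times that character's width.
    Lines are returned in ascending y order.
    """
    lines = defaultdict(list)
    for x, y, width, ch in chars:
        lines[round(y, 1)].append((x, width, ch))

    result = []
    for y in sorted(lines):
        row = sorted(lines[y])  # left to right by x
        words = []
        current = row[0][2]
        for (prev_x, prev_w, _), (x, _, ch) in zip(row, row[1:]):
            gap = x - (prev_x + prev_w)
            if gap > gap_factor * prev_w:
                words.append(current)
                current = ch
            else:
                current += ch
        words.append(current)
        result.append(words)
    return result


# Made-up coordinates: ten characters on one line with one wide gap
chars = [(i * 6.0, 100.0, 6.0, c) for i, c in enumerate("OPEN")]
chars += [(30.0 + i * 6.0, 100.0, 6.0, c) for i, c in enumerate("ACCESS")]
print(characters_to_words(chars))
```

[The threshold is the fragile part: tightly set or justified text would need a different gap_factor, which is one reason the words remain guesses.]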

C: YES!! That means OKD-compliant!

S: who’s that?

C: That’s Siouxsie! She’s from creativecommons.org.nz. I met her at #kiwifoo

S: Wow! PMR’s with Creative Commons as well.

C: so where’s the monkey?

http://www.zazzle.com.au/kids_peerj_t_shirt-235730159800116640

S: What a friendly monkey! What’s its name?

PMR: I don’t know.

S: We already know lots of Open Animals. Chuff, Gulliver, Tux, GNU, Python. Do they make stuffed blue monkeys?

PMR: don’t know.

S: well they should. We want more friends.

C: Who wrote the article?

A: (after working out the characters). Michael P Taylor // dino@miketaylor.org.uk

C: That’s because he loves dinosaurs. He fights for Openness every day.

S: Perhaps we should get a toy dinosaur.

PMR: NOT a Barney, please!

C: So is there anything about Okapis in the article? Okapis are giraffids! Spelt O-K-A-P-I

A: (searches) Yes. (Quotes from article)

Toon A, Toon SB. 2003. Okapis and giraffes. In: Hutchins M, Kleiman D, Geist V, McDade M, eds.

Grzimek’s animal life encyclopedia, Vol 15: Mammals IV. second edition, Michigan: Gale Group,

Farmington Hills, 299–409.

C: Wow! Maybe Dino Taylor has a copy. It’s over 100 pages!

PMR: let’s find out what AMI2 has discovered about the typesetting!

Posted in Uncategorized | 6 Comments

#rds2013 Principles for Managing Research Data

Posted on February 13, 2013 by pm286

These are thoughts for my 15-minute session at #rds2013. Feel free to comment. I’d particularly like to know of any F/OSS that manages timed slide presentation on Windows so I don’t have to use Powerpoint. I have 900 seconds including 5 at each end for stepping up and stepping down. I shall refuse to be introduced – it’s all in Wikipedia (http://en.wikipedia.org/wiki/Peter_Murray-Rust ). It’s therefore essential to have timed transitions, a la PechaKucha. The cryptic notes here will be elaborated in each detailed blog post. The order is random and the numbers of principles will change.

Management of data is a state of mind, not a process or technology. Follow Ranganathan.

The world owns the data, not you.

Use CC0. (see Ross Mounce’s work on licences).

The data you work with is provided by the universe of things and ideas. It is yours to nurture, refine and evangelize, but not yours to own.

You do not fully understand the potential of your data.

Encourage downstream use. Data increases in value with refinement, subtraction, and addition. Example: The historic observation of a Chinese eclipse has been used to calculate the coefficient of dynamic viscosity of the earth’s mantle.

Walled gardens destroy the potential of data and innovation.

Walled gardens, however benign, control access and seriously limit innovation and re-use. You cannot get all of the data out for Open re-use. Examples: Sciverse, CCDC crystallography, Reaxys, Chemical Abstracts. Now, Mendeley. Will Figshare remain unwalled for long?

#animalgarden have made a 3.5-minute video (http://vimeo.com/34323486; there won’t be time to show it; it will exercise all your emotions).

Build the memex for data. (http://en.wikipedia.org/wiki/Memex )

Manage data without noticing. Sourceforge/Github capture our code with zero effort, because we want to use them, not because we have to. We can do this for data. Turn instruments, laboratories and authoring systems into memexes. If you have to “put it in the repository” the system has failed.

Revere the long-tail.

Most data is in the long-tail of science, collected in individual laboratories on unique protocols and strange instruments. This can only be tackled by giving scientists toolkits for informatics and allowing them to build the solutions.

Text, data, audio, images, movies are different views of “data” – scientific truth.

They must all be free. The idea that scientific images, video, audio “belong” to people or institutions must be challenged. They are all CC0.

Mentor young people in data and let them mentor you

Young people have a different, fearless attitude. I’ve seen them attempt the impossible. Sometimes they succeed. (Sophie Kershaw (doctoral student) has been mentoring Oxonians in how to manage data.)

The problems of data are people, not storage or bandwidth.

A computational chemistry program solves Schroedinger’s equation. If you publish the results in full the company will send the lawyers.

I can mine 500,000 reactions from patents (and my colleague Daniel has). Elsevier won’t let me mine any. Nor will ACS. Or the others. These restrictions destroy imaginative thought.

Develop Patterns for Data

Cameron McLean has shown me how the architects have patterns for building. These were adapted to patterns for software. He’s adapting these to research. We don’t yet have patterns for data.

Honour Tim Berners-Lee’s 5 stars of Linked Open Data.

Yes. Open Data, Open standards, Open links and Open minds.

Work collaboratively.

Share tools and ideas. Use hackfests. The library should run hackfests. Not for academics. For everyone. You would be surprised who you get.

Computing and Bioscience have got it as right as possible.

Emulate them. Use their tools. Create communities like theirs.

Build your own tools, don’t buy anything.

“Rough consensus and running code” built the Internet and the web. Build, test, teardown, rebuild. Building teaches you. Buying things numbs your imagination. Renting information is even worse.

Get out more.

Wikipedia was built by non-academics. Academics sneered (and some still do). Wikipedia is the future of scientific information. Steve Coast built OpenStreetMap. Galaxyzoo brought in hundreds of thousands of citizens. Academia neglects the #scholarly poor – non-academics (everywhere) facing daily paywalls.

Campaign for change.

Read and honour Aaron Swartz. Mail your representatives. Blog. You don’t have to go to jail if enough people protest.

Use domain repositories

Institutional repositories don’t work – for science and for data. We must create our own. Commercial ones will be constraining and controlled.

Start bottom-up Communities.

Wikipedia is a bottom-up community. It creates not only knowledge but models of governance. We’ve created the Blue Obelisk for chemistry.

PMR: has been involved in all of the above and will no doubt think of more.

Posted in Uncategorized | 6 Comments

#rds2013 0. “Managing research data” at Columbia NY. I am not constrained in what I say.

Posted on February 12, 2013 by pm286

I’m setting out what I want to present at Columbia #rds2013 in a series of posts. I’ve got 15 mins to present- this is good discipline and it means I have to work very hard to prepare – I can’t do a stream-of-consciousness/conscience. So I’m preparing ca 10-15 titles. This post is not a title but sets out the basics.

When I was invited to present – by Kathryn Pope – I jumped at the chance. This is clearly a high-profile meeting and comes at a very good time in my thoughts and actions. So I said yes.

But then it became clear that I would be sponsored by Elsevier (who are co-sponsors of the meeting). I have taken a public stance against Elsevier’s practices (along with many thousands of others) and refuse to referee, publish or act as editor for Elsevier publications. I have additional reasons from Tim Gowers:

Elsevier have actively built walled gardens around science (Sciverse, Scopus) and act as kingmakers, controlling scientific information practice and hence thought.
Elsevier have systematically dragged their heels on content-mining. Others have had similar experiences. They intend to create licences, unilaterally, without which their material cannot be mined.
They set ultra-restrictive clauses on what universities and their staff can do with subscription content. No indexing, no crawling.
They have wasted my time over 3 years with fobbed off promises. I have documented these and reported to the Hargreaves process.

Accepting sponsorship from Elsevier would therefore be double standards, so I declined the invitation. However Kathryn offered to pay independently of Elsevier so I have accepted with many thanks. This gives me the freedom to say what I feel rather than feel constrained by being sponsored by an organization of which I disapprove.

To do justice to this topic will require hours, not minutes. I shall therefore post my titles and then annotate each with a blogpost (I hope to finish before the meeting). I’ll then create approx. 15 slides with detailed internal timed transitions so that the whole will run automatically and last exactly 900 seconds (similar to a PechaKucha or Ignite). The next slide should be the titles…

Posted in Uncategorized | 1 Comment

Thoughts on leaving Aotearoa

Posted on February 12, 2013 by pm286

I’m flying back to AU from http://en.wikipedia.org/wiki/Aotearoa (NZ). I’ve got 15 mins of free wifi left (this blog will be randomly updated).

Here’s an example of a species (unknown to me, but no doubt Wikipediable) at Muriwai on 2013-02-10. A piece of information. By itself of little value. But maybe of value when integrated into a larger Open database. Maybe the number of limbs is unusual, or the colour? Who knows. But recording the whole of our planet’s natural history must surely be a small contribution to preserving it.

Here are two humans (Fabiana and Cameron) who looked after me so well – drove me to the gannet colony. Fabiana is already working out bits of my future (nuff said here).

Posted in Uncategorized | Leave a comment

I request Elsevier to make experimental data CC0 and release crystallography from CCDC monopoly

Posted on February 11, 2013 by pm286

I have sent the following email to Elsevier’s Director of Universal Access (“very passionate about expanding access to information”). In summary I request that Elsevier publish all supplementary information (past, present, and future) and that by 27th Feb she gives me an unequivocal commitment that this has happened.

Dear Director of Universal Access,

I have been invited by Columbia University, NY, to give an opening keynote at their “Managing Research Data” symposium on Feb 27th (http://library.columbia.edu/news/libraries/2013/2013-1-31_Research_Data_Sympsosium_Announced.html ). Elsevier is among the sponsors, though (at my request) not of me. Among the recommendations I shall be making is that all primary research data should be published under CC0 (or equivalent) licence which allows anyone anywhere to do anything with it for any legal purpose without permission or negotiation, to re-use, modify, copy and repost. In my mind this is what “Universal Access” means.

This letter is to ask Elsevier, through your department, to make all supplemental data accompanying Elsevier publications, retrospective and future, available under CC0. I will treat all mail from you as public and announce your reply/s at the symposium.

I will restrict my examples to small-molecule crystallography though the argument extends to all primary scientific data (observations, instruments, computation, etc. in all disciplines). Crystallography, through its International Union (IUCr) has pioneered the imperative to publish all primary data (diffraction, cleaning, structural solution and refinement – and more). FWIW I am privileged to sit on the IUCr’s COMCIFS committee which creates the protocols for this.

Note that other major publishers (Nature, Acta Crystallographica, ACS, RSC, etc.) have no problem making their data available in the way I have described.

This publication enables many things including:

The verification/validation of the experiment being reported. There are many ways of doing this including reprocessing the data with new algorithms, comparison with other data sets, recomputation, etc.

The re-use of the data to build knowledgebases both in and outside the domain. Crystallography has a century of showing the value of the re-use of data and its interpretation.

Creating of specialist services for alerting scientists to the publication of data.

As an example Nick Day in our laboratory collected 200,000 structures from the primary literature in http://wwmm.ch.cam.ac.uk/crystaleye. This resource, published under PDDL (equivalent to CC0) contains several features not found elsewhere including bondlength browsing and fragment browsing. In particular it has a unique feature of linking back to the original literature.

There are no Elsevier data in this, because Elsevier makes it impossible. Elsevier currently hides this behind a 42 USD paywall (Polyhedron) or – in a closed agreement with The Cambridge Crystallographic Data Centre (CCDC). I have no details of this agreement (CCDC refused to respond to my FOI request) but it gives a monopoly right to CCDC to be the holder of this data. CCDC sell a derivative product and only allow minuscule amounts of the data (ca 25 structures per year) on request. This is completely inadequate for what modern information-based scientists wish to do. It leads to bad science as the primary data cannot be reviewed and cannot be incorporated in new artifacts (CCDC forbid re-use of the data even though it is the primary scientific record).

I am therefore asking you to do the following:

Announce that all supplemental data accompanying Elsevier papers IS licensed as CC0.

Require the CCDC to make all primary CIF data from Elsevier publications CC0. (The author’s raw deposition, not CCDC’s derivative works)

Extend this policy to all other experimental data published in Elsevier journals (in chemistry this would be records of synthesis, spectra, analytical data, computational chemistry, etc.). When you agree to this I can give public advice as to the best way to achieve this.

I assume your division has effective power to do this on the timescale I have indicated. Note that in our past discussions you have used phrases such as “let’s talk to your librarians”, “we are reviewing this internally”, etc. Any phrases of this sort will be interpreted as a refusal to make data CC0. Only a clear public commitment to make raw author data CC0 with target dates (e.g. within a month) and an unequivocal public letter to CCDC requiring CC0 for raw CIFs can be regarded as Universal Access to raw author data.

—
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069

Posted in Uncategorized | 5 Comments

@okfn_okapi from #kiwifoo

Posted on February 11, 2013 by pm286

Greetings,

I have had a great time at #kiwifoo – I travel in PMR’s backpack with my head looking out. I’ve made lots of friends but haven’t met any kiwis (we wanted to go yesterday to see them but the ferries don’t run on Mondays. So next time, and there will be one).

My biggest problem is that people don’t know what sort of animal I am. They ask “is it a cross between a donkey and a giraffe?”. Oh dear. Imagine someone not recognising a GNU. Flanders and Swann wrote a song http://en.wikipedia.org/wiki/The_Gnu – from http://www.nyanko.pwp.blueyonder.co.uk/fas/hat_gnu.html

I’m a Gnu, A g-nother gnu
I wish I could g-nash my teeth at you!
I’m a Gnu, How do you do
You really ought to k-now w-ho’s w-ho.
I’m a Gnu Spelt G-N-U,
Call me Bison or Okapi and I’ll sue

Honestly the idea that anyone could mistake an Okapi for a Gnu. Now RMS has popularised the GNU http://en.wikipedia.org/wiki/GNU , so we need something for OKAPIs.

I’ve got a fantastic laser-cut label:

Surely humans can read that? But maybe they’re not looking in the right direction. So I asked Janine Torkington for another label for a different direction:

Humans, can you read that? I will wear it all the time as a badge of honour.

And I’ve had lots of ideas for ChuffActivities at hackfests. Here are some:

An Okapi song. Why should only the GNU get one?
MakerChuff (or Make-a-Chuff). Design your own chuff (paper, material, subtractive or additive technologies).
PhotoComics. Shoot your own #animalgarden photocomics. We’ll supply the animals and the camera (though you’ve probably got some anyway). You provide the text in the speech bubbles.

Posted in Uncategorized | Leave a comment

petermr's blog
