Lunatick Scrimt

#quixotechem #okfn

The Sprint methodology for software is a popular way of developing flexible software quickly and well. A Scrum is a similar approach, with colourful team roles such as chickens and pigs:

All roles fall into two distinct groups—pigs and chickens—based on the nature of their involvement in the development process. These groups get their names from a joke [6] about a pig and a chicken opening a restaurant:[7]

A pig and a chicken are walking down a road. The chicken looks at the pig and says, “Hey, why don’t we open a restaurant?” The pig looks back at the chicken and says, “Good idea, what do you want to call it?” The chicken thinks about it and says, “Why don’t we call it ‘Ham and Eggs’?” “I don’t think so,” says the pig, “I’d be committed, but you’d only be involved.”

So the “pigs” are committed to building software regularly and frequently, while everyone else is a “chicken”—interested in the project but really indifferent because if it fails they’re not the pigs—that is, they weren’t the ones that committed to doing it. The needs, desires, ideas and influences of the chicken roles are taken into account, but are not in any way allowed to get in the way of the actual Scrum project.

Scrum also has:

  1. the “ScrumMaster”, who maintains the processes (typically in lieu of a project manager)
  2. the “Product Owner”, who represents the stakeholders and the business
  3. the “Team”, a cross-functional group of about 7 people who do the actual analysis, design, implementation, testing, etc.

Now I have launched 3 projects which have some of these characteristics but are not really sprints or scrums – so, like Carroll’s “frumious”, I’ll call them “scrimt”s. The features of a Scrimt are:

  • You propose an impossible task within an absurdly short time. Hence “lunatick” [1]
  • You persuade/trick/seduce… a number of collaborators into sharing your mad vision. These people are making whatever contribution they do out of the goodness or madness of their hearts.
  • It starts from nothing and creates a working, convincing, sustainable prototype within a MONTH.
  • You pick a fixed date when the task has to be completed. This date is really fixed. For a real thrill, the deadline can be the presentation of a working system at a conference.
  • You do everything in full view of the internet.
  • There is no money. (This is now a universal truth anyway, but just to reiterate: this is an unfunded project with no monetary reward.)
  • You are the only pig and scrum-master.
  • You can use whatever resources on the Net that you and your chickens can create/borrow/blag.

     

My three projects so far are:

1997-12 SAX. This is the most successful short collaborative project ever. In this I was the chicken and persuaded David Megginson to be THE PIG. He writes:

The process of developing SAX itself started on Saturday 13 December 1997, mainly as a result of the persistence of Peter Murray-Rust. Peter is the author of the free Java-based XML browser JUMBO, and after going through the headaches of supporting three different XML parsers with their own proprietary APIs, he insisted that parser writers should all support a common Java event-based API, which he code-named YAXPAPI (for Yet Another XML Parser API).

Peter initiated a discussion with Tim Bray (the author of the Lark XML parser and one of the editors of the XML specification) and David Megginson (the author of Microstar’s Ælfred XML parser) about coming up with a single, standard event-based API for XML parsers. The design discussion took place publicly on the XML-DEV mailing list, and many people contributed ideas, comments, and criticisms (see below). At the end, Jon Bosak, the founder of XML, kindly allowed SAX to use his xml.org domain for the Java package name org.xml.sax.

David co-ordinated the discussion and wrote the proposal for the interface, together with its Java implementation. The first draft interface — together with front-end drivers for the four major Java XML parsers — was released on Monday 12 January 1998, one month less a day after the beginning of the discussion. This could be a record for an industry initiative (especially considering that SAX was finished under a declared state of emergency, during the worst ice storms in Canadian history, when much of Eastern Ontario and Quebec were without power).

The first draft of SAX received much attention, and over several months, users identified shortcomings and suggested improvements. Over a long period of discussions and pre-releases, the XML-DEV community developed SAX 1.0, which was released on Monday 11 May 1998, less than five months after SAX was first proposed.

Every week David sent out questions for the XML-DEV community to respond to. And they did – a hundred of them. I am sure the overall design was David’s, and it was very clean and compelling. But that’s the virtue of a single PIG.

SAX is now in every computer on the Planet. One, hectic, frantic month to prove it could work.
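For readers who have not met an event-based API, here is a minimal sketch of the idea using Python’s standard `xml.sax` module (a later port of the same design, not the original Java interfaces). The `ElementCounter` class, the `count_elements` helper and the sample document are my own illustrative names, not part of SAX: the parser simply calls back into the handler as it meets each start tag or run of text.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start-element events by tag name."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.text = []

    def startElement(self, name, attrs):
        # Called by the parser each time an opening tag is read.
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        # Called for runs of character data; may arrive in several pieces.
        if content.strip():
            self.text.append(content.strip())


def count_elements(xml_text):
    """Parse a string and return a dict of tag name -> number of occurrences."""
    handler = ElementCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


# Made-up sample document, for illustration only.
SAMPLE = "<molecule><atom>C</atom><atom>O</atom></molecule>"
print(count_elements(SAMPLE))  # {'molecule': 1, 'atom': 2}
```

The point of a common API is that the handler never needs to know which parser is driving it.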

2010-08 The Green Chain Reaction. I was THE PIG. I set us the task of analysing about 100,000 experiments in European Patents to see whether the solvents were getting greener over time. I did not have a working system. I had bits. These included a PMR-lashup of the Lensfield system, David Jessop’s patent reading software, Lezan Hawizy’s Chemical Tagger and Sam Adams’s RESTful server code. I had volunteers – half of whom I didn’t know – who did a fantastic job in downloading and testing the software and proving that a distributed system could work. It was a sort of map-reduce – the humans farmed out the maps and then reduced them back to the server. It didn’t help that the chemistry dept switched off the electricity for the two days before and during the demo! (It was planned – but do I read emails?) Dan Hagon did a fantastic job in cloning the server and making the demo work.
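A rough sketch of that “sort of map-reduce”, under loud assumptions: the record layout, the function names `map_patent` and `reduce_counts`, and the three sample records are all invented for illustration and are not the project’s code or data. Each volunteer’s machine would run the map step on its share of patents; the server would run the reduce step on whatever came back.

```python
from collections import Counter, defaultdict


def map_patent(patent):
    """Map step (one volunteer's machine): emit (year, solvent) pairs for one patent."""
    return [(patent["year"], solvent) for solvent in patent["solvents"]]


def reduce_counts(mapped_batches):
    """Reduce step (the server): merge all pairs into per-year solvent counts."""
    totals = defaultdict(Counter)
    for batch in mapped_batches:
        for year, solvent in batch:
            totals[year][solvent] += 1
    return totals


# Made-up records purely for illustration; not data from the project.
patents = [
    {"year": 2000, "solvents": ["dichloromethane", "water"]},
    {"year": 2000, "solvents": ["dichloromethane"]},
    {"year": 2010, "solvents": ["water", "ethanol"]},
]

totals = reduce_counts(map_patent(p) for p in patents)
print(totals[2000]["dichloromethane"])  # 2
```

Comparing the per-year counters is then enough to ask whether a given solvent is becoming more or less common over time.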

Is it sustainable? I hope so. I’ll be presenting it to 200 chemists tomorrow at the Dial-a-Molecule meeting – a Grand Challenge to automate the design and synthesis of chemical compounds. We know that the literature is critical – we have to use patents because the publishers don’t allow text-mining of “their” articles. So science is held back by narrow-minded commercialism. There will be a phase 2 when we have the new Lensfield and the new OSCAR and the new Chemical Tagger.

2010-09 Quixote. This is really barking mad. It’s a project to do in a month with 0 dollars what 20 million dollars failed to achieve at a US National Laboratory over many years: to build a self-sustaining distributed Open knowledgebase for computational chemistry. Again I am THE PIG, but I have active chickens and piglets. A month ago we committed to a working prototype on Thursday (2010-10-21) because Lance Westerhoff happened to be visiting Cambridge. We fixed and published the date. We have about 44 hours left. The plan is to automate the process of:

  • Publishing Open computational chemistry data
  • Crawling the data-sites and downloading
  • Converting to semantic form (Chemical Markup Language – CML)
  • Converting to RDF
  • Uploading the data (of all sorts) to the GreenChainServer (or clones of it).
  • Building a search and indexing system
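To make the “convert to CML, then to RDF” steps in the list above concrete, here is a minimal sketch using only the Python standard library. It is not the Quixote code: the `http://example.org/quixote#` vocabulary, the function names `cml_to_triples` and `to_ntriples`, and the sample molecule are assumptions made for illustration. Only the CML namespace and the `molecule`/`atomArray`/`atom` element names come from CML itself.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
EX = "http://example.org/quixote#"  # hypothetical vocabulary, not the project's

# Made-up sample input, for illustration only.
SAMPLE_CML = f"""<molecule xmlns="{CML_NS}" id="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
</molecule>"""


def cml_to_triples(cml_text):
    """Turn one CML molecule into (subject, predicate, object) tuples."""
    root = ET.fromstring(cml_text)
    subject = EX + root.get("id")
    triples = [(subject, EX + "type", EX + "Molecule")]
    for atom in root.iter(f"{{{CML_NS}}}atom"):
        triples.append((subject, EX + "hasElement", atom.get("elementType")))
    return triples


def to_ntriples(triples):
    """Serialise tuples as N-Triples lines; URIs in <>, everything else as a literal."""
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if o.startswith("http://") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(cml_to_triples(SAMPLE_CML)))
```

A real converter would use agreed dictionaries for the predicates rather than an invented namespace, which is exactly the gap the dictionaries mentioned later are meant to fill.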

In one sense we started from scratch. We didn’t know each other a month ago. There was no project in embryo. Everything needed designing.

But there was a huge substratum of Open source and open practices. There are many Blue Obelisk programs and libraries which have been designed to work as components. And here’s the remarkable thing – because we develop Open components we create much more modular software than the commercial companies in chemistry. They have to work to create lock-in and monolithic applications, while we build the system from at least 5 Blue Obelisk projects. And because we use the natural language of the web (CML/XML, RDF, ANTLR, REST, etc.), and because we don’t have to worry about security, commercial confidence and so on, we can move much faster.

So it’s almost all in place. I’ve just finished bolting in Weerapong’s (very nice) library of RDF generators and CMLComp dictionaries. There is no equivalent anywhere else – this is several years ahead of the game. It makes computational chemistry (traditionally a minefield of FORTRAN punch files) into a set of interoperable components. It’s not finished but it’ll serve the same role as SAX1 did after its month-long scrimt.

And what next? I hope to take Quixote to Materials and India.

But we also intend to have a Bibliography Scrimt, starting now. Can we OPENLY index all the books in the world? Not, perhaps, in a month. But we can prove the concept. So, with Ben O’Steen, Mark MacGillivray, Will Waites, Rufus and some others, we will start to get the chickens together.

Who wants to be THE PIG?

[1] Lunatick does not mean “mad” in this context. It has a special and precise meaning which has a strong analogy to the way we run the Blue Obelisk. You will be doing very well if you work this out!

 

 

 
