Software development is hard. Tedious. Frustrating. It usually takes much longer than anyone, including the author, thinks. So what tools and philosophy are useful to the solo – or near-solo – Open Source programmer? Here are some thoughts which you’re welcome to challenge and improve.
Of course we should use good design, code re-use, testing and so on. And, where possible, create the community where synergy adds insight, quality and productivity. But it doesn’t always work that way. Many projects are essentially single-person because they are too individual for communal design, because ideas are not tested, because they’re likely to have constant revision, because you can’t expect someone else to go through the pain on your behalf. Because the idea is half-baked and only the originator can see it through. Later – when the idea is proven – a community may develop. But until then how do we go ahead?
Certain ideas from the world of mainstream software development are proven. Documentation. Unit testing. They work. Yes, we find them tedious and we try to neglect them but at all stages they pay back. So these are taken for granted. But what else – if anything – works for the solo programmer?
Many of the tenets of programming style assume a team of paid developers working in a well-funded project. Or at least a funded project. Whereas solo Open Source is usually done in marginal time – when you really should be asleep. Extreme and agile programming doesn’t work: pair programming? there isn’t anyone to pair with; 40-hour week? yes – 40+40 = 80.
The much maligned waterfall model? (Even though it’s not as mindless as often portrayed.) Not in its classic form (from WP):
- Requirements specification
- Design
- Construction (aka: implementation or coding)
- Integration
- Testing and debugging (aka: verification)
- Installation
- Maintenance
But we do need some sequential discipline, and here’s mine. We start with an Idea. At this stage it looks like a good thing to do. Sometimes it starts from nowhere – sometimes it’s the obvious thing to do, or even the essential. There can be no “requirements” at this stage – it is often pure experimentation.
OSCAR started this way (it wasn’t called OSCAR then – it didn’t have a name). I had the simplistic idea that we could parse chemical language easily. If I had realised how difficult it actually is I would probably have abandoned the whole thing and we would never have had any OSCARs. But I found some regex code, tried it on some papers, got some simple results and started our collaboration with the RSC. We were at least 10 years out-of-date in our approach – but this gave us the opportunity to meet our colleagues in Natural Language Processing and thence to do it properly (SciBorg). But this was all a learning process – we couldn’t have created a properly structured project at the start as we had no idea where we were going.
We try the Idea out but soon need a roadmap. At this stage it’s important to create a Design. Without a design, especially when you are exploring, the code thrashes around. A clear indication of lack of design is difficulty in writing code, whereas with a good design the code can sometimes almost write itself. For me, XML has been invaluable as the design tool. Yes, it’s an end in itself for some of what I do, but it’s also an extremely powerful constraint and guide for programming. I often find that all non-transient data structures can be exported as XML, and that doing so helps the structuring of the code.
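As a rough illustration of XML acting as a design constraint, here is a minimal Python sketch. The `Atom` and `Molecule` classes and the element names are hypothetical, invented for this example; they are not CML and not taken from JUMBO. The point is only that if every persistent field has to round-trip through XML, the data structure is forced to stay explicit and simple.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Atom:
    # Hypothetical fields, chosen only for illustration.
    id: str
    element: str


@dataclass
class Molecule:
    id: str
    atoms: List[Atom] = field(default_factory=list)

    def to_xml(self) -> ET.Element:
        """Export all non-transient state as an XML element."""
        mol = ET.Element("molecule", {"id": self.id})
        for atom in self.atoms:
            ET.SubElement(mol, "atom", {"id": atom.id, "element": atom.element})
        return mol

    @classmethod
    def from_xml(cls, elem: ET.Element) -> "Molecule":
        """Rebuild the object from the XML produced by to_xml()."""
        atoms = [Atom(a.get("id"), a.get("element")) for a in elem.findall("atom")]
        return cls(elem.get("id"), atoms)


water = Molecule("m1", [Atom("a1", "O"), Atom("a2", "H"), Atom("a3", "H")])
xml_text = ET.tostring(water.to_xml(), encoding="unicode")
round_trip = Molecule.from_xml(ET.fromstring(xml_text))
```

If `round_trip == water` fails, either the XML or the class is carrying state the other does not know about, which is usually a sign that the design needs another look.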
But Design without Implementation is dangerous. Far too many protocols are developed without being fully or even partially implemented. “Rough consensus and running code” (IETF) – the design must be implemented. And, as we said, good design supports implementation.
But often things go wrong here. The implementation doesn’t work out. And that’s a clear indication that the design is wrong. It has to be revised. Sometimes it needs simple additions. In the best cases it requires deletions – it is a great feeling when code can be simplified. Non-programmers don’t appreciate this – they look at a simple beautiful design and say – “that’s obvious” – whereas what we know is “how much work it took to make it simple”.
Sometimes the Design cannot be easily rejigged. That means we have to go back to the Idea. Change what we are trying to do. Or maybe even scrap the whole Idea. And that could be months or years down the line. (I spent a whole year wrapping the W3C DOM in a CMLDOM. It was a nightmare. But I had to work through it to show it wouldn’t work. It was “the right way to do it” at the time. Now we know that the W3C DOM is totally broken.)
JUMBO has been scrapped 4 times – we are now on JUMBO-5. It’s taken years. But at least now it works, the Design is clean and stable. It has to be right. Of course each new regeneration teaches us something. But none of this survives in the final product.
So we finally have an implementation. It’s not much use if no-one uses it. So it has to be Disseminated. Giving an iterative progression (with backtrack and restart):
Idea-Design-Implementation-Dissemination
Each step is a lot harder than the preceding one. Maybe half an order of magnitude. It doesn’t always work in this strict order. But generally it’s only at the end that other people can really start to collaborate – because if you do this too early you risk your premature Design or Implementation crashing on them. And that’s not fair.
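To put rough numbers on “half an order of magnitude”: if each step costs about 10^0.5 (roughly 3.2) times the one before, the arithmetic looks like the sketch below. The factor is only the informal estimate given above, not a measurement, and the effort units are arbitrary.

```python
# "Half an order of magnitude" per step, taken as 10 ** 0.5 (about 3.16).
STEP_FACTOR = 10 ** 0.5

stages = ["Idea", "Design", "Implementation", "Dissemination"]

# Effort of each stage relative to the Idea stage (Idea = 1).
relative_effort = {stage: STEP_FACTOR ** i for i, stage in enumerate(stages)}

for stage, effort in relative_effort.items():
    print(f"{stage:15s} {effort:6.1f}")
```

On that estimate Implementation costs about ten times the Idea, and Dissemination about thirty times.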