Extreme Programming for Small Scientists?

We have a new autumn intake of researchers into our Centre and are aware that there are constantly changing demands on the software and informatics skills needed. In Big Science projects there is provision for infrastructure and training and well-developed methodology for the creation of software. We’re working out what is appropriate for “smaller” sciences like chemistry.
“Small” is not inferior – in fact it can have advantages, allowing faster and more diverse activity. But there is less formal support and software usually has to be done in the margins of projects. Software per se has little positive formal reward in science as it is the research in citable papers that matters to the evaluators, regardless of the value to the community.
So how do we develop a good, modern, software environment and lead people to best practices? Today I’ll start with Extreme Programming (XP) which talks a lot of sense (I quote from Wikipedia):

Extreme Programming Explained describes Extreme Programming as being:

  • An attempt to reconcile humanity and productivity
  • A mechanism for social change
  • A path to improvement
  • A style of development
  • A software development discipline

and

… five values are:

  • Communication
  • Simplicity
  • Feedback
  • Courage
  • Respect

and summed up in 12 practices, grouped into four areas, derived from the best practices of software engineering:

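Fine scale feedback

  • Pair Programming
  • Planning Game
  • Test Driven Development
  • Whole Team

Continuous process

  • Continuous Integration
  • Design Improvement
  • Small Releases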

Shared understanding

  • Coding Standards
  • Collective Code Ownership
  • Simple Design
  • System Metaphor

Programmer welfare

  • Sustainable Pace

Now, XP is aimed at teams of developers in commercial organisations creating saleable products against whose success the team can be measured, whereas small scientists often work singly on projects with no positive software metric. So can XP (and it has many critics) be relevant?
I think some of it can. The five values are de facto attributes of a successful Open Source project, so by adopting Open Source (especially on a distributed global model) you have to adopt these. You cannot grow a successful group if they do not communicate, write simple code, give feedback (bugs), have extreme courage (more than XP demands), and have respect. So if we can translate the Open Source values into local practice then we have imported these values, regardless of the size of the projects. Of course there has to be some shared goal, but most research departments probably provide some of that (there will, of course, be some individuals working in such new areas that they have no natural companions).
Of the 12 practices some are only applicable to commercial and quasi-commercial organisations (perhaps in Big Science). So my list is something like:

  • Pair Programming. Where possible someone else should work alongside you some of the time (“mentoring” could be a better word). The second person need not be an expert programmer, but may be good at designing information or act as a rubber duck.
  • Test Driven Development. Absolutely essential. JUnit tests (more in later posts) have revolutionised my programming – I couldn’t live without them.
  • Whole Team. Not easy, as not everyone belongs to the same team, but valuable if possible.
  • Continuous Integration. Yes. Things change so fast that we cannot work with infrequent large releases. Working on Sourceforge we are used to nightly builds and welcome them. Of course the nightly builds have to pass the JUnit tests!
  • Small Releases. Again with the Sourceforge mentality this is standard practice. It does require careful attention to APIs – too many changes and the re-users get disillusioned. For example I decided to refactor the namespace for CML (there were just too many variants) to a single namespace for all time. One of my valued users told me that he would just about tolerate this, but any more and I was dead meat!
  • Coding Standards. Difficult to enforce socially, but happily the tools (at least in Java) implicitly set standards. Tools such as PMD are very useful and we can hopefully standardise on a set of style guides which are not too picky.
  • Collective Code Ownership. Again any Sourceforger gets used to this. But it can be more difficult within a real-world group.
  • Simple Design. Fundamental, but not easy to learn or teach. Like architecture. You know it when you see it! So emulation is a good approach and constant code review by others. The balance between YAGNI and anticipation of requirements is difficult.
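As a rough illustration of the test-first habit in the list above, here is a minimal sketch. It uses Python's standard `unittest` module as a stand-in for JUnit, and the names `molecular_weight` and `ATOMIC_WEIGHTS` are hypothetical, invented only for this example. The tests are written first and state what the function should do; the function underneath is then the simplest thing that makes them pass.

```python
import unittest


# Step 1: write the tests first. They describe the behaviour we want.
class MolecularWeightTest(unittest.TestCase):
    def test_water(self):
        self.assertAlmostEqual(molecular_weight({"H": 2, "O": 1}), 18.015, places=3)

    def test_methane(self):
        self.assertAlmostEqual(molecular_weight({"C": 1, "H": 4}), 16.043, places=3)

    def test_empty_composition_weighs_nothing(self):
        self.assertEqual(molecular_weight({}), 0.0)

    def test_unknown_element_is_rejected(self):
        with self.assertRaises(KeyError):
            molecular_weight({"Xx": 1})


# Step 2: write just enough code to make the tests pass.
# Rounded standard atomic weights for a few elements (illustrative subset).
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(composition):
    """Return the weight for a mapping of element symbol -> atom count."""
    total = 0.0
    for symbol, count in composition.items():
        # A plain dict lookup raises KeyError for unknown symbols.
        total += ATOMIC_WEIGHTS[symbol] * count
    return total


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MolecularWeightTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` after every change is the manual version of what a nightly build would do automatically.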

I add to this some things that seem obvious to the commercial developer but are by no means so natural to the Small Scientist:

  • Use an integrated development environment (IDE). These are now Open and very impressive. We use Eclipse for Java and are going to recommend it to everyone in the Centre.
  • Standardise on libraries. Again that is difficult in some cases, but we can now do this with Blue Obelisk for chemical informatics. After all, some of us have spent enough time developing it!
  • Present software projects to the assembled group even if they are on different projects. Be honest – what went wrong is often more valuable than what went right.

So – if you have read this far – we would be very grateful for any feedback from other Small Scientists in similar positions.


23 Responses to Extreme Programming for Small Scientists?

  1. JamesM says:

    I’m yet to be entirely convinced by XP, but I believe that several practices that have been associated with ‘agile’ development are extremely useful and very suitable for the sort of environment academic groups are in, especially as the alternative is normally no methodology whatsoever.
    In general I believe that to be useful computational chemist/cheminformaticians (or whatever the plural is), the following basic skills are required:
    1. Knowledge of a whips-and-bondage, static, compiled language, like Java or C++. Preferably Java. Ideally Java 5.0.
    2. Knowledge of a decent scripting language, like Perl, Python or Ruby. If they already know Perl, I say let them keep on using it; otherwise, choose one of Ruby or Python (or both).
    3. Knowledge of a text editor like Vim or Emacs.
    4. Shell scripting (probably Bash).
    Additionally, the following practices have changed me for the better, in order of importance:
    1. IDEs, specifically Eclipse. Often, dynamic language proponents claim that the reliance of Java developers on IDEs is due to its verbosity compared to Python and Ruby. This is only partly true. TDD is so much easier with an IDE, as is refactoring. I have found it only mildly difficult to get my colleagues to use IDEs – however, many of them continue to labour with Fortran 77. Hard to know how to help such folk.
    2. Unit testing, and test first development. It takes a lot of discipline, but it’s worth it. Alas, JUnit is now showing its limitations, and the new 4.0 version hasn’t added much – maybe I should give TestNG a go. Getting people to do this is a bit harder than adopting an IDE. It just feels weird to begin with, so people stop. I think you have to basically force them for a bit, or they just go “I tried it, but it’s not for me”. Perhaps waiting for their code to have a serious bug, and then laughing loudly at them will have an effect. I’ve not tried this (yet). At any rate, I would never go back to programming without unit tests or Eclipse.
    3. Version control, using Subversion. Very useful, if only for collecting together different versions of programs that normally get scattered across several hard drives. I keep meaning to start using Cruise Control, but I’ve yet to get round to it. Getting people to adopt Subversion is probably hardest of the three – it involves having to remember new commands, especially if they’ve not been persuaded to adopt Eclipse. Additionally, even using Subclipse, I regularly manage to irrevocably mess up the projects I’m working on, which is more than a little annoying.
    4. I find FindBugs immeasurably superior to PDE or checkstyle. I found PMD required a lot of tuning.
    However, to use these tools effectively, I think a certain level of basic programming ‘education’ is needed. It’s one thing to know JUnit, another to be able to do TDD, for example. Knowledge of refactorings and ‘code smells’ is also vital. And effective object oriented design just can’t be done without an awareness of design patterns. Unfortunately, I know of no quick way to impart this knowledge. Personally, I found that making a huge evil mess of procedural code followed by lots and lots of reading was an effective teacher, but not a very efficient (or cheap) one. University courses don’t seem to teach it, either – it’s hard to teach more than the basics of Java in ten hours.

  2. pm286 says:

    (1) James – this is brilliant – thanks very much. I agree completely on A 1-4: we favour Java, Python, probably vim (since it is likely to be on every Unix) and bash. So 4/4.
    Fortran – most have moved to Fortran90/95. This is forced by history. If you have 1 million LOC then you have to stick with it 🙁
    Absolutely agree about Unit testing. There are few cases where I code first and test after (when I haven’t much clue where I am going so I can’t test it). But I was really pleased to hear one of our (very good) just-graduates say that he was going to build tests for his new project and then write the code. And that was not because I had pushed him… We’ll look into TestNG – hope it runs under Eclipse…
    SVN – we are committed to it.
    I agree with your last para as well. There is no royal road… but the road is better than it used to be (computed GOTOs???)

  3. JamesM says:

    I should also have mentioned Ant as part of the Eclipse/JUnit/svn bundle. Of course, a Wiki really helps with coordinating the use of all this stuff. I’ve found that’s basically as important as the actual programming practices. Then again, getting people to use a Wiki is also a hard sell…
    TestNG has an Eclipse plugin: http://testng.org/doc/eclipse.html
    I like TestNG if only because it has an assertEquals method for arrays, the lack of which drives me potty in JUnit. Unfortunately, I very often would like to compare the contents of an array with a Collection (like an ArrayList), which you still can’t do in either. (I also continue to be baffled by the lack of assertNotEquals in the Java testing frameworks.)
    The thing that gets me about Fortran is that it’s the one language that desperately needs unit testing and refactoring tools the most, but has almost nothing – there’s the Photran Eclipse plugin, and, er, that’s about it.

  4. pm286 says:

    (3) Yes. We have moved to maven over ant – it handles the dependencies more cleanly and it has good support for packaging. I have found that Wikis work when there is a high project motivation – not otherwise.
    I was also frustrated by the lack of array equality and have written my own. I have found that for many classes I have had to write assertEquals(Foo foo1, Foo foo2) – once over the psychological hurdle of deciding this it’s not normally difficult to do.
    Fortran… we have some local superheroes who have actually written a DOM/SAX library for it (FoX). Not complete functionality, but a big step forward.
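The hand-written equality assertions discussed in this exchange (array against collection, and a "not equals" check) might look something like the sketch below. It is written in Python rather than Java, so it is only an analogue of a custom `assertEquals(Foo foo1, Foo foo2)`; the helper names `assert_sequences_close` and `assert_sequences_differ` are made up for this example and do not belong to any testing framework.

```python
import math


def assert_sequences_close(expected, actual, tol=1e-9):
    """Raise AssertionError unless both iterables hold the same numbers within tol.

    Any iterable is accepted (list, tuple, array, generator), so a fixed array
    can be compared directly with a growable collection.
    """
    expected = list(expected)
    actual = list(actual)
    if len(expected) != len(actual):
        raise AssertionError(
            f"length differs: expected {len(expected)}, got {len(actual)}"
        )
    for index, (e, a) in enumerate(zip(expected, actual)):
        if not math.isclose(e, a, rel_tol=0.0, abs_tol=tol):
            raise AssertionError(f"element {index} differs: expected {e}, got {a}")


def assert_sequences_differ(first, second, tol=1e-9):
    """The missing 'not equals': pass only when the comparison above fails."""
    try:
        assert_sequences_close(first, second, tol)
    except AssertionError:
        return
    raise AssertionError("sequences are unexpectedly equal")
```

The point is less the code than the psychological hurdle: once the helper exists, reporting the index of the first mismatch makes failing tests much quicker to read.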

  5. Jim Downing says:

    My take is that we need to focus on education rather than setting requirements of people.
    You’re definitely painting a utopian picture in which chemoinformaticians never bemoan the failings of their tools, but boot up eclipse, improve the tool (or write a new one) according to best software practises and then add it back to the OS commons. But I think you hit the real issue, James, at the end of your post.
    Getting a complete tech novice up to this level of code-fu is a hell of a process and there are dangers in rushing it. e.g. until someone has a decent grasp of OO (which takes experience and time to absorb good examples), their unit testing efforts will waste more time than they save.
    If we want to motivate people to follow this path, let’s concentrate on giving them a learning path that starts with working with command line tools and text editors and has a progression all the way to being top notch programmers.
    How do we do it? By letting them write stinky procedural rubbish at first, but making sure it gets reviewed regularly before the pile gets too big. (At Cambridge) By frogmarching them up to Computing Services for training at the many available opportunities. By giving people code buddies to tap on the shoulder when those small questions [1] arise. By investing in learning materials (howtos, cheat sheets etc) ourselves, and encouraging learners to add to them as they go. I’m yet to be convinced by the merits of pair programming for teaching, but hell, let’s give it a go and see if it works.
    For anyone who’s about to embark on learning how to program: Java isn’t really a whips-and-bondage language at all, except maybe in a pink-fluffy-ann-summers-highstreet-window kind of way.
    [1] One of the small questions I got recently was along the lines of “what’s going on when java gets compiled?”. Which is a corker of a “by the way” question!

  6. The real issue with fortran, even more than legacy code, is performance – nothing else, even now, comes close to the raw numeric performance of Fortran. A lot of that is down to libraries like GotoBLAS, sure, but if you’re doing electronic structure or large MD then fortran’s still the only game in town.

  7. Matt Wood says:

    Whilst many of the XP practices are designed with commercial, end user software in mind, I think that they are equally (if not more) relevant to scientific computing. Whilst a bug in a user’s browser is regrettable, it can mean a lot of wasted compute cycles (and no small amount of embarrassment) in informatic science. I do agree with you Peter, that a lot of pressure is placed on the end results, rather than the process. But I don’t think it’s too hard to argue that there is benefit in taking an agile development stance in small science.
    Similarly to James, I find huge benefits in testing and version control, but would certainly like to sing the praises of continuous integration. CruiseControl is a great tool for Java developers (if a little tricky to set up and maintain for larger projects). I wouldn’t be too hard on Perl though: news of its death has been greatly exaggerated. Very few languages have such a wealth of extra, ready to roll functionality and documentation as can be found in CPAN.
    Rather than enforcing agile practices (‘have you done your commit today?’ – ouch), I think the best way to educate those around you is to lead by example: write your tests first, commit early and often, set up a version control system if one doesn’t already exist. Then evangelize about the courage these practices instill into you and your development process. This gives those around you direct experience of the benefits of things like TDD and continuous integration, and will hopefully provide the inspiration for them to start using them in their own development.
    I think the secret is in creating a research ethos that encourages agile development (and the slightly longer learning curve that lies within).

  8. pm286 says:

    (5) (6) (7). Many thanks to all – keep the ideas coming in. The idea of this post was not so much to “announce” what we are going to do, but to see what ideas others have to offer. If X says “we work in [this community] and by doing [this] we agree that things are better and people are happier” that’s very useful. As indeed is the negative – “we tried to get [this practice] working” and it just didn’t take off.
    We are certainly not going to enforce things, but rather to lead. We shan’t have conventional pair programming but have announced software groups to look after newcomers. I haven’t had any flak yet… We are certainly not going to explicitly (or surreptitiously) check up if people have committed but we want to make sure that they are putting their code up on SVN even if it’s primarily for “backup”.
    Even tests are a problem. They are useful if you know what you intend to do, useful if you know that you are going to require a toolset as you do it, less easy if you haven’t much idea where you are going. Sometimes software is a way of exploring ideas, knowing it may not work out. But in my library routines (JUMBO) once I have finished playing catch-up – I have about 10% of my 2000 tests in @Ignore – then I will test first, write afterwards.
    So we always try to keep a sense of proportion. This isn’t easy for newcomers, as the whole area is bewildering. We are throwing a lot at them; but we hope the group structure will give some reassurance.

  9. Matt Wood says:

    8: I think software support/mentoring groups are a really good idea. For those just getting started in informatics and scientific computing, it’s great to have someone to steer you towards modern best practices. One postdoc I did featured semi-regular reviews of shared code I had written. This was a great help in introducing me to new approaches, and encouraged me to think more about how best to approach a problem. Very useful.
    I would certainly recommend the following:
    1. For frameworks that are anticipated to be reusable (for example, a collection of text mining, machine learning or graphical display routines). I have found unit tests, test first design and version control to be a real help with this kind of code base, especially if they rely on other tools and packages out of your control (as is often the case in Java).
    * SVN is great, and although CVS has fallen from grace recently, it is still a viable option, IMHO. Mac SVN users should check out (boom boom) SVNx, a great desktop app.
    * Clover (or Cobertura) are useful for assessing code coverage.
    * Personally, I found code style tools such as Checkstyle to be very counter-productive.
    * In my experience, IntelliJ IDEA is well worth a try (for novices or experienced Eclipse users). It did suffer from poor SVN support when I last used it, though.
    2. Tools made available primarily over the web benefit greatly from automation, continuous integration (along with the above!).
    * CruiseControl and ant are the big dogs in the Java world, and work really well. I can’t imagine working with large Java code sets without them.
    * Capistrano is a really useful automatic test and deployment system built with Ruby, used a lot in Rails apps.
    * Both functional and acceptance testing are made delightfully simple with ThoughtWorks’ Selenium. This is especially useful for testing web apps that make use of AJAX.
    * Eclipse has some great support for web apps with a built in Tomcat server, a feature missing from IntelliJ.
    From an XP point of view, I think the concept of exploring a domain by coding is interesting. Is there a consensus on the best tools to use for this sort of investigative coding?

  10. pm286 says:

    (9) Thanks again Matt – quite a number of tools that are new to me. Probably Jim knows about them!
    P.

  11. JamesM says:

    Matt – exploring a domain by coding, is that all about spike solutions when designing?

  12. pm286 says:

    (11) JamesM, – please, what are spike solutions?

  13. ojd20 says:

    6) That’s a strong endorsement from Andrew, who is (as far as I can tell) a confirmed Quiche eater like myself 🙂

  14. ojd20 says:

    9. Matt, could you expand on why you found checkstyle counterproductive? I’ve found it helpful in the past, but only after customizing it heavily to be several leagues less draconian than the default.

  15. ojd20 says:

    12. To cut into a thread, Peter, spike solutions are ones which aim to solve a specific technical unknown, and don’t aim to integrate with the other concerns / requirements of the software. They’re most often used in commercial software development as a risk management tool.

  16. ojd20 says:

    9. Tools: My (and most people’s) big grudge with CVS is that it fouls up so badly at the simplest refactoring – changing a class name.
    Thanks for the tips on tools – because we’re going down the Maven road I’m looking at Cargo for automated deployment and Continuum for CI and will blog the good and the bad. Will need a webapp test solution though, so will check out both Capistrano and Selenium.

  17. ojd20 says:

    11. To finish my morning salvo, a negative thought about XP.
    Spike solutions / discovery through coding is (are, if they’re different) techniques I’ve found especially useful when designing architectures for systems. XP appears to have no place for them. When I tackled a Thoughtworks XP-er on this a while back he claimed that they still did that stuff, it was part of “choosing the system metaphor”!
    XP must be a religion – even its high priests have to bend the rules!

  18. Matt Wood says:

    Spikes: As mentioned by others above, spike solutions seem particularly useful in small science. I have found it useful to explore a domain (or new API, new data set, etc) by writing the explorative code as a test case – building up from a simple starting point (load the data, include the API) to some kind of “solution” (simple stats on the data, a sweet use of the API). That way, the new code (although only intended to be temporary) is easy to run and validate, is self documenting, can be checked in, and re-run on new API versions/data as necessary. It’s surprising how often I have referred back to this example source later, despite it not having any defined problem to solve itself. Not sure if it’s the best way to go about things though – XP certainly seems to require a preset target to work well (love that metaphor quote!).
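A spike written as a test case, in the way described above, could be sketched roughly as follows. This uses Python's `unittest`; the data set is a tiny invented CSV (the sample names and values are purely hypothetical) standing in for whatever file or API is being explored. Each test records one thing learned, from "it loads" up to "simple stats".

```python
import csv
import io
import statistics
import unittest

# Hypothetical data standing in for a newly received file; the last row
# deliberately has a missing value, the sort of thing a spike tends to uncover.
SAMPLE_CSV = """sample,value
s1,10.0
s2,20.0
s3,30.0
s4,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


class ExploreDatasetSpike(unittest.TestCase):
    def setUp(self):
        self.rows = load_rows(SAMPLE_CSV)

    def test_step1_data_loads(self):
        # Starting point: can we read it at all, and what columns are there?
        self.assertEqual(len(self.rows), 4)
        self.assertEqual(set(self.rows[0]), {"sample", "value"})

    def test_step2_some_values_are_missing(self):
        # Discovery recorded as a test: empty fields come back as "".
        missing = [row["sample"] for row in self.rows if row["value"] == ""]
        self.assertEqual(missing, ["s4"])

    def test_step3_simple_stats(self):
        # The "solution": basic statistics over the usable values.
        values = [float(row["value"]) for row in self.rows if row["value"]]
        self.assertEqual(statistics.mean(values), 20.0)
        self.assertEqual(max(values), 30.0)


def run_spike():
    """Run the exploratory tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExploreDatasetSpike)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the spike is an ordinary test case, it can be checked in and re-run when the data or the API changes.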
    With regards to checkstyle – I certainly see value in maintaining a coding style across classes, I just found the analysis to be too picky, and the output to be too verbose. Instead of enforcing a particular programming style, I prefer to adhere to a coding style defined by myself and others working on the code. With a small to medium size team, that “optimum” coding style (for all) seems to develop naturally, without conforming to a predetermined “standard”.
    That said, if checkstyle could be configured to work within that organic style – that would be cool, especially for new team members.

  19. 13) I just managed to submit an entire computational physics/algorithms PhD perpetrated without writing a single line of Fortran (you can get a heck of a long way, if you’re not writing the DFT/forcefield code yourself, by writing optimizers and gluecode in another language…) so I’m not just a quiche-eater, I’m at least a sous-chef in the Ersatz Programmer Deli.
    That doesn’t change that problem, though; for high-performance highly-parallel (MPI-type) code what options *are* there, realistically, apart from F90? It’s the compilers and the linear algebra libraries.
    If I were going to advocate anything, it wouldn’t be getting rid of F90 – it would be doing as little as possible in it, and moving all of the non-performance critical code – which can include pretty much everything up to and including conjugate gradients, MD, transition-state search etc, much of the time – out into a separate process, preferably in a high-level language. I like Python (especially with SciPy), as you know, but Ruby’d be a sensible choice too; Java’s still too low-level. That really encourages the kind of exploratory algorithm design I like doing. All of that’s my own prejudices speaking, of course!
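The split suggested above (numerical engine in one process, optimiser and glue code in a high-level language) can be sketched in a few lines. This is only a toy under stated assumptions: the "engine" here is a one-line Python script standing in for a compiled Fortran binary, it evaluates the made-up energy E(x) = (x - 3)^2, and the names `run_engine` and `minimise` are hypothetical. A real engine would exchange input and output files or a richer text protocol.

```python
import subprocess
import sys

# Stand-in for a compiled numerical engine (for example an F90 executable).
# It takes x on the command line and prints "energy gradient" for
# E(x) = (x - 3)^2. This toy function is an assumption for illustration.
ENGINE_SOURCE = (
    "import sys\n"
    "x = float(sys.argv[1])\n"
    "print((x - 3.0) ** 2, 2.0 * (x - 3.0))\n"
)


def run_engine(x):
    """Evaluate energy and gradient in a separate process and parse the output."""
    completed = subprocess.run(
        [sys.executable, "-c", ENGINE_SOURCE, repr(x)],
        capture_output=True,
        text=True,
        check=True,  # raise if the engine exits with an error
    )
    energy, gradient = (float(field) for field in completed.stdout.split())
    return energy, gradient


def minimise(x0, step=0.25, tol=1e-6, max_iter=50):
    """Plain steepest descent; only the energy/gradient call leaves Python."""
    x = x0
    for _ in range(max_iter):
        _, gradient = run_engine(x)
        if abs(gradient) < tol:
            break
        x -= step * gradient
    return x
```

Swapping steepest descent for conjugate gradients or a transition-state search would then touch only `minimise`, not the engine.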

  20. ojd20 says:

    Spikes: so the real question is to what extent you insulate main code from spike code. I usually develop my spikes in the main code, but I wouldn’t recommend it and have been trying to kick the habit. The testing of spike code is usually patchy, ineffective and brittle. Until you understand what can go right, you probably don’t understand what can go wrong, so you can’t write accurate tests.
    Other times I’ve aimed to throw the first one away and developed the spike in a little scratch project and then copied code across to a production project. This is great for working out how to use a new technology or new API, but doesn’t work if the whole point was a big refactoring to introduce a novel design pattern or whatever.
    What else works? Is it possible to characterize the problem in order to select the best approach?

  21. pm286 says:

    (20). Jim asks “What else works?”. I probably can’t help, but what I find myself doing too often is starting to refactor or spike without realising I am doing it and suddenly the whole lot is broken. Since many of my innovations are at the method or class level I often have things like:
    getMolecularWeight()
    followed by
    getMolecularWeight1()
    where the second is the exploration. Then, at least, things can continue with working code. Personally I have yet to feel comfortable with branching (HEAD, trunk, etc.) because I had bad experiences with CVS (couldn’t remember how to check out the branched version). However I expect Eclipse/SVN has made this a lot easier so maybe mini-branches can help this.
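One way to make the side-by-side habit above a little safer is to keep a check that the exploratory version still agrees with the working one. The sketch below is in Python, not the Java of the original methods; the snake_case names mirror `getMolecularWeight()` and `getMolecularWeight1()` but are hypothetical, as is the small weight table.

```python
from collections import Counter

# Rounded standard atomic weights for a few elements (illustrative subset).
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "O": 15.999}


def get_molecular_weight(atoms):
    """Working version: add up the weight of each atom in turn."""
    total = 0.0
    for symbol in atoms:
        total += ATOMIC_WEIGHTS[symbol]
    return total


def get_molecular_weight_1(atoms):
    """Exploratory version: count each element first, then multiply."""
    counts = Counter(atoms)
    return sum(ATOMIC_WEIGHTS[symbol] * n for symbol, n in counts.items())


def versions_agree(atoms, tol=1e-9):
    """True while the exploration gives the same answer as the working code."""
    return abs(get_molecular_weight(atoms) - get_molecular_weight_1(atoms)) <= tol
```

Once `versions_agree` holds over a reasonable set of inputs, the old method can be deleted and the new one renamed, which is a poor man's branch and merge.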

  22. JamesM says:

    There’s a tension to spike solutions. You want to be able to try stuff out quickly, but you also want to be able to make something decent from the results (eventually). What enables the first tends to make the second more difficult.
    On the one hand, I’m orders of magnitude more productive (in a raw code sense) with Python and Ruby than Java. On the other hand, when it’s time to fix the code and make it work, the lack of automated refactoring slows me down. As does the unit testing. Despite the fact that Python and Ruby are less verbose than Java, unit testing still requires *some* scaffolding. Whereas Eclipse basically does all the tedious JUnit scaffold writing for me, there’s no equivalent functionality in PyDev and RDT (the Python and Ruby Eclipse plugins, respectively). So, perhaps surprisingly, I find it faster and easier to do disciplined development with Java than the dynamic languages.
    The upshot is that my spike solutions have a tendency to turn into unit-test-free production code. This is especially pernicious with undergraduate project students, who generally can’t program, don’t want to learn, wouldn’t have time to learn anyway, and probably don’t even want to do a computational project in the first place. Then the time pressures of coming up with working scripts and applications for them to use begin to combine irresistibly with the siren song of the spike-solution-as-production-code.

  23. Pingback: Unilever Centre for Molecular Informatics, Cambridge - petermr’s blog » Blog Archive » Knowledge-limited, not time-limited
