AMI is learning how to interpret scientific diagrams. She’s done a lot of text, and vector graphics – now she is moving to images. There’s no simple solution – we have to try a lot of different approaches. Here she is trying to see if letters and number can be recognised by breaking down into straight lines (segmentation). Here’s a typical subset of the image – it might be a (not very good) TWO.
AMI calls up her helper Skippy who has brought her troop of young programmers. They’re not very well disciplined, but Skippy will keep them in order. They divide up the project:
- AMI tells Skippy WHAT to do and in what ORDER. That’s called an ALGORITHM. I’ve asked her to try thehttp://en.wikipedia.org/wiki/Ramer–Douglas–Peucker_algorithm. It sounds hairy but actually it’s very easy – as long as you have Skippy to keep the roos in order. Here Skippy tells the roos they each have to sit on one pixel for the whole time.(This algorithm is so simple that most of the time the roos just have to keep quiet.
- AMI has brought along Ralphy bear who will keep a record (MEMORY (technically a STACK))
- The first and last roos (note each roo has a number (ID)) create a straight line between them. Is this good enough ? We ask Jimmy bear to decide – he’s got a measuring tape and he finds the roo who is farthest from the line. Number 4 is way out – above the limit.
- So now we have to make two straight lines:
Now we’ve got two separate problems 0-4 and 4-12. They’e completely separate and we can tackle each independently. If we don’t keep good track we’ll be in trouble (that’s why we need Ralphy). And if we don’t do things in the right order we’ll also be in trouble (that’s why we need Skippy).
So we’ll divide 0-4 into 0-2 and 2-4 (roos 5-12 are getting bored and Skippy has to bring them to order)
We’ve done 0-2 (#1 is close enough to the line. Here we do 2-4. Is #3 close enough? Bear says yes…
So we go on to 4-12. (“0-3 can go and play cricket” says AMI -“They’ve done their bit”.
But #9 is way out of line. We’ll need to split the lines again.
and now all the roos are close enough to the line.
We’ve split the diagram into 5 lines. This algorithm is widely used for geographical objects such as maps. It’s really valuable to segment them as we can then compute areas, work out which countries touch, etc. We’ll be using it for chemistry and phylogenetic trees and diagrams in general.
You might think the roos had very little to do. But we’ll see examples in the future where each roo needs a measure of independence. They may need to know who their neighbours are, for example. Or whether they have been used. This is Kangaroo-Oriented Computing (KOC).