Can machines understand politicians?

Dictated to my computer (please report speakos! – the equivalent of typos)
I was hoping to get some coding done but I have just discovered another brilliant site from the usual suspects at mysociety.org. It’s called The Straight Choice and encourages people to upload election leaflets. Although many election leaflets are produced nationally, I think the majority are done locally, and it is in these that some of the worst distortions of truth and of the political process occur. By collecting and analysing these leaflets it is possible to keep a check on what the candidates say and claim.
[STRAIGHT CHOICE]
Electioneering is a high-stakes game. We, at The Straight Choice (http://www.thestraightchoice.org/about.php), believe that it’s time for that game to become a spectator sport.

The Straight Choice is a real-time election leaflet project. Our ambition is to create a live visualization of the flood of party political leaflets as they are delivered across the country during an election campaign. If you have recently received any election leaflets through your door you can help by photographing or scanning them and uploading the images to our server.

The idea was conjured up in December 2008 at a weekend in Derbyshire, and finally acted upon in Francis’s front living room in Cambridge at the end of April 2009.

The name of the website is derived from a leaflet in the controversial by-election in Bermondsey in 1983 which has become the type specimen of accusations of dodgy campaigning.

The team

This website has been put together by three of the usual suspects who don’t have time or contacts to sell ideas or apply for grants. The code has been deposited at code.google/theelectionleafletproject.

Julian Todd had the job of pestering people about an election leaflet monitoring website after discovering just how crucially important these pieces of paper really are. In 2003 he wrote Public Whip with Francis Irving, which became the input for mySociety’s TheyWorkForYou.com. In 2007 he made undemocracy.com, which applied the same idea to the United Nations General Assembly and Security Council. Julian has volunteered his phone number (0791 6090736), should you be interested in talking to someone about this project.

Richard Pope initially spent a couple of days writing the code for this site and making it look pretty for the European Elections in 2009. During the run-up to the 2010 General Election he has been working full time on the project on a voluntary basis, expanding the website and dealing with the press. He is the brains behind the Planning Alerts project, Groups Near You and StreetWire. In 2005 he made the Election Memory project to record and publicise manifestos of the different parties in the Lambeth local elections.

Francis Irving is the other half of Public Whip. He has done substantial work on mySociety’s WhatDoTheyKnow.com, among other projects, and would specifically like you to sign up for Serious Change.

Thanks

Our biggest thanks go out to the hundreds of volunteers who have kindly uploaded leaflets in recent weeks and months. This site couldn’t exist without you.

Thanks also to mySociety and Democracy Club for their support and publicity in terms of promoting the site to interested citizens across the country.

In terms of project development, Richard Pope has put a lot of free time and effort into developing the code since April 2009, with a cash contribution of £3000 from Julian and use of his server. As the load has increased in recent days owing to the election, the service has been upgraded by Donovan Hide to Amazon S3 technology with a contribution from ScraperWiki.

We have also had the support of a great bunch of volunteers who have helped promote The Straight Choice by delivering leaflets during by-elections. If you’d like to help out, please get in touch; we’d love to hear from you.

Donovan Hide helped get all the images uploaded to and served from Amazon’s S3 service.

FAQs

How do I upload an election leaflet?

You need to upload a photograph of the leaflet in JPG format, then enter a few details about the leaflet. Click here to get started.

I’m a party activist, can I upload a leaflet?

Yes. Just upload a photo of your leaflet and enter one of the postcodes it is aimed at when prompted. Click here to add a leaflet now.

Who can reuse the images?

By uploading an image to The Straight Choice you are allowing free reuse of the image. For this site to make an impact on the way electioneering is conducted it’s important that the dodgy ones get as wide an audience as possible, and this helps it happen. If you would like to make sure that a particular image is properly attributed, please let us know.

Is this project affiliated to or supported by any political party?

No.

Can I reuse the images on my website or blog?

Yes, but please make a copy of the image on your site and link back to us.

What do you mean by a leaflet? Does a letter count?

Any kind of written communication – letters, leaflets, flyers – contains useful information. If in doubt, upload it anyway or get in touch with us.

Can I send you leaflets by post?

Yes. Please send them to:

The Straight Choice c/o Scraperwiki, LSP 2, 146 Brownlow Hill, Liverpool L3 5RF

and mark each leaflet with the postcode (or postcode district) it was delivered to.

Contact

You can get in touch with us via team [at] thestraightchoice.org or phone Julian on 0791 6090736.

Thanks and acknowledgements

Linking leaflets to constituencies is made possible thanks to the TheyWorkForYou.com API

[PETERMR]
Our technology can now analyse written text (using machine learning and other methods), and I would be interested in putting the various textual messages into classification software. The main problem is getting the text out of photographs such as JPEGs and converting it into machine-readable ASCII. There are two main ways of doing this:

  • Crowdsourcing. http://www.thestraightchoice.org/analyze.php shows how a number of volunteers have been analysing leaflets. This is probably the best way at the moment, given that many people will see the value of doing this and have fun doing it. However, I think the primary effort there is to add metadata rather than to extract full text.
  • Machine learning and text mining. In principle it might be possible to do optical character recognition on the leaflets, but I suspect the results will be awful. If the crowdsourcing could be extended to typing up the content of the leaflets (and most of them are so vacuous that the effort is relatively small), then it would be possible to use well-tried classification techniques to categorise the leaflets.

Here is a typical example from Julian Huppert’s leaflet. (I have dictated this and am impressed with the relative speed and accuracy. I cannot bring myself to mouth vacuous statements, so the job is not a large one, and it should be possible to capture the bulk of the text in the leaflet in two or three minutes. Anyone who has a speech recognition system should be able to do this.)

  1. It’s wrong that so many of Britain’s old people can’t afford to heat their homes properly. Julian Huppert and the Lib Dems want a fair deal for the elderly – proper home insultaion [sic] and lower heating bills.
  2. All children should get a fair start in life with the best possible education – regardless of their background. The Lib Dems would cut class sizes and ensure children in poor areas got a helping hand.
  3. One of David Howarth’s biggest successes as Cambridge’s MP was to help lead the campaign to stop Brookfield Hospital from closure. Cambridge residents don’t get their fair share of public services from the government.
  4. Labour’s record of shame.
  • Hiding the truth about the invasion of Iraq
  • Failing on climate change
  • Introducing student fees despite promising not to
  • Trying to close Cambridge’s Brookfield Hospital
  • Doubling tax on the lowest earners

I would use these sentences as training material for a classification program such as Classifier4J, a simple but extremely effective Java program. For example, the first sentence could be classified as belonging to “elderly” and “energy”. By feeding in a few tens of such sentences we train our machine to recognise the type of language used. If we then feed in a new sentence, the classifier will tell us whether it is about energy or the elderly or neither. Of course it sometimes makes mistakes, but it is never deliberately devious. In similar vein, two is about education, three is about health and four is an attack on opponents. It might be possible to subclassify the attack into Iraq, climate, student fees, health, and tax. I do not expect a machine to “understand” the argument (simplistic though most of the politicians’ arguments are), but we should be able to categorise the statements as “liberals attack labour on Iraq”, “liberals attack labour on climate”, and so on.
Of course, given the excellent track record of mySociety and their usual suspects, they may already have done this. If not, maybe there should be a space on the site for uploading the text of the leaflets and an API for downloading the contents.
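
To make the classification idea concrete, here is a minimal sketch of a naive Bayes text classifier in Python. It is a stand-in for the kind of program I have in mind, not the API of any particular Java library. The class name, the label names and the test sentences are my own assumptions, and six training sentences is far too few for real use. Since this toy classifier gives each training sentence a single label, I have split the first leaflet statement into an “elderly” part and an “energy” part.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class LeafletClassifier:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.sentence_counts = Counter()         # label -> training sentences
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.sentence_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        """Return the most likely label, or None if no word is recognised."""
        known = [t for t in tokenize(text) if t in self.vocab]
        if not known:
            return None  # the "neither" case: nothing to go on
        total = sum(self.sentence_counts.values())
        scores = {}
        for label, n in self.sentence_counts.items():
            label_words = sum(self.word_counts[label].values())
            log_p = math.log(n / total)  # prior for this label
            for tok in known:
                count = self.word_counts[label][tok]
                log_p += math.log((count + 1) / (label_words + len(self.vocab)))
            scores[label] = log_p
        return max(scores, key=scores.get)


# Training sentences adapted from the leaflet text quoted in this post.
clf = LeafletClassifier()
clf.train("It's wrong that so many of Britain's old people can't afford "
          "to heat their homes properly.", "elderly")
clf.train("proper home insulation and lower heating bills", "energy")
clf.train("All children should get a fair start in life with the best "
          "possible education", "education")
clf.train("The Lib Dems would cut class sizes and ensure children in poor "
          "areas got a helping hand", "education")
clf.train("help lead the campaign to stop Brookfield Hospital from closure",
          "health")
clf.train("Labour's record of shame: hiding the truth about the invasion "
          "of Iraq", "attack")

# Hypothetical new sentences, not taken from any real leaflet.
print(clf.classify("Smaller class sizes for children in every school"))  # education
print(clf.classify("Save our hospital from closure"))                    # health
print(clf.classify("Vote early on Thursday"))                            # None
```

Subclassifying the attacks would simply mean training with finer labels such as “attack/iraq” or “attack/climate”, given enough example sentences for each.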


2 Responses to Can machines understand politicians?

  1. David Jones says:

    “Croats or scene” for crowdsourcing. Is it useful for people to point out the machine dictation mistakes?

