I am at the Mathematical Knowledge Management 2007 conference and having a wonderful time. At present Neil Sloane is talking about his marvellous On-Line Encyclopedia of Integer Sequences, a collection of every known (and voluntarily communicated) sequence. For example, what is the next term in:
and over 100,000 more
- 1964: started
- 1973: 2,500 seqs
- 1995: 1 m³ of mail
- 1995: 5,500 seqs
- 1996: 10,000 seqs
- 2007: 131,000 seqs
A large volunteer community – with a virtual party for the 100,000th sequence. One volunteer has set the sequences to music (try “Listen” – this is sometimes great fun).
2000000 lines of flat file, 120M edited with emacs – total 450 Mbytes including all info. 10K seqs/year, 30 comments/day, 600 emails/day
shortest seq 76337 (1 term)
longest a27 500,000 terms (natural numbers) – this raised a laugh. The point of including it is that you can plot one sequence against another.
It is used to find out everything about a sequence. One difficulty is distinguishing between conjectures and proofs.
There is a sequence fans mailing list, but the database is (rightly) restricted to a set of trusted editors. Ultimately almost all of the work is done by Neil.
200 examples of sequence pairs which have identical content but are different sequences.
I find it wonderful that one can search Google with a sequence and it will hit Neil’s site (if it’s in the database). It’s one of the few digital objects where the content acts as an index in Google. InChI is another.
It’s fascinating. Every sequence that comes in is run through over 100 transforms to see what its structure might be. It’s a real living ecology of digital objects. Neil has shown us how datamining the database – comparing sequences with each other – has resulted in new mathematical theorems. Wouldn’t it be wonderful if chemical information was available for datamining and not owned by commercial interests?
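
To make the transform-and-compare idea concrete, here is a minimal sketch in Python. It is my own illustration, not Neil's code. The talk did not list the transforms, so the three used here (first differences, partial sums, binomial transform) are assumptions chosen because they are standard, and `KNOWN` is a hypothetical in-memory stand-in for the real database.

```python
from itertools import accumulate
from math import comb


def first_differences(seq):
    """Differences between consecutive terms."""
    return [b - a for a, b in zip(seq, seq[1:])]


def partial_sums(seq):
    """Running totals of the terms."""
    return list(accumulate(seq))


def binomial_transform(seq):
    """b(n) = sum over k of C(n, k) * a(k)."""
    return [sum(comb(n, k) * seq[k] for k in range(n + 1))
            for n in range(len(seq))]


# Assumed set of transforms; the real system reportedly applies over 100.
TRANSFORMS = {
    "first differences": first_differences,
    "partial sums": partial_sums,
    "binomial transform": binomial_transform,
}

# Hypothetical miniature "database" of well-known sequences.
KNOWN = {
    "natural numbers": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "squares": [0, 1, 4, 9, 16, 25, 36, 49, 64, 81],
    "odd numbers": [1, 3, 5, 7, 9, 11, 13, 15, 17, 19],
    "powers of 2": [1, 2, 4, 8, 16, 32, 64, 128, 256, 512],
}


def find_matches(seq, min_terms=5):
    """Apply each transform and report which known sequences the result
    agrees with on every overlapping leading term."""
    matches = []
    for transform_name, transform in TRANSFORMS.items():
        transformed = transform(seq)
        for known_name, known in KNOWN.items():
            n = min(len(transformed), len(known))
            if n >= min_terms and transformed[:n] == known[:n]:
                matches.append((transform_name, known_name))
    return matches


if __name__ == "__main__":
    print(find_matches([0, 1, 4, 9, 16, 25, 36, 49]))
    print(find_matches([1] * 8))
```

Feeding in the squares reports that their first differences are the odd numbers, and feeding in the all-ones sequence reports that its partial sums are the natural numbers and its binomial transform gives the powers of 2. A match on a handful of leading terms is only a hint, which is exactly where the conjecture-versus-proof difficulty comes in.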