The lightning talks worked very well at OSCON (and proved hilariously stereotypical: the Python talks were well-ordered to a Netherlandish extent, the Emerging Tech ones were largely performed by people with brightly dyed hair, and the Perl talks were even more ADHD than you’d imagine). At the end-of-conference press overview, Nat said he was going to go for lightning keynotes next year: 800 pundits in half an hour.
The five-minute talk that surprised me the most (apart from the Chinese rap) was Blog Census. Maciej Ceglowski has been writing blog-recognition software and has spidered out to pick up over 600,000 blogs. Not only has he been collecting the URLs and various global stats, he’s been archiving the entries, too. You can download everything he has: the list of URLs, the language, blog tool and number of incoming links, the current HTML cache. If you want it all, you’ll need to download around three gigs.
As he said in the talk, there’s loads you could do with this data, “but I’m not imaginative enough to think of it”. Nonetheless, he’s already found out some fascinating stuff. If his language-guessing algorithms are right, over 1% of Icelanders have a blog; Poles, Brazilians and Iranians love ’em, but most of South America and Spain are nowhere to be seen.
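For a sense of how little code it would take to start poking at a dump like this, here's a rough sketch. Everything about the file layout is my assumption, not the Blog Census format: I'm pretending each line is tab-separated as URL, language code, blog tool, incoming links, and the sample rows and function names are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical layout: url <TAB> language <TAB> blog tool <TAB> incoming links.
# The real Blog Census files may be arranged quite differently.
SAMPLE = (
    "http://example.org/a\tis\tBlogger\t12\n"
    "http://example.org/b\tpl\tMovable Type\t3\n"
    "http://example.org/c\tis\tBlogger\t0\n"
    "http://example.org/d\tpt\tBlogger\t7\n"
)


def blogs_per_language(lines):
    """Count how many rows carry each language code."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 4:
            continue  # skip short or malformed rows
        counts[row[1]] += 1
    return counts


def top_linked(lines, n=3):
    """Return the n (url, incoming_links) pairs with the most incoming links."""
    ranked = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 4:
            continue
        try:
            ranked.append((row[0], int(row[3])))
        except ValueError:
            continue  # link count was not a number; ignore the row
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    print(blogs_per_language(io.StringIO(SAMPLE)).most_common())
    print(top_linked(io.StringIO(SAMPLE), n=2))
```

Swap the `io.StringIO(SAMPLE)` for an open file handle and the same two functions would chew through a real list, assuming the columns line up the way I've guessed.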
He’s begging people to do something with this, and yet I haven’t heard a single mention of it on the blogs I read. Hell, there’s even an XML-RPC interface. How much more meme-worthy can you get?
I guess it’s a crowded market, what with blogdex, technorati and organica. But this is an academic project, and it’s open. Raw, crunchy downloadable data. At the very least, I bet the

This entry was posted on Friday, July 11th, 2003 at 4:42 pm.