The "Found Haiku" program was a Perl program written by Don Marti back
in February of 2000. You can see its genesis at:

  http://zgp.org/linux-elitists/20000222164552.A4441@humulus.zgp.org.html

In early 2002, Ian Shuttleworth wrote to NTK to solicit contributions to
his "poetry found in spam" project. I thought Don's haiku-discovering
program would speed matters up, and wrote to Don to ask him for the
code. Don replied to say he'd lost the source of his haiku finder, but
pointed me to a pronunciation dictionary that could provide syllable
counts - the trickiest bit of writing a program to trawl English text
for 5-7-5 constructions.

Reassured that Don had done the hardest part, I've hacked up a Python
version.

INSTALLING HAIKU

It needs Python 2.2 and grep.

You'll also need the Carnegie Mellon Pronunciation Dictionary from
http://www.speech.cs.cmu.edu/cgi-bin/cmudict . The code looks for a file
called 'c06d', which you can get either by going to the above URL and
following the instructions or (and no guarantees that this will work)
by running:

  wget --passive-ftp -O - ftp://ftp.cs.cmu.edu/afs/cs.cmu.edu/data/anonftp/project/fgdata/dict/c06d.gz | zcat > /usr/local/share/c06d

USAGE

  haiku [ -c ] [ -s ] [ -p ] [ -h ] [ -d file ] input...

  -c = output only verses that begin with a capital letter
  -s = output only verses that end with a full stop (period)
  -p = do not guess the syllable count of unknown words (be precise)
  -h = output this helpfile
  -d = give the location of the c06d file (by default haiku checks
       '/usr/local/share/c06d', '/usr/share/dict/c06d', and 'c06d')

The program will find many, many "haiku" constructions in normal English
texts[1]. Most found verses make no sense. Verses ending in a full stop
often make sense. Verses beginning with a capital letter and ending with
a full stop very often make sense.

[1] Including this one.

BUGS AND TODOS

If haiku can't find a word in the dictionary, it uses Greg Fast's
Lingua::EN::Syllable heuristic (taken from his Perl module) to guess the
syllable length.
Greg specifically warns against using this in haiku-like scenarios. Not
that Damian Conway paid him any heed. (I think my implementation is
buggy actually, but that should only show up if you don't have the CMU
dictionary.)

It's an American pronunciation dictionary. You can't add to it.

haiku should download the dictionary file itself if you haven't got it.

There should be a Perl version, because not everyone (not anyone) has
Python 2.2 yet.

This documentation should be written in verse. And as a manpage.

Code should be generalised to clerihews.

This product may encourage the generation of joke haiku that contravene
http://www.phenry.org/junkdrawer/haiku/ (although it may also be seen as
a lesson in how simple joke haiku are to construct).

What haiku produces aren't really haiku. They're not even really senryu,
as far as I can gather.

UNDOCUMENTED FEATURES

On the other hand, Andy writes:

> Those haikus are about as natural as pepsi-cola. Just because something
> has seventeen syllables doesn't make it a haiku.
>
> Unless, apparently, you're a programmer, then you count up to seventeen
> and declare all the requirements for haiku satisfied.

I declare that while the 'haiku' script may fail in its apparent overt
aim, in its covert secondary aim - that of PISSING ANDY OFF - it is an
unalloyed success. Hoorah!

CONTACT

Mail me at with bugs, patches and suggestions.

EXAMPLES

Some samples found using 'haiku -c -s' from www.ntk.net, slightly
edited:

  "Hal turns against its human masters and tries to kill them," Kevin writes.
  Sadly, it's just the wrapper that goes blue when it's put into the fridge.
  Download it. It looks so beautiful doesn't it? So tasty. So sweet.
  So sweet. What a kind man that Mr Gates is, to give you it for free.
  Without getting too technical (as if we would), this is a bad thing.
  But good on T-shirts, fun to read, and cheap just like it says on the tin.
  Are we going to sell out? If we are, we are getting the hell out.
  Think of it as a test to see whether you were paying attention.

and, of course:

  NEED TO KNOW. THEY STOLE OUR REVOLUTION. NOW WE'RE STEALING IT BACK.
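
For the curious, the 5-7-5 search described at the top can be sketched
in a few lines of modern Python. This is not the haiku script itself:
the function names parse_cmudict and find_haiku, the tiny SAMPLE_DICT
and the punctuation handling are all invented for illustration. It
assumes the usual CMU layout (a word, then its phonemes, with every
vowel phoneme ending in a stress digit), so the syllable count is just
the number of phonemes that end in a digit. Unknown words are never
guessed here, which is roughly what -p asks for.

```python
import re

# A few entries in the assumed CMU layout: WORD, whitespace, phonemes.
SAMPLE_DICT = """\
## comment lines are skipped
A  AH0
A(2)  EY1
AGAIN  AH0 G EH1 N
AN  AE1 N
FROG  F R AA1 G
INTO  IH1 N T UW0
JUMPS  JH AH1 M P S
OLD  OW1 L D
POND  P AA1 N D
SILENCE  S AY1 L AH0 N S
SILENT  S AY1 L AH0 N T
SPLASH  S P L AE1 SH
THE  DH AH0
"""

def parse_cmudict(lines):
    """Map each lowercased word to a syllable count (hypothetical helper)."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        # Alternate pronunciations are assumed to look like "A(2)".
        word = re.sub(r"\(\d+\)$", "", word).lower()
        # Vowel phonemes end in a stress digit; keep the first entry seen.
        counts.setdefault(word, sum(p[-1].isdigit() for p in phonemes))
    return counts

def find_haiku(text, counts):
    """Yield (line1, line2, line3) for every 5-7-5 run of words in text."""
    words = text.split()
    # Look words up without surrounding punctuation; None marks an unknown.
    sylls = [counts.get(w.strip(".,;:!?\"()").lower()) for w in words]
    for start in range(len(words)):
        pos, verse = start, []
        for target in (5, 7, 5):
            total, begin = 0, pos
            while pos < len(words) and sylls[pos] is not None and total < target:
                total += sylls[pos]
                pos += 1
            if total != target:
                break  # overshot, met an unknown word, or ran out of text
            verse.append(" ".join(words[begin:pos]))
        else:
            yield tuple(verse)
```

To try it, build the table with parse_cmudict(SAMPLE_DICT.splitlines())
(or pass it an open c06d file) and hand the result to find_haiku along
with some text. Filters like -c and -s would then be simple checks on
the first character of line1 and the last character of line3.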