


Lengthy subject matter — email subject lines do seem to be getting longer!

I spent a little time over the weekend pursuing my theory that email subject lines have grown longer over time, based on the surprising terseness of subjects I observed in an old inbox of messages from 1998.

I have two ginormous corpora of outgoing and incoming email: one from about 1999 which fades out into 2007, and one picking up the slack from 2007 onwards. In total, they contain 2,732,487 messages. I measured the subject line length of each of these messages, threw that into a database along with their date, and plotted the average length for every day in the corpus.
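For the curious, here is a minimal sketch of what that measuring step could look like using only the Python standard library. It is not the code linked from this post: the table name `subjects`, its columns, and the helper names are assumptions made up for illustration, and the sample message is invented.

```python
import sqlite3
from email import message_from_string
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime


def subject_record(msg):
    """Return (ISO day, subject length in characters), or None if the date is unusable."""
    raw_date = msg.get("Date")
    if raw_date is None:
        return None
    try:
        sent = parsedate_to_datetime(str(raw_date))
    except (TypeError, ValueError):
        return None
    raw_subject = str(msg.get("Subject", ""))
    try:
        # Decode RFC 2047 encoded words such as =?utf-8?q?...?=
        subject = str(make_header(decode_header(raw_subject)))
    except (LookupError, ValueError):
        subject = raw_subject
    # Collapse the whitespace left behind by folded header lines.
    subject = " ".join(subject.split())
    return sent.date().isoformat(), len(subject)


def store_lengths(messages, conn):
    """Insert one row per usable message; the schema is hypothetical."""
    conn.execute("CREATE TABLE IF NOT EXISTS subjects (day TEXT, length INTEGER)")
    rows = [rec for rec in map(subject_record, messages) if rec is not None]
    conn.executemany("INSERT INTO subjects VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def daily_averages(conn):
    """Average subject length for every day that has at least one message."""
    query = "SELECT day, AVG(length) FROM subjects GROUP BY day ORDER BY day"
    return conn.execute(query).fetchall()


# Made-up message, for demonstration only.
sample = message_from_string(
    "Date: Mon, 02 Feb 1998 10:00:00 +0000\nSubject: lunch?\n\nSee you at noon.\n"
)
conn = sqlite3.connect(":memory:")
store_lengths([sample], conn)
print(daily_averages(conn))
```

For a real archive, the standard library's `mailbox.mbox(path)` or `mailbox.Maildir(path)` should give you an iterable of messages to pass to `store_lengths`.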

Ta da!

The trend line suggests that subject lines have indeed been lengthening by an average of 1.2 characters a year since 1999.
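If you want a characters-per-year figure from your own data, one plausible approach is an ordinary least-squares line through the daily averages. The sketch below assumes the input is a list of (ISO date, average length) pairs. The function name and the two data points are invented for illustration, and this is not necessarily how the graph's trend line was fitted.

```python
from datetime import date


def trend_per_year(daily):
    """Least-squares slope of average length against time, in characters per year."""
    # Express each day as a fractional year count so the slope is per year.
    xs = [date.fromisoformat(day).toordinal() / 365.25 for day, _ in daily]
    ys = [avg for _, avg in daily]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two days to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all observations fall on the same day")
    return sxy / sxx


# Invented data points, for demonstration only.
example = [("1999-01-01", 30.0), ("2000-01-01", 31.2)]
print(round(trend_per_year(example), 2))
```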

What’s going on here? Is it just me? Is it just my email correspondents who are getting more long-winded? Did I make a mistake? If you’re curious, you can check my working, and try it out on your own emails: here’s the code I used to create the graph above. If you have email archives of your own to measure (and speak a little Python), that code can slurp up your email from mbox or Maildir archives, or from a notmuchmail database, and plot the results using matplotlib. You can also try out different processes, or check my conclusions, with the 115MB sqlite database of my own subject line lengths, available for download, or as a torrent.

Some potential explanations I’ve been mulling. It could be an artifact of a growth in mailing list traffic, whose subject lines are prefixed by a mailing list name (i.e. “[mylist] hello everyone”). I could check that by removing square brackets or mails with mailing list headers.
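Both checks are easy to sketch. The version below is only one guess at how to do it: it strips bracketed tags at the very start of a subject (so a tag after “Re:” would survive), and it treats a message as list mail if it carries a `List-Id` or `List-Unsubscribe` header. The function names are made up for illustration.

```python
import re

# One or more "[tag]" groups at the start of the subject, with trailing spaces.
LEADING_TAGS = re.compile(r"^\s*(?:\[[^\]]*\]\s*)+")

# Standard headers that mailing list software commonly adds.
LIST_HEADERS = ("List-Id", "List-Unsubscribe")


def strip_list_tags(subject):
    """Remove leading bracketed tags such as "[mylist] " from a subject line."""
    return LEADING_TAGS.sub("", subject)


def is_list_mail(msg):
    """True if the message (anything with a .get method) has a list header."""
    return any(msg.get(name) is not None for name in LIST_HEADERS)


print(strip_list_tags("[mylist] hello everyone"))
```

Measuring `len(strip_list_tags(subject))` instead of `len(subject)`, or skipping messages where `is_list_mail` is true, would show whether the upward trend survives without the list traffic.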

It could be a rise in marketing email (though probably not spam), which might have different characteristics from artisanally-crafted individual emails. Literature survey time! People who send mass marketing email really care about subject line length (or at least the people who market to email marketers like to write about it, when they’ve run out of other things to write about). One of these studies, which analysed 9 million emails sent in February 2015, let drop that their average subject line length was 41-50 characters, which seems to suggest that marketing mail at that moment in time matched my average, or was maybe slightly shorter. (The most conscientious of these marketing marketeers, incidentally, conclude that subject line length makes no difference to email open rates.)

It might be related to growing screen sizes. If you have more horizontal space to type a subject line, you might tend to stuff more into it. I should compare it with this browser screen resolution dataset from statcounter. It’d be hard to make a causal connection from them both rising at the same time, but there may be some discontinuities in average screen size that might correlate with, for example, whatever weirdness is happening in the subject line in 2009-2011 (could just be my weird data, though). If monitor size is a factor it’s surprising that the rise of mobile hasn’t slowed the curve though. Or maybe it has: I could filter out mobile messages and see if they’re shorter.

Finally, it could be that people just say more in email subject lines these days. Not sure how you’d check that specific factor: it would be good to confirm that, say, word length was going up also. Odd that it’s such a consistent process though.
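One rough way to tell “more words” apart from “longer words” would be to record both numbers for each subject alongside the character count. This is a small sketch under that assumption; the function name and the returned keys are invented, and splitting on whitespace is a crude stand-in for real tokenising.

```python
def subject_stats(subject):
    """Character count, word count and mean word length for one subject line."""
    words = subject.split()
    count = len(words)
    mean_word_length = sum(len(w) for w in words) / count if count else 0.0
    return {
        "chars": len(subject),
        "words": count,
        "mean_word_length": mean_word_length,
    }


print(subject_stats("hello everyone"))
```

If the word count climbs year on year while the mean word length stays flat, that would point to people saying more rather than simply using longer words.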

What are your theories?

One Response to “Lengthy subject matter — email subject lines do seem to be getting longer!”

  1. Paul Says:

    I think it’s because of adding “You won’t believe what happens next!” to all subject lines from about 2006 onwards……

    That and desperation to try and get people to open emails instead of automatically filing or tossing them in the bin.

