Mining Enron's E-Mail
In our recent podcast called “The Dilbert Index,” we explored the idea of workplace morale. A recent study by Eric Gilbert, “Phrases That Signal Workplace Hierarchy,” provides an interesting window into who says what within firms, and why. From the abstract:
Hierarchy fundamentally shapes how we act at work. In this paper, we explore the relationship between the words people write in workplace email and the rank of the email’s recipient. Using the Enron corpus as a dataset, we perform a close study of the words and phrases people send to those above them in the corporate hierarchy versus those at the same level or lower. We find that certain words and phrases are strong predictors. For example, “thought you would” strongly suggests that the recipient outranks the sender, while “let’s discuss” implies the opposite. We also find that the phrases people write to their bosses do not demonstrate cognitive processes as often as the ones they write to others. We conclude this paper by interpreting our results and announcing the release of the predictive phrases as a public dataset, perhaps enabling a new class of status-aware applications.
An email that uses the word “weekend,” for example, is likely to be addressed to a superior. The same is true of “attach,” “that night,” “tiger” and “shit.” Here are the top five predictors in each group – upward means the recipient of the email outranks the sender, and downward the opposite:
Top 5 Upward Predictors
- the ability to
- I took
- are available
- kitchen
- thought you would
Top 5 Downward Predictors
- have you been
- you gave
- we are in
- title
- need in
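For the curious, here is a toy sketch of how lists like these could be put to work. It is not Gilbert's model, which is a trained statistical classifier over a much larger set of phrases; it simply counts how many of the ten phrases above appear in a message and guesses a direction from the tally. The function names `count_phrases` and `guess_direction` are made up for illustration.

```python
import re

# The ten phrases from the two top-five lists above, lowercased.
UPWARD = ["the ability to", "i took", "are available", "kitchen", "thought you would"]
DOWNWARD = ["have you been", "you gave", "we are in", "title", "need in"]


def count_phrases(text, phrases):
    """Count whole-word, case-insensitive occurrences of the phrases in text."""
    lowered = text.lower()
    total = 0
    for phrase in phrases:
        # \b keeps "title" from matching inside words such as "entitled".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        total += len(re.findall(pattern, lowered))
    return total


def guess_direction(text):
    """Naive guess (an assumption, not the paper's method): more hits wins."""
    up = count_phrases(text, UPWARD)
    down = count_phrases(text, DOWNWARD)
    if up > down:
        return "upward"
    if down > up:
        return "downward"
    return "unclear"


if __name__ == "__main__":
    print(guess_direction("I thought you would like to know the rooms are available."))
    print(guess_direction("Have you been able to get what we need in the report?"))
```

A raw count like this ignores how strongly each phrase predicts rank, so treat it as a way to play with the idea rather than a working status detector.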
Keep in mind these findings are from the body of Enron e-mails, which may or may not be remotely typical. Gilbert notes: “the models build on data from a profoundly dishonest company which ultimately fell apart. At the same time, the Enron email corpus is without parallel in the research community. Nowhere else can you find such a rich, complex and naturally occurring email dataset.”
And, while Gilbert doesn’t comment on what e-mail language says about employee morale per se, it’s not hard to imagine that a similar study could try to do so. If you were authoring that study — and perhaps you are — what do you suspect you might find as indicators of low, or high, morale?