A New Method to the Freakonomics Madness

Our regular contributors on this blog write pretty much whatever they want with only very occasional input from Levitt or me. I can’t recall a single instance where we had an issue over whether or not to publish something. But there’s always a first. We considered spiking the following post by Ian Ayres since it could cause us some substantial embarrassment. Ultimately, we decided to let it stand. Just keep in mind that we had nothing whatsoever to do with it — although I will say that Ayres is a helluva book critic. — SJD

I eagerly awaited and quickly devoured SuperFreakonomics when it appeared a few weeks ago. And while many reviewers are focusing on the substance of the book, I’m struck by two shifts in the Levitt/Dubner method.

First, SuperFreakonomics is more of an effort at problem solving. The original Freakonomics book showed how creative econometrics applied to historic data could be used to uncover the “hidden” causes of observed behavior. To be sure, SuperFreakonomics retains many examples of the hidden-side-of-everything data mining. But the new book is much more of a solutions book. It uses economic thinking to generate new ideas to solve really big problems. Levitt and Dubner are admirably leveraging the success of the first book to try to make the world a better place. They are on the lookout for concrete suggestions to reduce the lives lost from hurricanes, hospital infections, global warming, automobile accidents and even walking drunk.

In the original book, number crunching itself was the solution. Forensic number crunching could help identify whether Sumo wrestlers had thrown a match or whether Chicago teachers were cheating on test scores. In the new book, number crunching is instead used to verify that a particular solution (such as hand-washing or ocean cooling) is likely to work.

The Randomization Lens

The second methodological shift is subtler. The first book focused on historical data. For example, a core story of the original book looked at data on crime and abortion. In a truly inspired moment, Levitt and his coauthor John Donohue were able to show that legalizing abortion reduced crime — 18 years later. Mining historic data can produce truly startling results.

But a higher proportion of the new book is devoted to studies that use randomized field experiments to find out what causes what. If you want to know whether offering donors a two-for-one matching grant produces more charitable donations than a one-for-one grant, you randomly assign potential donors to receive one of these two solicitations and then look to see whether the two groups give different amounts.
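The logic of that matching-grant experiment is simple enough to sketch in a few lines of Python. This is purely illustrative and not from the book: the donor pool, the response model, and the effect sizes below are all invented.

```python
import random
import statistics

random.seed(0)

# Hypothetical donor pool: each donor has some baseline propensity to give.
donors = [random.uniform(0, 100) for _ in range(10_000)]

# Randomly assign each donor to one of the two solicitation "treatments".
one_for_one, two_for_one = [], []
for propensity in donors:
    if random.random() < 0.5:
        # Invented response model: a 1:1 match nudges giving up a little.
        one_for_one.append(propensity * 1.05)
    else:
        # Invented response model: a 2:1 match nudges it up a bit more.
        two_for_one.append(propensity * 1.08)

# Because assignment was random, the two groups are comparable,
# so the difference in mean gifts estimates the causal effect.
effect = statistics.mean(two_for_one) - statistics.mean(one_for_one)
print(f"estimated effect of 2:1 vs 1:1 match: {effect:.2f}")
```

The point of the random assignment is that the two groups are alike in expectation on everything except the treatment, so the difference in means needs no further statistical adjustment to be read causally.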

One sign of the shift toward randomization is the prominence of John List and his rise to fame in the economics profession. John is one of the great field experimenters in economics today. He’s the kind of guy who goes to baseball card shows and at random treats one set of card dealers differently from another and then sees whether they offer different prices. (You can read an excerpt of the book’s discussion of List here).

SuperFreakonomics not only relates the results of more randomized experiments than Freakonomics did, it also explains how the idea of randomized experiments is leading statisticians to think more clearly about how to use regression analysis to test for causal effects with historic data. There is a new zeitgeist in the way economists think about running regressions. Today, statistical economists explicitly think of their regressions in terms of randomized experiments. They think of the variable of interest as the “treatment” and ask themselves what kind of assumptions they need to make or what kind of statistical procedures they need to run on the historic data to emulate a randomized study. This new way of thinking is very much on display in the truly excellent (but technically demanding) book, Mostly Harmless Econometrics: An Empiricist’s Companion, by Joshua Angrist and Jörn-Steffen Pischke. (I praised the book in a previous post because it “captures the feeling of how to go about trying to attack an empirical question….”). For example, Angrist and Pischke show that the regression-discontinuity design (which I’ll say more about in a later post) provides causal inference from historic correlation because it emulates randomized assignment of a treatment to otherwise similar subjects.
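To make the regression-discontinuity idea concrete, here is a toy sketch of my own (not from Angrist and Pischke): the cutoff, the treatment effect, and the noise are all invented. Subjects just above and just below a treatment cutoff are otherwise similar, so comparing their average outcomes emulates random assignment.

```python
import random
import statistics

random.seed(1)

# Hypothetical setting: a program (the "treatment") goes to everyone whose
# score crosses a cutoff. Near the cutoff, which side you land on is as
# good as random -- that is what licenses the causal inference.
CUTOFF = 50.0
TRUE_EFFECT = 2.0  # invented treatment effect on the outcome

def outcome(score: float) -> float:
    treated = score >= CUTOFF
    # The outcome drifts smoothly with the score, plus a jump at the cutoff.
    return 0.1 * score + (TRUE_EFFECT if treated else 0.0) + random.gauss(0, 0.5)

scores = [random.uniform(0, 100) for _ in range(50_000)]

# Compare average outcomes in a narrow band on either side of the cutoff:
# subjects there are "otherwise similar", so the gap estimates the effect.
BAND = 2.0
just_above = [outcome(s) for s in scores if CUTOFF <= s < CUTOFF + BAND]
just_below = [outcome(s) for s in scores if CUTOFF - BAND <= s < CUTOFF]
estimate = statistics.mean(just_above) - statistics.mean(just_below)
print(f"RD estimate of the treatment effect: {estimate:.2f}")
```

Real RD work fits local regressions on each side rather than raw means (the smooth drift in the score otherwise leaks into the estimate), but the comparison-at-the-cutoff logic is the same.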

What Economists Would Really Like To Do

SuperFreakonomics very much reflects this new randomization lens as a way of thinking about data-mining. Without off-putting jargon, Levitt and Dubner explain how regressions can give you quasi-experimental results. Indeed, with help from my Kindle, I found three parallel descriptions that turn on making the randomization analogy. For example, listen to how they describe testing for sex discrimination on the job:

Economists do the best they can by assembling data and using complex statistical techniques to tease out the reasons why women earn less than men. The fundamental difficulty, however, is that men and women differ in so many ways. What an economist would really like to do is perform an experiment, something like this: take a bunch of women and clone male versions of them; do the reverse for a bunch of men; now sit back and watch. By measuring the labor outcomes of each gender group against their clones, you could likely gain some real insights. Or, if cloning weren’t an option, you could take a bunch of women, randomly select half of them, and magically switch their gender to male, leaving everything else about them the same, and do the opposite with a bunch of men. Unfortunately, economists aren’t allowed to conduct such experiments. (Yet.)

They go on to describe how, in the absence of randomized data, some (limited) progress might be gleaned by looking at the historic experience of transgendered people — before and after sex reassignment surgery. They take a similar approach when tackling the question of testing physician quality:

What you’d really like to do is run a randomized, controlled trial so that when patients arrive they are randomly assigned to a doctor, even if that doctor is overwhelmed with other patients or not well equipped to handle a particular ailment. But we are dealing with one set of real, live human beings who are trying to keep another set of real, live human beings from dying, so this kind of experiment isn’t going to happen, and for good reason.

Since we can’t do a true randomization, and if simply looking at patient outcomes in the raw data will be misleading, what’s the best way to measure doctor skill? Thanks to the nature of the emergency room, there is another sort of de facto, accidental randomization that can lead us to the truth.

The “next in line” queue at some emergency rooms provides quasi-random assignment and allows researchers to emulate the results of a randomized trial. The magic “really like to do” words appear a third time when Levitt and Dubner talk about testing whether more incarceration would really lower the crime rate:

To answer this question with some kind of scientific certainty, what you’d really like to do is conduct an experiment. Pretend you could randomly select a group of states and command each of them to release 10,000 prisoners. At the same time, you could randomly select a different group of states and have them lock up 10,000 people, misdemeanor offenders perhaps, who otherwise wouldn’t have gone to prison. Now sit back, wait a few years, and measure the crime rate in those two sets of states. Voila! You’ve just run the kind of randomized, controlled experiment that lets you determine the relationship between variables.

Unfortunately, the governors of those random states probably wouldn’t take too kindly to your experiment. Nor would the people you sent to prison in some states or the next-door neighbors of the prisoners you freed in others. So your chances of actually conducting this experiment are zero.

That’s why researchers often rely on what is known as a natural experiment, a set of conditions that mimic the experiment you want to conduct but, for whatever reason, cannot. In this instance, what you want is a radical change in the prison population of various states for reasons that have nothing to do with the amount of crime in those states. Happily, the American Civil Liberties Union was good enough to create just such an experiment.

The methodological repetition across these examples is one of the book’s strengths. This is really the way that many empirical economists talk to themselves about testing. Regardless of the problem, we often now start with the same basic question.

One of the great early stories from SuperFreakonomics is the finding that “even after factoring in the deaths [of innocent bystanders from drunk driving], walking drunk leads to five times as many deaths per mile as driving drunk.” The substantive fact is not only surprising, but the story also metaphorically foreshadows the book’s new emphasis on experimental approaches. After all, what makes a drunkard’s walk so dangerous is that the drunkard lurches from side to side randomly.
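As an aside, the “drunkard’s walk” of the metaphor is the classic random walk, whose expected distance from the start grows like the square root of the number of steps. A quick simulation (mine, not the book’s) shows the scaling: quadrupling the number of lurches roughly doubles the average displacement.

```python
import random
import statistics

random.seed(2)

def displacement(steps: int) -> float:
    """Distance from the start after a 1-D random walk of unit lurches."""
    position = 0
    for _ in range(steps):
        position += random.choice((-1, 1))
    return abs(position)

# Average displacement over many walks grows roughly like sqrt(steps):
# 4x the steps should give about 2x the distance.
avgs = {}
for steps in (100, 400, 1600):
    avgs[steps] = statistics.mean(displacement(steps) for _ in range(2000))
    print(steps, round(avgs[steps], 1))
```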

[Addendum: The “embarrassment” in the introduction was meant as a wink — because while it’s customary for this blog to link to positive reviews of our books, it’s another thing to actually run one on the blog. Sorry for any confusion. — SJD]



Maybe I'm just dense. How is this embarrassing?

Someone is commenting on your analysis and methodology. What would you expect from the academics and other assorted eggheads that read your book?


Superfreakonomics makes an appearance in today's WSJ.

Quoting Bret Stephens:

"Grandiosity: In "SuperFreakonomics," Steve Levitt and Stephen Dubner give favorable treatment to an idea to cool the earth by pumping sulfur dioxide into the upper atmosphere, something that could be done cheaply and quickly. Maybe it would work, or maybe it wouldn't. But one suspects that the main reason the chapter was the subject of hysterical criticism is that it didn't propose to deal with global warming by re-engineering the world economy. The penchant for monumentalism is yet another constant feature of the totalitarian mind."

Bret refers to the Romm-instigated smear campaign against Levitt and Dubner, which was the subject of an earlier post on this site.


1 -- I was wondering the same thing. My best guess is that they were worried about the attention being drawn to the fact that they'd "like to" do experiments that would generally be considered "inhumane," even though I don't think a reasonable person would understand the authors' intent and not be offended.


Sorry -- I DO think a reasonable person would understand the authors' intent.


Did the drunken walk experiment include the known fact that the distance traveled in a drunken walk is proportional to the square root of the time spent traveling?


At #1, I think L&D are somewhat embarrassed because Ayres praises their book, and the authors don't want to seem like they're just using their blog to promote book sales.


It might be embarrassing because economists face exactly the same problems as other social scientists when evaluating and analyzing data in the hope of getting at the true nature of causation.

This is at the expense of trying to retain economics as the "first" social science. It is no more an exception to the problems that plague social sciences than psychology, sociology, or anything else that falls under that umbrella.


Perhaps it was the quantum fear that their mere presence would skew the objectivity of any book review published by one of their contributing editors.

Mike K.

I'm confused, too. Unless you were reluctant to publish this out of modesty, I don't understand the hesitation.


Drunks on foot may very well cause more deaths than drunks on wheels. But to conclude drunks should drive is irresponsible. To celebrate the discovery of this statistical fact strikes me as sophomoric pedantry at its worst. To attribute it to random lurches is ludicrous.

If that's what we can expect from the new economics I'd say the future looks bleak indeed.

Walter Wimberly

Unfortunately the "walking drunk" reference seems to be typical of researchers looking at one piece of info only. (From what I can tell - I'm eagerly awaiting the book for Christmas.) It ignores that drunk walkers don't travel as far as drunk drivers, and from a society view - people tend to not feel as bad when something happens to the drunk as they do when an innocent bystander gets hurt.

If you don't believe me, think about which would make headline news in your local paper "Guy Walking Drunk Killed By Car" or "Drunk Driver Strikes Little Girl Playing In Yard". Right or wrong, society does not believe that all life is equal.


Actually, I think Dubner meant that they could have been embarrassed by approving the topic of the post before the post was written (which Dubner & Levitt did). It turned out that the post wasn't so embarrassing after all.


Dubner seems to have a mighty low bar for embarrassment. This is a favorable review.

Joe Smith

Why is that embarrassing? What am I missing?

It seems to me that he is saying that rather than simply repeat the old book with new examples, you decided to move on to more interesting and important problems. At that pace, the third book is going to be a humdinger.


Are you applying randomization to deciding which posts to spike? I don't see anything embarrassing about what Mr Ayres writes. Maybe you *think* he is accusing you of really wanting to run those experiments? But he quotes enough of the book to show that this is not the case.


After reading Jeremy's post, I think I now understand the potential embarrassment. "What we'd like to do is put a bunch of people in an ice bath and see how long they survive.'' If a medical researcher were to say that, it would mean that this would really be a good way to get the information on hypothermia that we're looking for, not that he would for a moment actually want to do it. But in the case of a natural disaster (e.g., Titanic) he would certainly want to make use of any evidence collected.


Looking at the incentives of this situation, it seems more like you wanted to nix this post because it contains some spoilers about the book, rather than because it's "embarrassing" in any way.


You guys are thinking too hard about this.
Generally, anyone with a little professional modesty would find it a little embarrassing for a colleague to give as positive a review of one's work as this. That such praise appears in one's own blog would also suggest a very transparent attempt to shill the book and, despite the note at the beginning of the post, will likely still attract such accusations regardless.


11 - I believe what is said in the book is that drunk walking results in five times more deaths PER MILE, so length of journey is actually taken into account.

Also, the drunken lurch wasn't pinpointed as the cause; I think the cause has more to do with a drunkard's lack of awareness and slower means of travel. Slower travel means more time to stumble into traffic on the way home.

They do NOT champion drunk driving in Superfreakonomics. Rather, they bring attention to the fact that walking home drunk is not the safe alternative one may think it is. They mention that taking a cab, using a designated driver, or not drinking too much in the first place are all much safer ways of getting home.


I haven't yet read the book, so maybe this is addressed therein. But, obviously people don't walk as many miles as they drive. It might be more informative to compare deaths per hour driving drunk vs. deaths per hour walking drunk.