Prediction Markets vs. Super Crunching: Which Can Better Predict How Justice Kennedy Will Vote?

One of the great unresolved questions of predictive analytics is when prediction markets will produce better predictions than good old-fashioned mining of historical data. I think there is fairly good evidence that either approach tends to beat the statistically unaided predictions of traditional experts.

But what is still unknown is whether prediction markets dominate statistical prediction. (Freakonomics co-blogger Justin Wolfers is, in a sense, on both sides of this debate: Justin is one of the best crunchers of historical data, and he is also at the cutting edge of exploiting the results of prediction markets.)

Thanks to Josh Blackman, we are about to have a test of these two competing approaches. Blackman has organized a cool Supreme Court fantasy league, where anybody can make predictions about how Supreme Court justices will vote on particular cases. The aggregate prediction of the league members is powerful “wisdom of the crowds” information.
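A simple majority vote over league members is the most basic way to compute such a "wisdom of the crowds" prediction. Here is a minimal Python sketch of that aggregation; the function name and the vote counts are hypothetical, chosen only for illustration:

```python
from collections import Counter

def aggregate_prediction(votes):
    """Return the crowd's majority prediction and its share of the vote.

    `votes` is a list of individual predictions such as "affirm" or
    "reverse"; the crowd's answer is simply the most common one.
    """
    counts = Counter(votes)
    prediction, n = counts.most_common(1)[0]
    return prediction, n / len(votes)

# Hypothetical league votes for a single case:
votes = ["affirm"] * 123 + ["reverse"] * 144
prediction, share = aggregate_prediction(votes)
print(prediction, round(share, 2))  # reverse 0.54
```

More sophisticated aggregations (weighting by each member's track record, say) are possible, but the league's headline numbers are just this kind of vote share.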

And it is natural to ask whether the predictions of the league are more accurate than the predictions of a statistical algorithm developed by Andrew D. Martin, Kevin M. Quinn, Theodore W. Ruger, and Pauline T. Kim. I wrote about their study in my book Super Crunchers (you can read the excerpt about the study from the Financial Times here).

On a “Prediction Tools” website, I even created a Java applet based on the study where you can generate your own predictions of how Justice Kennedy will vote:


That’s right. For a particular case before the court, just plug in answers for the six questions (such as “the ideological direction of lower court decision”) and the applet will predict whether Kennedy will affirm or reverse the lower court opinion.
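The Martin–Quinn–Ruger–Kim study modeled the justices with classification trees built from a handful of case features like these. As a rough illustration of how such a tree turns six answers into an affirm/reverse call, here is a toy Python sketch; the branch structure, feature names, and outcomes below are invented for illustration and are not the study's actual tree:

```python
def predict_kennedy(case):
    """Toy classification tree in the spirit of the study's model.

    `case` is a dict of answers to case questions, e.g.
    "lower_court_direction" ("liberal"/"conservative") and
    "respondent" ("federal_government", "individual", ...).
    The branches below are illustrative assumptions only.
    """
    if case["lower_court_direction"] == "liberal":
        # Illustrative branch: reverse liberal lower-court decisions
        # unless the respondent is the federal government.
        if case["respondent"] == "federal_government":
            return "affirm"
        return "reverse"
    # Illustrative default for conservative lower-court decisions.
    return "affirm"

example = {"lower_court_direction": "liberal", "respondent": "individual"}
print(predict_kennedy(example))  # reverse
```

The appeal of a tree like this is transparency: every prediction can be traced back to a short chain of yes/no questions, which is exactly what the applet walks you through.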

Looking at four cases before the current court, Josh has compared the statistical predictions of the applet to the initial aggregate predictions from his fantasy league:

The first case we consider is Maryland v. Shatzer, which considers whether or not police are barred from questioning a criminal suspect who has invoked their right to counsel when the interrogation takes place nearly three years later. … The second case we consider is U.S. v. Stevens, which considers whether a statute banning depictions of animal cruelty is facially invalid under the Free Speech Clause of the First Amendment. … The third case we consider is Bloate v. U.S., which considers whether additional time granted at the request of a defendant to prepare pretrial is excludable from the time within which trial must commence under the Speedy Trial Act. … The fourth case we consider is Salazar v. Buono, which considers whether an individual has Article III standing to bring an Establishment Clause suit challenging the display of a religious symbol on government land and if an Act of Congress directing the land be transferred to a private entity is a permissible accommodation.

How do the members think Justice Kennedy will vote? Predictions of the 10th Justice after the jump:

In Maryland v. Shatzer, 46 percent (123 out of 267 voting members) agreed with the program, and predicted that Justice Kennedy would vote to affirm the lower court.

In U.S. v. Stevens, 83 percent (168 out of 201 voting members) agreed with the program, and predicted that Justice Kennedy would affirm the Third Circuit. While predictions for Maryland v. Shatzer produced weaker results, a stronger agreement in this situation may indicate that certain criteria are clearer predictors of behavior and observers of the court pick up on them much more easily.

In Bloate v. U.S., 76 percent (61 out of 80 voting members) agreed with the program, and predicted that Justice Kennedy would affirm the Eighth Circuit.

In Salazar v. Buono, only 45 percent (48 out of 106 voting members) agreed with the program, and predicted that Justice Kennedy would affirm the Ninth Circuit. While the difference between the two predictors is murky, FantasySCOTUS predictions are much more flexible since they are not subject to the “category” constraints the program uses and would probably be the more accurate indicator in this situation.

Thanks to Josh’s creation, we’ll be able to sit back — paying particular attention to instances of disagreement — and see over time which approach makes the better predictions. This single experiment will not, by itself, resolve the larger “which is better” debate, in part because I could imagine putting forward stronger market-based and statistical-based predictions. The fantasy league predictions would probably be more accurate if market participants had to actually put their money behind their predictions. And the statistical predictions could probably be improved if they relied on more recent data and controlled for more variables.

But we are bound to see more meta-methodological comparisons like these in the years to come — which will also shed light on whether market participants will learn to efficiently incorporate the results of statistical prediction into their own assessments. At the moment, individual decision-makers tend to improve their predictions when given statistical aids, but they still wave off the statistical prediction too often.

Jayson Virissimo

Ian, you should try contacting Robin Hanson. He is a "traditional expert" on this subject.


It's an interesting contest, but I'm curious, is the question "Which is better?" really an answerable one? How would you go about proving or disproving that data mining works better than predictive markets? Which data miners? Which analyses...?

T Student

what about fantasy players who use your java applet?

Jacob Berlove

As I commented on fantasySCOTUS, I tried the model out in Bloate on Josh's terms and actually got "overturn". So before drawing any conclusions, make sure to rule out human error.

Guido Z

Very interesting! I do agree that the prediction market could be improved if the participants were risking their own money; I'd take it even one step further and suggest that there should be a system by which the "votes" are weighted differently from one another. In a real market, a very certain and very well capitalized participant can move the market price by taking on all bets that offer inefficiencies.

Imagine that Justice Kennedy himself is playing this game. In a traditional "wisdom of the crowds" setting he only gets 1 vote. But if he's allowed to "vote" 1000 times by betting against 1000 other people (who clearly are not as well informed), he can cause the prediction market to provide the correct result even when the masses are misinformed.
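The weighting scheme described above can be sketched in a few lines: if each "vote" carries a monetary stake, the market's call is whichever side backs the larger total. A minimal Python sketch under that assumption (the function name and stake amounts are hypothetical):

```python
def market_prediction(bets):
    """Weight each participant's vote by the money they stake.

    `bets` is a list of (prediction, stake) pairs; the market's call
    is the side backing the larger total stake, so one confident,
    well-capitalized bettor can outweigh a misinformed crowd.
    """
    totals = {}
    for prediction, stake in bets:
        totals[prediction] = totals.get(prediction, 0.0) + stake
    return max(totals, key=totals.get)

# 1000 small bettors say "reverse"; one large bettor stakes more
# than all of them combined on "affirm":
bets = [("reverse", 1.0)] * 1000 + [("affirm", 2000.0)]
print(market_prediction(bets))  # affirm
```

This is exactly the mechanism by which a real market lets certainty plus capital move the price, rather than counting heads.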


Thanks, thanks a lot. How am I supposed to do anything productive now (i.e., prepare for law school finals) with this website floating out there, only a mouse click away? I blame you for any future poor performance on my part. Jerk.

Ted H

The prediction market should marginally outperform the statistical estimation if people aren't stupid; if they are stupid, the statistical estimate should win. Why? Because both procedures are mimicking the exact same thing. The statistical estimation is making a guess based on historical votes, and the prediction market is just an aggregate of people's guesses based on historical votes. If people are intelligent, their results should be superior, because they are performing the same historical data-mining but can also include past similar cases and other factors the statistical analysis does not.

Of course, this assumes that people are intelligent, possibly a large assumption; that people will make the effort to research the historical voting patterns on similar cases, which the money incentive would help increase; and finally that Kennedy's voting pattern isn't just random, which I think is reasonable to assume heh.

This test is more interesting because it's a test of whether people are stupid. The statistical procedure your questions invoke should be performed by individuals in the predictive market, coupled with further data mining, so the prediction market's results should be better if people aren't stupid.

(I include falling into the cheap cognitive biases that are common in markets as part of "stupidity" - which I think we already know happens a lot.)



I thought to make prediction markets accurate, the person has to actually bet (or pretend to bet) on the outcome. Otherwise, it's a compilation of guesses.

Je couPe

I concur. With No. 8.

It's my understanding and belief that the high accuracy of prediction markets is largely due to the fact that there's "standing," or "consideration," namely, real money and gain/loss in play.
Corny taunt notwithstanding, "you gonna put your money where your mouth is" is literally a "game changer," to admittedly overuse yet another recent cliche.

The other, "metafilter" result seems just like a lot of people sitting around at a bar talking, blah, blah, blah, with no stake or consequence.
As such, it would only be as good as what it purported to replace: agglomerated "statistically unaided predictions of traditional experts."

FWIW, or that's my two cents.

Adam L

Nos. 8 & 9,

I believe that the 'fantasy league' described in the article addresses your exact concern. So long as participants care about their performance in this league, there is something on the line. Similar to the Hollywood Stock Exchange, on which no real money rides but predictions are remarkably accurate, players in the league will hopefully derive satisfaction from high performance/standing in the league.

Or are you seeing something that I'm not?

Robin Hanson

I responded here:


Prediction markets should be more accurate than a statistical model if the outcome is based on a large number of variables, the variables are not necessarily binary, and the relative importance of the variables is not fixed. Cognitive errors are not necessarily less present in the statistical model, because somebody still has to make the judgment call "yes" or "no" for each variable.

A statistical model can accurately predict whether somebody is willing to go on a first date; it would probably be less accurate at predicting whether somebody is willing to go on a second.