What do the Kansas City Royals and my iPod have in common?

On the surface, not much. The Kansas City Royals have lost 19 straight games and are threatening to break the all-time record for futility in major league baseball. My iPod, on the other hand, has quickly become one of my most beloved material possessions.

So what do they have in common? They both can teach us a lesson about randomness.

The human mind does badly with randomness. If you ask the typical person to generate a series of “heads” and “tails” to mimic a random sequence of coin tosses, the series doesn’t really look like a randomly generated sequence at all. You can try it yourself. First, before you read further, write down what you expect a random series of 20 coin tosses to look like. Then spend 15-20 minutes flipping coins (or use a random number generator in Excel). If you are like the typical person, the “random” sequence you generated will have many fewer long streaks of “all heads” or “all tails” than actually arise in real life.
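For anyone who would rather not flip coins by hand, here is a minimal Python sketch of the same experiment. The function names, the trial count, and the fixed seed are illustrative choices, not anything from the original exercise. It tallies how long the longest streak is in each simulated run of 20 tosses.

```python
import random

def longest_run(seq):
    """Return the length of the longest run of identical consecutive items."""
    best = current = 0
    previous = object()  # sentinel that never equals a real toss
    for item in seq:
        current = current + 1 if item == previous else 1
        previous = item
        best = max(best, current)
    return best

def simulate_longest_runs(n_tosses=20, n_trials=10_000, seed=0):
    """Map each longest-streak length to how many simulated sequences had it."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    counts = {}
    for _ in range(n_trials):
        tosses = [rng.choice("HT") for _ in range(n_tosses)]
        run = longest_run(tosses)
        counts[run] = counts.get(run, 0) + 1
    return counts

if __name__ == "__main__":
    counts = simulate_longest_runs()
    total = sum(counts.values())
    for run in sorted(counts):
        print(f"longest streak of {run:2d}: {counts[run] / total:.1%}")
```

Comparing your hand-written sequence against this table is a quick way to see whether you under-produced long streaks.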

My iPod shuffle reminds me of this every time I use it. I’m consistently surprised at how often it plays two, three, or even four songs by the same artist, even though I have songs by dozens of different artists on it. On a number of occasions, I’ve even become mistakenly convinced I don’t have the iPod on shuffle, but rather, I’m playing all the songs by one artist. (If someone is really bored, maybe they can repeatedly have the iPod shuffle the songs, record the data, and see if shuffle really is random. My guess is that it is random, because what would be the point of Apple doing something different? I have a friend, Tim Groseclose, a professor of Political Science at UCLA, who was convinced that the random button on his CD player knew which songs were his favorites and disproportionately played those. So I bet him one day, made him name his favorite songs in advance, and won lunch.)

Which brings us to the Kansas City Royals. When a team loses 19 games in a row, it seems so extreme that it can’t reasonably be the result of randomness. Clearly coaches, sports writers and most fans believe that to be true. How often have you heard of a coach holding a closed-door meeting to try to turn a team around? But if you look at it statistically, you expect 19-game losing streaks to occur, simply by randomness, about as often as they do.

The following calculations are admittedly crude, but they give you the basic idea. Each year, there are about two teams in the major leagues that have a winning percentage of around 35%. (Sometimes no team is that bad; in other years there are real stinkers like Detroit in 2003, which won only 26.5% of its games.) The chance of a team that has a 35% chance of winning each game losing its next 19 games is about one in 4,000. Each team plays 162 games a year, so has 162 chances to start such a streak. (They count streaks that begin in one year and end in the next year, so it is correct to use all 162 games.) So each year, for these two bad teams that win 35% of their games, there are a total of 324 chances to have a 19-game losing streak. It takes about 12-13 years for these two bad teams to have a total of 4,000 chances for a 19-game losing streak. Thus we would expect a losing streak this long a little less than once a decade. In practice, we see, if anything, slightly fewer long losing streaks than expected based on these calculations. The last really long losing streak was by the Cubs in 1996-1997: 16 games. (There is actually a good reason that long streaks occur a little less than in the simple model I was using. It is because a team that wins 35% doesn’t have the same likelihood of winning every game: sometimes it has a 50% chance and sometimes a 20% chance…that sort of variability lessens the likelihood of long streaks.)
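The arithmetic above can be checked with a few lines of Python. This is only a sketch of the same crude model (independent games, two 35% teams, 324 starting points a year). The helper name and the alternating 50%/20% schedule are illustrative assumptions. Using the exact power rather than the rounded one-in-4,000 figure gives roughly one in 3,600, which shortens the wait slightly but does not change the conclusion.

```python
import math

def losing_streak_probability(win_probs):
    """Chance of losing every game in win_probs, assuming independent games."""
    return math.prod(1.0 - p for p in win_probs)

# The post's crude model: a 35% team losing 19 specific games in a row.
p_19 = losing_streak_probability([0.35] * 19)
chances_per_year = 2 * 162          # two bad teams, 162 starting games each
years_between_streaks = 1.0 / (p_19 * chances_per_year)

print(f"one in {1.0 / p_19:,.0f}")
print(f"about one streak every {years_between_streaks:.1f} years")

# Hypothetical illustration of the closing parenthetical: 20 games that
# average 35%, either steady or alternating between 50% and 20%.
steady_20 = losing_streak_probability([0.35] * 20)
mixed_20 = losing_streak_probability([0.50, 0.20] * 10)
print(f"20 straight losses, steady 35%:  {steady_20:.6f}")
print(f"20 straight losses, 50%/20% mix: {mixed_20:.6f}")
```

The mixed schedule comes out lower than the steady one, which is the sense in which game-to-game variability makes long streaks a little rarer.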

So, one doesn’t need to resort to explanations like “lack of concentration,” being “snakebit,” or being demoralized to explain why the Royals are losing so many games in a row. They are simply a bad team getting some bad luck.

Steamboat Lion

Stephen Jay Gould wrote a classic essay on the application of statistical reasoning to baseball streaks (my version is "Streak of Streaks" in "Bully for Brontosaurus" but I know it was published previously under another name).

Maybe it's the fact that I was actually paying attention in the couple of statistics classes I took, or maybe I'm just a computer geek, but if you asked me to generate a random sequence of heads and tails the FIRST thing I would do is open Excel.

It would be interesting to see what random sequence generator iPod and other consumer electronics use to create their shuffle mode and whether it is truly random.


So each baseball game is an independent event, neither affected by the previous game nor having any effect on the subsequent game?

Alex Rothenberg

Well, that's the really interesting question, CK. If we suppose that each game is an independent event, can we generate a team's pattern of wins and losses that reasonably matches the data?

But perhaps teams are streaky. That is, perhaps winning one game increases the probability that the team will win the next game, and losing a game lowers the probability that they will win the next one. Teams move back and forth between probability states (a hot state, a neutral state, or a cold state), and their wins and losses are generated accordingly. Are the sequences of wins and losses generated from this process better fits for the data than the sequence generated by independent games?
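One rough way to explore that question is to simulate both kinds of team and compare their streaks. The sketch below is a simplified stand-in, not the model from any book: instead of hot, neutral, and cold states it uses a single hypothetical `bump` that raises the win probability after a win and lowers it after a loss, with `bump=0` giving independent games.

```python
import random

def simulate_season(rng, base=0.5, bump=0.0, n_games=162):
    """Return a list of booleans (True = win) for one simulated season.

    After a win the next game's win probability is base + bump; after a
    loss it is base - bump. With bump=0 every game is independent.
    """
    results = []
    p = base
    for _ in range(n_games):
        won = rng.random() < p
        results.append(won)
        p = base + bump if won else base - bump
    return results

def longest_losing_streak(results):
    """Length of the longest run of consecutive losses."""
    best = current = 0
    for won in results:
        current = 0 if won else current + 1
        best = max(best, current)
    return best

def average_longest_streak(bump, n_seasons=2000, seed=1):
    """Average longest losing streak over many simulated seasons."""
    rng = random.Random(seed)
    total = sum(
        longest_losing_streak(simulate_season(rng, bump=bump))
        for _ in range(n_seasons)
    )
    return total / n_seasons

if __name__ == "__main__":
    print("independent games:", average_longest_streak(0.0))
    print("streaky (bump=0.1):", average_longest_streak(0.1))
```

Comparing summaries like these against real season records is one way to ask which process fits better, though a serious test would need more than a single statistic.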

This sort of idea crops up in Albert and Bennett's book "Curve Ball", which is worth reading.

Oh, and Steve ..... while we're on the topic of baseball, did you ever do anything else with the Moral Hazard / DH Rule? I really liked your critique of that idea.


Steven D. Levitt


My point is that, to a first order, you can adequately model the data as if what happened yesterday doesn't affect what will happen today any more than what happened a week or two ago. It is also possible that sometimes losing yesterday helps you win today and other times losing yesterday hurts your chances of winning today. But on average, it looks like a wash.

Alex --

I haven't done anything else on DH/moral hazard.


is the winner of the world series also random? is it possible for a team to, by chance, win 100+ games in a season?
the problem with this perspective is that it removes skill and effort from the equation and leaves success to fate.

but very interesting perspective. i enjoyed reading it.


So here I stand with one foot on the burning coals and the other on a block of dry ice, to a first order approximation I am comfortable. Actually since I live near Philadelphia that is probably correct.

Scott Cunningham

I would like to argue that iPods are random with sporadic moments of being sentient, personable creatures. I offer two pieces of evidence for this. Twice I was running late for the bus, and on both occasions, the theme song from Shaft came on. Both times, the song came on at the corner of Lumpkin and Broad Street, and the song perfectly described what I was feeling as I was running to catch the bus.

A second time, I got up to leave a coffeeshop late one night, popped in my iPod, walked to my car and the radio turned on, blaring loudly. Not only was it the exact same song as the one on my iPod - it was synchronized perfectly with it.

Almost as if my iPod knew that that song would be on, and wanted to prepare me for the car trip home...

Jimmy Stewart

I suspect that iPod shuffling is random, with the caveat that the song currently playing is not allowed to be selected to play next. People sort-of understand 'random', but they'd be irritated if their iPod played the same song twice in a row.


How often have you heard of a coach holding a closed-door meeting to try to turn a team around?

I think baseball managers have a fine understanding of randomness. That's why they almost always hold closed-door meetings when their ace is scheduled to pitch. That gives them the greatest opportunity to take credit for the meeting turning things around.


It is disappointing to me that a political science professor doesn't understand basic probability theory. Anything called a “science” should require its professional practitioners to have a grasp of one of the fundamental tools of mathematics.


I dare say recording the data would show the iPod is in fact random and that you are merely noticing it when a certain song comes on, as it may in fact be a personal favorite.


The iPod is not a random device; nor can any other computer algorithm be considered one. This is why we refer to "pseudo-random number generators" in computer science. Given that we know the inputs of a state machine, and its state, we can always predict its outputs. So with the shuffle function, if we knew the internal state of the iPod's processor we could always predict which song will occur next.
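To make the state-machine point concrete, here is a toy linear congruential generator in Python. It is purely illustrative: the class name is made up, the constants are a commonly cited textbook choice, and nothing here is meant to describe what the iPod actually runs.

```python
class TinyLCG:
    """A toy pseudo-random generator: state -> (a * state + c) mod 2**32."""

    def __init__(self, seed):
        self.state = seed

    def next_track(self, n_tracks):
        # Advance the internal state; the output depends on nothing else.
        self.state = (1664525 * self.state + 1013904223) % 2**32
        # Use the higher bits, which are less regular than the low ones.
        # (The modulo introduces a slight bias; fine for a toy.)
        return (self.state >> 16) % n_tracks

# Two players that start in the same state pick exactly the same "random" songs.
player_a = TinyLCG(seed=7)
player_b = TinyLCG(seed=7)
picks_a = [player_a.next_track(100) for _ in range(10)]
picks_b = [player_b.next_track(100) for _ in range(10)]
print(picks_a == picks_b)
```

Anyone who knows the seed and the update rule can reproduce the whole playlist in advance, which is all "pseudo" means here.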


It is because a team that wins 35% doesn't have the same likelihood of winning every game: sometimes it has a 50% chance and sometimes a 20% chance

For those not familiar with baseball, this is because of the pitching rotation. Pitchers pitch every 5 days, and even a bad team has a decent chance when their star is pitching. Playing at home also gives a slight boost.

At least, I assume that's what he meant.


For those not familiar with baseball, this is because of the pitching rotation. Pitchers pitch every 5 days, and even a bad team has a decent chance when their star is pitching. Playing at home also gives a slight boost.

Even at that, if each night were independent, then you could still predict the likelihood of a streak whether the chances of winning were 50%-40%-45%-35%-25%, or uniformly 40%. Unfortunately, there isn't even a theoretical way to model true odds of a team winning, or a batter getting a hit, etc. Does a .333 hitter have a 1-in-3 chance of getting a hit? No. Any number of unmeasurable variables come into play. Maybe the home crowd gives him added motivation, or helps him concentrate better. Maybe he struck out on a nasty slider the last time he faced this pitcher, so he knows to look for it next time. Maybe the temperature change affects his performance differently as the game rolls on. Or he just performs well/poorly under pressure. You get the idea.

The same is true for a team with a losing streak. Anyone who's played sports knows you don't play each game with the same level of intensity. The Yankees will put in their best effort against their rivals (and division leaders) the Red Sox, but are more likely to have a letdown against a slumping team like the Royals. The Royals could be over-correcting their problems to break the streak and only making it worse.

Those are a lot of examples to make a simple point: you can't statistically model sporting events.



As for computer randomness, are computers capable of truly "random" numbers? The only experience I have in programming is with QBasic (back in the DOS days), but the QBasic "random" numbers weren't truly random. There were several long lists of numbers embedded into the program. In QBasic, you would have to indicate in the program which list to pick from, and it would take the next number from that list accordingly. So if you took a lot of "random" numbers from the same list, they would start repeating themselves in due time. There was a way to change the list depending on the seconds on the clock, so the appearance of randomness would be sufficient for most applications. Does anybody know if other languages or more modern computers still operate this way? Or do PCs now generate truly random numbers?
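As one data point for a more modern language, Python's standard `random` module behaves in a broadly similar way: it is a deterministic generator, so the same seed always replays the same numbers, and if you give it no seed it takes one from the operating system or the clock. The same module also offers `random.SystemRandom`, which draws on the operating system's entropy source instead and cannot be seeded or replayed. A short sketch (the seed value and ranges are arbitrary):

```python
import random

# Same seed, same "random" numbers, every time.
first = random.Random(12345)
second = random.Random(12345)
same_sequence = (
    [first.randint(1, 100) for _ in range(5)]
    == [second.randint(1, 100) for _ in range(5)]
)
print("same seed gives same sequence:", same_sequence)

# No seed argument: Python seeds the generator from the OS or the clock,
# so separate runs of the program will normally differ.
unseeded = random.Random()
print("unseeded draw:", unseeded.randint(1, 100))

# Backed by the operating system's entropy source rather than a seed.
os_backed = random.SystemRandom()
os_value = os_backed.randint(1, 100)
print("OS-backed draw:", os_value)
```

Whether the operating system's source counts as "truly" random depends on the hardware and the OS, so this shows the programming interface rather than settling the philosophical question.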


Actually, I think Dr. Levitt's calculation is not quite correct:

1. There are very few teams which are actually .350 teams. Most teams that play .350 ball (57-105 over a season) are actually better than .350, but are having unlucky seasons. This is the same logic that explains why a player who goes 1-for-5 in a game is not necessarily a .200 hitter, but more likely a better hitter who got unlucky.

If you assume a team with a .350 season is actually a .400 team, the chance of it losing 19 games in a row is actually 1 in 16,000.

2. Dr. Levitt's analysis counts only .350 teams, but not, say, .400 teams. If there are four .400 teams for each .350 team, for instance, then the chance roughly doubles: the .350 team is 1 in 4000, and the four .400 teams are collectively 4 in 16,000, which is again 1 in 4000.

3. The method Dr. Levitt used overestimates the probability of the streak because it treats the 324 chances as independent -- they are not. The math is correct -- there will be one streak every decade or so -- but only if you count a 20-game streak as two 19-game streaks, a 21-game streak as three 19-game streaks, and so on.

That is, when the 1988 Orioles started the season 0-21, they used up three decades' worth of 19-game streaks in one fell swoop.

If you want the chance of a *19-or-more* game streak, without counting these multiples, the calculation is more complicated.
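For the curious, that more complicated calculation can be done by tracking the length of the current losing run game by game. The Python sketch below is one way to do it, under the same simplifying assumptions as the original post (independent games, a fixed 35% win chance). The function name is made up, and it only looks within a single 162-game season rather than across seasons.

```python
def prob_streak_at_least(k, n_games, win_pct):
    """Probability of at least one run of k or more straight losses
    in n_games independent games with a fixed win probability."""
    loss_pct = 1.0 - win_pct
    # state[j] = probability that no k-loss streak has happened yet and
    # the team is currently on a run of exactly j losses (0 <= j < k).
    state = [0.0] * k
    state[0] = 1.0
    for _ in range(n_games):
        new_state = [0.0] * k
        new_state[0] = win_pct * sum(state)          # a win resets the run
        for j in range(k - 1):
            new_state[j + 1] = loss_pct * state[j]   # a loss extends the run
        # Mass at j = k - 1 followed by a loss reaches a k-game streak and
        # simply drops out of the "no streak yet" total.
        state = new_state
    return 1.0 - sum(state)

per_team_season = prob_streak_at_least(19, 162, 0.35)
naive_count = 162 * (1.0 - 0.35) ** 19
print(f"chance of a 19+ game streak in one season: {per_team_season:.4f}")
print(f"naive expected count from 162 starting points: {naive_count:.4f}")
```

Comparing the two printed numbers shows how much of the naive figure comes from counting one long streak several times.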


All this is written at 1 am, of course, so I may be wrong!

P.S. A listing of more articles on streaks can be found at Charlie Pavitt's Sabermetrics Bibliography. Go to http://www.udel.edu/communication/pavitt/biblioexplan.htm , choose the first link, and search for "streak". Not all the articles cited are online, but many are.



Each team plays 162 games a year, so has 162 chances to start such a streak.

This is a serious oversimplification.

The trouble is that even if each game is independent, each 19 game set isn't. We have the 19 games from game 1-19, the 19 games from 2-20, the 19 games from 3-21, etc. You can't treat these as independent events, because 1-19 shares 18 games with 2-20, 17 games with 3-21, and so on.

More concretely, if game 1 fails to start a streak, it means that there exist wins within the games 1-19--most of which are shared by the set of games 2-20. Say there are 5 wins in 1-19, that means that the chance of game 2 starting a streak is *zero*.

As I recall, you have to divide the answer obtained through Dr. Levitt's method by the streak length to get the right answer for the frequency, but I won't swear to that. If I'm right, though, you should get a 19-game streak every couple of centuries.
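The size of the correction can be checked directly rather than recalled. Under the same assumptions (independent games, fixed win probability), a separate streak of 19 or more losses begins at a given game only if the previous game was a win, or if it is the first game of the season. Adding up those chances gives the expected number of distinct streaks. The Python sketch below does this, and the function names are illustrative. In this simple model the correction factor works out to roughly the win probability rather than one over the streak length, so the wait is a few decades rather than centuries.

```python
def naive_streak_count(k, n_games, win_pct):
    """Expected number of k-game all-loss windows, overlaps counted separately."""
    loss_pct = 1.0 - win_pct
    return (n_games - k + 1) * loss_pct ** k

def distinct_streak_count(k, n_games, win_pct):
    """Expected number of separate losing streaks of length k or more.

    A streak starts at game 1 with probability q**k, and at each later
    eligible game with probability p * q**k (a win, then k losses).
    """
    loss_pct = 1.0 - win_pct
    return loss_pct ** k * (1.0 + (n_games - k) * win_pct)

naive = naive_streak_count(19, 162, 0.35)
distinct = distinct_streak_count(19, 162, 0.35)
print(f"overlapping windows per team-season: {naive:.4f}")
print(f"distinct streaks per team-season:    {distinct:.4f}")
print(f"ratio: {distinct / naive:.2f}")
print(f"years between streaks for two such teams: {1.0 / (2 * distinct):.0f}")
```

This counts only streaks that fit inside one season, so it slightly understates what you would get by letting streaks run across the winter.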



There's another interesting case of streak expectation: the SATs.

Although the test is ostensibly filled with random answers from A to E, with no patterns, this is not exactly true. Over the course of a test as long as the SAT, one statistically expects to see "streaks" in the answers: columns of one letter choice. Statistics would predict several 4-letter streaks, with a high probability of 5- or 6-letter streaks.

However, the makers of the SAT specifically design the answers to avoid streaks--there are rarely streaks longer than 3, and none longer than 4. In the book "Up Your Score," the writers attribute this to the SAT not wanting to scare the students to death during the test, which I guess is the equivalent of Levitt thinking his iPod is broken when The Who come up 4 times in a row.

The point is that this irrational practice is to the test-taker's advantage. By filling in non-streaky answers instead of random ones, it is possible (albeit marginally) to improve one's score, since the probability of a streaky choice is less than 1 in 5.

After 50 years of SATs, one can only imagine the results if answers were truly random. Could a 19 B-streak be all too likely?



I can't believe a professor really thought his CD player had learned what his favorite songs were. And you used his full name? I must go now and tease him mercilessly on my own blog.


We would naturally think that a totally random algorithm would "sample with replacement", i.e. each song would be equally likely to be played at any given time.

This is obviously undesirable from Apple's perspective, as you note, since people don't want to hear the same song twice or even three times in a row on shuffle!

I guessed (and later verified, see below) that what Apple actually uses is a derangement (strictly speaking, a random permutation). In other words, the iPod simply reorders the songs (like shuffling a deck) and then plays them in that order. If you listened long enough you would hear each song, precisely once.

The problem is that people rarely listen to playlists long enough to get all the way through every song. So people are often very surprised when 4 songs by the same artist occur in the first ten songs (for example), when in fact that might not be all that surprising.
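How surprising it is depends a lot on what is in the library, so it is easier to simulate than to guess. Here is a rough Python sketch that shuffles a library like a deck of cards and estimates how often some artist turns up at least four times in the first ten songs. The function name and the example library are entirely hypothetical, so plug in your own artist counts.

```python
import random
from collections import Counter

def chance_of_artist_cluster(library, window=10, min_count=4,
                             n_trials=5000, seed=0):
    """Estimate the chance that some artist has at least min_count songs
    among the first `window` songs of a shuffled playlist.

    library maps artist name -> number of songs by that artist.
    """
    rng = random.Random(seed)
    songs = [artist for artist, n in library.items() for _ in range(n)]
    hits = 0
    for _ in range(n_trials):
        rng.shuffle(songs)  # reorder the whole library; no song repeats
        most_played = Counter(songs[:window]).most_common(1)[0][1]
        if most_played >= min_count:
            hits += 1
    return hits / n_trials

if __name__ == "__main__":
    # Made-up library: 30 artists with 10 songs each, plus one favorite with 60.
    library = {f"artist_{i}": 10 for i in range(30)}
    library["favorite band"] = 60
    print(chance_of_artist_cluster(library))
```

Lopsided libraries, where a few artists account for a large share of the songs, push the estimate up quickly.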


(Although this article seems to verify my derangement theory, I would not be surprised if there are some subtle tweaks the engineers have put in after extensive testing on people, and Apple is simply reluctant to discuss it.)