If you wish to win in baseball, your team has to spend money. Just look at the New York Yankees. USA Today reports that in 2012 the Yankees led the American League in spending. And the Yankees finished with the best record in the American League.
Of course, one data point doesn’t a trend make. What do we see when we look past the Yankees?
Here is a simple plot of winning percentage in baseball in 2012 and team payroll:
As one can see, the regression line – the positively sloped blue line — indicates that higher pay leads to more wins. At least, that’s what we see when we just stare at the line.
When we look at the actual estimation of the line’s equation, though, we note one very important issue. The link between payroll and winning percentage in 2012 is not statistically significant.* In other words – despite what we see when we stare at the line — we can’t argue that payroll and winning percentage in 2012 are actually related to each other.
When we look at the actual data, it’s easy to understand why we get this result.
- The combined payroll of the Washington Nationals (20th in payroll) and Cincinnati Reds (17th in payroll) – the teams with the best records in baseball this year – did not equal the payroll of the Yankees (3rd-best record this year, 1st in payroll).
- The Philadelphia Phillies were 2nd in payroll (only $23 million behind the Yankees) but ranked 16th in winning percentage.
- The Boston Red Sox were 3rd in payroll but tied for 24th in winning percentage.
- The Oakland A’s were 29th in payroll but tied for 4th in winning percentage (only one game behind the Yankees).
- Of the top 5 teams in payroll, only two (Yankees and Tigers) made the playoffs. Meanwhile, five playoff teams ranked in the bottom 50 percent of the payroll rankings.
Given these anecdotes, we should not be surprised that payroll and wins were not statistically related in 2012. What may be surprising is that this isn’t what we typically see when we look at payroll and wins in baseball.
USA Today reports team payrolls from 1988 to 2012 – across these 25 years, a team’s relative payroll (team payroll in a given year divided by the league average payroll in that season) does have a statistically significant relationship with a team winning percentage. So across all the years for which data exists, payroll and wins are statistically related.
We should note, though, that explanatory power is somewhat low. Only about 17 percent of the variation in team winning percentage (i.e. the R² from the equation) across the past 25 years is explained by a team’s spending on talent. So much of the variation (specifically, more than 80 percent) in winning percentage is not explained by payroll.
That being said, there is a statistical relationship. More spending seems to get you something.
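To make the two quantities in the preceding paragraphs concrete, here is a minimal Python sketch of how relative payroll and R² can be computed. The function names and the three-team payroll figures are made up for illustration; this is not the code or the data behind the results reported in this post.

```python
from statistics import mean


def relative_payroll(payrolls):
    """Divide each team's payroll by the league-average payroll for that season."""
    league_average = mean(payrolls.values())
    return {team: pay / league_average for team, pay in payrolls.items()}


def r_squared(x, y):
    """Share of the variation in y explained by a straight-line fit on x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Unexplained variation (around the fitted line) vs. total variation (around the mean).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical payrolls (in $ millions) for a three-team league, for illustration only.
example = relative_payroll({"Team A": 150, "Team B": 100, "Team C": 50})
```

In this toy league the average payroll is $100 million, so Team A has a relative payroll of 1.5 and Team C of 0.5. Passing each team's relative payroll and winning percentage to `r_squared` gives the kind of explanatory-power figure discussed above.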
At least, that’s the story you tell if you look at all 25 years. When you look at each year individually – as the following table illustrates — the power of team spending seems to vary quite a bit. Again, in 2012 we see a relationship that is not significant (NS).
| Year | p-value | r-squared |
|------|---------|-----------|
| 2012 | NS | NS |
| 2011 | 0.01 | 0.17 |
| 2010 | 0.04 | 0.13 |
| 2009 | 0.02 | 0.21 |
| 2008 | 0.06 | 0.10 |
| 2007 | 0.00 | 0.25 |
| 2006 | 0.00 | 0.29 |
| 2005 | 0.00 | 0.24 |
| 2004 | 0.00 | 0.29 |
| 2003 | 0.02 | 0.18 |
| 2002 | 0.01 | 0.20 |
| 2001 | 0.04 | 0.10 |
| 2000 | 0.04 | 0.10 |
| 1999 | 0.00 | 0.50 |
| 1998 | 0.00 | 0.47 |
| 1997 | 0.01 | 0.22 |
| 1996 | 0.00 | 0.34 |
| 1995 | NS | NS |
| 1994 | 0.07 | 0.16 |
| 1993 | 0.09 | 0.09 |
| 1992 | NS | NS |
| 1991 | NS | NS |
| 1990 | NS | NS |
| 1989 | NS | NS |
| 1988 | 0.00 | 0.18 |
From 1996 to 2011, though, payroll and wins were statistically linked each and every year. However, explanatory power varied. From 1996 to 1999, explanatory power was above 30 percent in three seasons and reached 50 percent in 1999. But after 1999, explanatory power never reached 30 percent again.
If we look at baseball before 1996, we see five seasons where the relationship was again not significant. And when it was significant, the relationship was never that strong (always below 20 percent).
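The claims in the last two paragraphs can be checked directly against the table. The sketch below is only a tally of the published numbers, with "NS" stored as `None` and the 0.10 cutoff taken from the footnote at the end of this post; the variable names are illustrative.

```python
# (p-value, r-squared) by season, copied from the table; None marks "NS".
TABLE = {
    2012: None, 2011: (0.01, 0.17), 2010: (0.04, 0.13), 2009: (0.02, 0.21),
    2008: (0.06, 0.10), 2007: (0.00, 0.25), 2006: (0.00, 0.29),
    2005: (0.00, 0.24), 2004: (0.00, 0.29), 2003: (0.02, 0.18),
    2002: (0.01, 0.20), 2001: (0.04, 0.10), 2000: (0.04, 0.10),
    1999: (0.00, 0.50), 1998: (0.00, 0.47), 1997: (0.01, 0.22),
    1996: (0.00, 0.34), 1995: None, 1994: (0.07, 0.16), 1993: (0.09, 0.09),
    1992: None, 1991: None, 1990: None, 1989: None, 1988: (0.00, 0.18),
}

SIGNIFICANCE_CUTOFF = 0.10  # p-values at or below this count as significant


def is_significant(year):
    row = TABLE[year]
    return row is not None and row[0] <= SIGNIFICANCE_CUTOFF


# Was the link significant in every season from 1996 through 2011?
linked_1996_2011 = all(is_significant(y) for y in range(1996, 2012))

# Seasons from 1996 to 1999 with r-squared above 0.30.
above_30_late_90s = [y for y in range(1996, 2000) if TABLE[y][1] > 0.30]

# Highest r-squared from 2000 through 2011.
best_after_1999 = max(TABLE[y][1] for y in range(2000, 2012))

# Seasons before 1996 without a significant link, and the best r-squared among the rest.
not_significant_pre_1996 = [y for y in range(1988, 1996) if not is_significant(y)]
best_pre_1996 = max(TABLE[y][1] for y in range(1988, 1996) if is_significant(y))
```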
So here’s the big question: Why is this relationship not stronger? One would think that as teams spend more they see more wins. But often, that’s not what we see in the data.
One issue – as various academic studies have indicated — is that a baseball player is only paid what he is worth in the free agent market. And a player can’t be a free agent until he has played six years. This means that players with less than six years of experience – such as Austin Jackson of the Detroit Tigers – can be far more productive than their salary would suggest. This past year, Jackson was paid $500,000 while producing 3.3 Wins Above Average (WAA). Meanwhile, Prince Fielder produced 2.3 WAA and was paid $23 million. Why was Fielder paid so much more? Fielder just sold his services in a free agent market while Jackson was still working in a labor market where only the Tigers could pay for his services.
Beyond the labor market issues, there’s the issue of forecasting performance in baseball.
A player’s salary negotiated today is a reflection of what a team thinks that player will do in the future. Unfortunately, baseball performance is difficult to forecast. This is especially true for pitchers, where the interactions between the pitchers’ performance and the skills of the defenders around the pitcher are difficult to measure.
Given these two issues, it’s not surprising that payroll doesn’t explain much of wins in baseball.
But does the result from 2012 indicate that payroll and wins will not be statistically related in the future? Again, one data point doesn’t a trend make. So we don’t know if the big spenders will struggle again in baseball in 2013. What we do know is that spending doesn’t guarantee a team success in the regular season.
As for the post-season? Well, there are right now eight teams left in the playoffs. And the Tigers — the team I follow — are one of these teams. Unfortunately, there’s a very good chance that I am going to be unhappy in a few days.
Thus is the nature of the playoffs. All but one team walks away sad. And it seems, no amount of spending by your favorite team can change that reality.
* – the p-value on the coefficient for payroll in 2012 is 0.26. Typically we argue a coefficient is statistically significant if the p-value is 0.10 or lower (and some insist on values of 0.05 or lower).
detroit is in the top 5 for payroll??- i dont do baseball, but i would guess that the long drought of tiger playoffs (88-05) coincided with lower payroll- and, how r they spending that much?- i thought detroit was broke…
I’d love to see analysis on any relationships between the amount spent on scouting resources vs. wins over a 25 year period.
When you test multiple sets of data for the same effect, you are supposed to adjust your p-value thresholds. You should probably have a chat about statistics with your fellow Freakonomics blogger, Mr. Mahajan.
If we’re going to do some non-rigorous analysis, though, I’ll throw in that the slope of a linear regression to the statistically significant subset of r-squared values is ~-0.004, i.e. relative salary explains roughly 0.4% fewer wins per year.
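One common way to make the threshold adjustment mentioned in the comment above is the Bonferroni correction, which divides the cutoff by the number of tests. The sketch below applies it to the p-values in the article's table. Choosing Bonferroni is an assumption here (the comment does not name a method), and because the table rounds to two decimals, seasons printed as 0.00 cannot be settled without the unrounded values.

```python
# Reported p-values by season, rounded to two decimals as in the article's table.
# None marks seasons listed only as "NS" (no p-value given in the table).
P_VALUES = {
    2012: None, 2011: 0.01, 2010: 0.04, 2009: 0.02, 2008: 0.06, 2007: 0.00,
    2006: 0.00, 2005: 0.00, 2004: 0.00, 2003: 0.02, 2002: 0.01, 2001: 0.04,
    2000: 0.04, 1999: 0.00, 1998: 0.00, 1997: 0.01, 1996: 0.00, 1995: None,
    1994: 0.07, 1993: 0.09, 1992: None, 1991: None, 1990: None, 1989: None,
    1988: 0.00,
}

FAMILY_ALPHA = 0.10        # the article's loosest cutoff for a single test
N_TESTS = len(P_VALUES)    # 25 single-season regressions
bonferroni_cutoff = FAMILY_ALPHA / N_TESTS


def verdict(p):
    """Classify one rounded p-value against the Bonferroni-adjusted cutoff."""
    if p is None:
        return "not significant"
    # A value printed as 0.00 only tells us p < 0.005, which cannot be
    # compared with a cutoff of 0.004 without the unrounded number.
    if p == 0.0:
        return "undetermined (rounded)"
    return "significant" if p <= bonferroni_cutoff else "not significant"


verdicts = {year: verdict(p) for year, p in P_VALUES.items()}
undetermined = sorted(y for y, v in verdicts.items() if v == "undetermined (rounded)")
clearly_not = sorted(y for y, v in verdicts.items() if v == "not significant")
```

With 25 tests the adjusted cutoff is 0.004, so every season with a printed p-value of 0.01 or more no longer clears the bar, and only the seasons printed as 0.00 remain candidates.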
Given a roster, ignoring injuries, the results for the season include the effects of significant random fluctuation (as per flipping a coin.)
Serious injuries add a significant additional randomness to the results.
These would reduce the explanatory value of any variable, including payroll.
The Phillies were missing $35 million worth of middle infielders for the first half of the season and were about 10 games below .500. After Utley and Howard came back they were 10 games above .500. If you extrapolate that winning percentage over an entire season, there’s a good chance the Phillies go to the playoffs.
Winning percentage may not be the best way to judge how useful a high payroll is. Look at championships instead. The big 3 teams (Yankees, Red Sox and Phillies) have all won championships recently. The Rangers (6th in salary this year) have gone to the last two World Series.
There are 9 teams with salaries over $100 mil this year: Yankees, Phillies, Red Sox, Angels, Tigers, Rangers, Marlins, Giants, and Cardinals. Of the 6 different teams that played in the last 5 World Series, only 2 are not in this group.
I suspect there’s a time lag effect with salary and team success. So if a team has a good season, the players’ salaries will go up over the next couple years as those guys get new contracts. If the team retains its players, it means it’s paying more for the same guys. So low salary teams tend to go in cycles of good years and bad years. The high salary teams can stay competitive every year. The Yankees are a good example of this.
Championships is a terrible way to measure this stuff because you have a very small sample size influenced by much more random results (seven game series wins versus 162 game season). Then you looked at this year’s salary and compared it to playoff results from five years ago?!? That’s strange.
I think you’re just trying to find a pattern that supports your view and you found one with the data you’ve picked.
Hey Prof. Berri. Long-time fan (I love your books), first-time commenter. There’s a very obvious problem with regressing salary on winning percentage like this. The teams are separated into two leagues, and this selection is not orthogonal to salary expenses. Each team is competing (mostly) within its own league for wins. The two leagues have roughly the same aggregated winning percentage (interleague play makes this “roughly” and not “exactly”), but the average NL team has only 54% of the salary as the average AL team (using numbers from your link and computing means across the 14 and 16 teams, respectively). To illustrate this example more extremely, we could add minor league teams at all levels into your regression and still witness that expenditure isn’t a significant predictor of winning percentage! We need to look at a team’s expenditure relative to its own competition.
I’m not arguing that your point is wrong (especially looking at Oakland, Baltimore, and Tampa Bay this year), and I’m certain you can make a good argument in a variety of ways. But this is definitely not the best way to demonstrate your argument statistically. Perhaps you didn’t actually do it like this; the post is kind of lacking in detailed regression specifications and results.
54% seemed way different to me. I put the salary data into a spreadsheet and see the average AL salary is 105 million vs. NL 92 million. So the average NL salary is 88% of the AL salary.
The medians are 88 mil (AL) vs. 86 mil (NL).
Good catch. I also thought it seemed wrong, but figured that having the top teams pulled up the mean disproportionately. Some digits were truncated in my spreadsheet when trimming leading characters, sorry about that. But the point is still there, that the NL salaries are much lower, and these AL teams with high salaries (5 of the top 6) are playing more games against each other than against the lower salary NL teams. We can’t look at two leagues that mostly don’t play each other as part of the same sample without accounting for the fact that the independent variable isn’t randomly distributed between the leagues.
Dave Berri said
“USA Today reports team payrolls from 1988 to 2012 – across these 25 years, a team’s relative payroll (team payroll in a given year divided by the league average payroll in that season) does have a statistically significant relationship with a team winning percentage. So across all the years for which data exists, payroll and wins are statistically related.”
I got a much higher R-squared. Maybe because I took each team’s average wins per season and the average relative salary. See this link
http://cybermetric.blogspot.com/2009/11/did-yankees-buy-world-championship-in.html
I took data from JC Bradbury’s site.
The data shows how many games, on average, teams won each year from 1986-2005. It also shows how much above or below the league average in total salary each team paid in percentage terms. Again, it shows yearly averages. Suppose a team was 10% above average one year and 30% above average another year; they would get 20 (if it were just over two years).
What I did was to run a regression with average wins per year as the dependent variable and the average salary (SAL, the % above or below the league average) as the independent variable.
Here is the regression equation
Wins = 0.157*SAL + 80.22
The r-squared was .489
Here is the link to Bradbury’s data
http://www.sabernomics.com/sabernomics/index.php/2006/11/payroll-and-wins-2/
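As a worked example of the regression equation quoted in the comment above, the sketch below plugs a few salary levels into it. The coefficients come from that comment; the function name and the sample inputs are illustrative.

```python
def predicted_wins(sal):
    """Average wins per season implied by the quoted equation.

    sal is the team's average payroll in percent above (+) or below (-)
    the league average, as defined in the comment.
    """
    return 0.157 * sal + 80.22


at_league_average = predicted_wins(0)     # payroll equal to the league average
twenty_above = predicted_wins(20)         # 20% above average, as in the comment's example
twenty_below = predicted_wins(-20)        # 20% below average
```

A team spending exactly the league average is predicted to win 80.22 games, and each additional 10 percentage points of relative payroll adds about 1.6 predicted wins per season.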
To properly examine the issue, you have to look at each team individually and track their success compared with their payroll. What you will find is that while the Red Sox may have been a disappointment this year, they have been a consistent presence in the postseason picture for as long as their payroll has been one of the highest. Meanwhile Oakland and Baltimore are in this year, but have not had nearly the consistent talent that the big payroll teams have.
If money spent truly has no effect on the number of wins, why do teams continue to push the envelope and spend more every year, and why would they do it in a single year (other than for a small bump in attendance from high-prestige players)?
This isn’t a small amount of money – it’s tens of millions of dollars. Why continue to pay that and then watch salaries spiral up and become even more expensive in future seasons if the returns aren’t worthwhile?
There are other reasons that a team could have a high payroll. Players have value beyond wins – e.g., some players are popular with the fans even if they are not the greatest contributors to wins. Fans also need to see that their team is trying, and payroll impacts that perception. Revenues are also not tied only to wins – big-market teams with big TV contracts have more to spend, regardless of winning.
Finally, not all sports ownership groups are in it for profit. Some teams are just toys for owners to play with and show off to their friends. They are status symbols, which has economic value in itself, so owners are likely to spend more than they will actually get in return economically.
I accounted for “prestige players”.
Just because a team brings in more revenue from TV doesn’t mean they have to re-invest it in player salary. Why wouldn’t they see that as profit? With the exception of a couple of teams, each baseball team has a monopoly over a geographic area and doesn’t really have competition in that sense.
Name an owner that is ok with losing $10s of millions a year.
Isn’t the real question whether or not spending on players increased a team’s revenue? We know winning does, but does payroll?
The plot seems to tell the following story: Money doesn’t buy wins, it just keeps you from being awful (win% < .400).
Looking at a single year will only provide a limited number of data points, meaning a Type 2 error is highly likely (i.e. we say there is no effect when there is) due to a lack of statistical power, and only a strong relationship has any chance to be detected.
Pooling data from across multiple years provides more data and allows the weaker effects to be seen as significant. A simple regression is probably still a bit naive, as I suspect a team that won last year has a pretty good chance of winning next year, even allowing for regression to the mean.
Finally, an explained variance of 17% just for money is pretty big for a single factor, unless you think things like coaches are not a contributing factor as well.
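The point about statistical power in the comment above can be roughed out numerically. The sketch below uses the Fisher z approximation for a test of a correlation, ignoring the small chance of rejecting in the wrong direction. It assumes the true effect equals the pooled R² of 0.17 from the article, that a single season has 30 teams, and that the pooled sample has roughly 700 team-seasons; all three are assumptions for illustration, and the function name is made up.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_power(r_squared, n, alpha=0.10):
    """Approximate chance that a two-sided test at level alpha detects a
    true correlation of sqrt(r_squared) in a sample of size n (Fisher z)."""
    rho = sqrt(r_squared)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # atanh(sample r) is roughly normal with mean atanh(rho) and sd 1/sqrt(n - 3).
    return NormalDist().cdf(atanh(rho) * sqrt(n - 3) - z_crit)


single_season = approx_power(0.17, 30)   # one season of 30 teams
pooled = approx_power(0.17, 700)         # assumed pooled team-seasons
```

Under these assumptions a single season detects an effect of that size a bit more than 70 percent of the time, so a non-significant year now and then is not surprising, while the pooled sample detects it almost without fail.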
I have been curious about the impact of anti-drug measures on scouting and recruiting. Ten years ago, performance was more consistent, injuries had less impact, etc., because everyone was using steroids. Now, it is harder to predict when big injuries can take a team down, or when a younger player might be better than an older player. I am not sure if this is a real issue, just something I am wondering about.
I don’t think this lends itself to this sort of analysis at all.
First off, player contracts are shifty — front-loaded, back-loaded, incentive-laden. The payroll is not totally indicative of the teams’ investments in a given year.
Second, as someone said, injuries are just huge. If you are the Yankees, yeah, you can pinch hit for ARod with Raul Ibanez and watch him hit two HRs to win.
If you are the usual team, you have some superstars, but you don’t have sub-superstars behind them. That is where I think the real payroll benefit is. The frontline guys are usually all good enough to win, and a team built on free agents is also a team built on older players more susceptible to missing playing time. But if you can just go all-in and have an all star bench as well, you are going to have a great shot.
A comment about the film and book Moneyball. Great film, but the entire premise is slanted. The reason those A’s teams oversucceeded was not their walking first baseman; it was the cheap young starting pitching: Hudson, Mulder and Zito. Give any squad a top three of SPs who are cheap young studs and they will compete. And those three names are almost literally unmentioned in the book or the movie.
One thing to keep in mind is the fact that nobody measures success in baseball by using regular season winning percentage. The only two things that matter are making the playoffs and winning the World Series. For example, since the playoffs were expanded in 1995, the Yankees have used their huge payroll to make the playoffs every year except 2008. But they’ve only made two World Series appearances in the past ten years, losing in 2003 and winning in 2009. During that time period, the ten World Series winners have had the 8th, 5th, 1st (2009 Yankees), 5th, 1st, 13th, 2nd, 3rd, 7th, and 4th best regular season records, mainly because the short series length of the MLB playoffs introduces a high level of randomness. For example, in 2003, the Yankees lost the World Series to the Florida Marlins, a team that finished ten games behind them in the regular season with a payroll roughly a tenth of the size.
A second thing to keep in mind is that $$$ does not always buy quality. Just like a $500 bottle of wine does not taste ten times better than a $50 bottle, a $20 million/year player does not generally produce four times more than a $5 million/year player. The teams with the highest payrolls may spend the most money, but that does not necessarily mean that they end up with the best players, or that it makes them a better team.
What money does seem to buy is a greater chance at consistent regular season success. But consistent regular season success does not necessarily equate to World Series titles.
This article may be guilty of the fallacy of composition. With the noise that is evident in baseball wins and the separation into leagues… Even if each year shows a non-significant relationship between payroll and wins, it does not follow that one cannot argue that wins and payroll are linked for those years. Moreover, each yearly sample is not an independent test, and it actually follows that multiple low but “non-significant” results provide strong evidence that payroll and wins are linked. The author showed only that if one knew only of 2012 there would be a non-trivial probability that winning and payroll were not linked… This is hardly meaningful, as if one looked only at the AL West in 2012 it may be concluded that payroll and winning were inversely linked!!! This is an interesting case of the fallacy of composition.
Players are also paid based on how many tickets they will sell and TV viewers they will draw. A big power hitter like Prince Fielder may be worth more in ticket sales than an equivalent player that has a high WAR as the result of singles and walks. Baseball teams are not necessarily in business to win, but are in business to sell tickets.
I wrote my diploma thesis on this topic. I ran a linear mixed model (10 MLB seasons from 1999-2008) and regressed winning percentage on payroll, pay dispersion within each team (measured by the Gini coefficient), fan loyalty (measured by yearly attendance) and team consistency (measured by the percentage of returning players each season).
Once I controlled for pay dispersion for each team, the coefficient for payroll turned not significant (p = .626). However, it was positive. The coefficient for pay dispersion was negative and significant (p = .045). So from a statistical point of view it is more important how you disperse your payroll within the team, as homogeneous payroll structures lead to a higher winning pct.
Team consistency was also significant and positive (p=.076).
So I think when looking at the link between payroll and winning percentage one should include more control variables and use a larger set of longitudinal data to avoid misleading results.
yes