Analyzing Roger Clemens: A Step-by-Step Guide

February 11, 2008

Yesterday, I posted about the conclusions that Eric Bradlow, Shane Jensen, Adi Wyner, and I drew from analyzing Roger Clemens’s career statistics. I thought it might be useful to show how we got from the findings in the Clemens Report (which exonerates him) to our somewhat opposite conclusions. So for budding forensic economists, here is a step-by-step guide, with pictures.

1. The Raw Data

The “Clemens Report” mainly analyzes his earned run average (ERA) through time. These numbers bounce around a lot from season to season, and at this point it is hard to see any reliable or particularly interesting pattern in the data.

2. An alternative metric, and a fitted curve

The problem with analyzing ERA is that it is affected by a lot of things beyond pitching quality. For instance, defense affects a pitcher’s ERA, and poor pitching does little damage to it if there happen to be no runners on base. Instead, we turn to a more reliable metric: walks plus hits per inning pitched (WHIP). This metric yields less “bounce,” and a more reliable pattern is revealed. Fitting a curve, we find that Clemens’s performance deteriorated for about a decade, then improved over the last decade of his career.
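To make the metric and the curve-fitting step concrete, here is a minimal sketch. The post does not say which functional form was fit, so the quadratic in age below is an assumption, and the season lines are made-up numbers for illustration, not Clemens’s actual statistics. The function names are hypothetical.

```python
def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched; lower is better."""
    return (walks + hits) / innings_pitched


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Build the 3x3 normal equations (X'X) beta = X'y as an augmented matrix.
    rows = [[1.0, x, x * x] for x in xs]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Made-up season lines (age, walks, hits, innings), for illustration only.
seasons = [(24, 70, 180, 240.0), (28, 65, 200, 235.0), (32, 80, 210, 220.0),
           (36, 75, 190, 215.0), (40, 60, 170, 210.0)]
ages = [age - 30 for age, *_ in seasons]  # center age at 30 (an arbitrary choice)
values = [whip(bb, h, ip) for _, bb, h, ip in seasons]
a, b, c = fit_quadratic(ages, values)
```

Because WHIP counts only baserunners the pitcher allows per inning, it does not depend on whether those runners later score, which is one reason it tends to be less noisy than ERA.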

The turning point appears at around the age (36-37) at which the Mitchell Report suggests he used performance-enhancing drugs. When we analyze other summary measures of his pitching performance, we see a roughly similar pattern, although some look more suspicious and some less suspicious.
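If the fitted curve is taken to be a quadratic in age (an assumption; the post does not specify the form), the turning point and the direction of the arc can be read directly off the coefficients. The sketch below uses hypothetical function names and assumes the metric is one where lower is better, such as walks plus hits per inning pitched.

```python
def turning_age(b, c, center=0.0):
    """Age at which y = a + b*x + c*x**2 turns, where x = age - center."""
    if c == 0:
        raise ValueError("no curvature, so there is no turning point")
    return center - b / (2 * c)


def arc_shape(c):
    """Describe the career arc for a lower-is-better metric."""
    if c > 0:
        return "improves, then declines"   # metric falls, then rises
    if c < 0:
        return "declines, then improves"   # metric rises, then falls
    return "no curvature"
```

With this convention, a negative curvature coefficient corresponds to the pattern described above for Clemens: worse first, better later.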

3. Creating a Comparison Group

To figure out whether Clemens’s performance is unusual, we needed to compare his career trajectory with other durable pitchers. The Clemens report analyzed Nolan Ryan, and this was a wise choice: Ryan’s performance also improved in the final decade of his career.

But a useful comparison group should involve many other pitchers who have also had long and successful careers. When we examine all 30 other pitchers who, since 1967, have started at least 10 games in each of at least 15 seasons and pitched at least 3,000 innings, we see a pervasive pattern: nearly all of them improve for about a decade, and then their performance deteriorates in the second half of their careers. The exceptions to this rule are those pitchers who simply tend to get worse through time, and this looked to be Clemens’s trajectory until his mid-30s.

But overall, Clemens’s path looks “upside-down,” as he gets worse first, and then improves later.
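The selection rule for the comparison group can be sketched as a simple filter over season records. The record layout (dictionaries with `pitcher`, `year`, `games_started`, and `innings` keys) and the function name are hypothetical, and reading “since 1967” as “count only seasons from 1967 onward” is an assumption about how the rule was applied.

```python
from collections import defaultdict


def durable_starters(seasons, since=1967, min_starts=10,
                     min_seasons=15, min_innings=3000):
    """Return pitchers meeting the durability thresholds, sorted by name."""
    qualifying_seasons = defaultdict(int)
    total_innings = defaultdict(float)
    for s in seasons:
        if s["year"] < since:
            continue  # assumption: earlier seasons are ignored entirely
        total_innings[s["pitcher"]] += s["innings"]
        if s["games_started"] >= min_starts:
            qualifying_seasons[s["pitcher"]] += 1
    return sorted(
        p for p in total_innings
        if qualifying_seasons[p] >= min_seasons
        and total_innings[p] >= min_innings
    )
```

Keeping the thresholds as keyword arguments makes it easy to check how sensitive the comparison group is to the exact cutoffs.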

4. Clemens’s Career Versus Other Pitchers

We fit a curve that describes the typical career of a durable starting pitcher. Think of this as being like the “control group” in a medical study. Clemens’s career arc looks very different from that of our control group, suggesting that something unusual occurred.
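One simple way to illustrate a comparison against a control group is sketched below. Note that it swaps the fitted curve for a plain age-by-age average across the comparison pitchers, which is a cruder stand-in and not the method described above. The input layout (a mapping from pitcher to a mapping from age to metric value) and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_mean_by_age(careers):
    """careers: {pitcher: {age: value}} -> {age: mean value across pitchers}."""
    by_age = defaultdict(list)
    for seasons in careers.values():
        for age, value in seasons.items():
            by_age[age].append(value)
    return {age: mean(vals) for age, vals in sorted(by_age.items())}


def gap_from_group(pitcher_seasons, group_curve):
    """Pitcher's value minus the group average, at each age both cover."""
    return {
        age: value - group_curve[age]
        for age, value in sorted(pitcher_seasons.items())
        if age in group_curve
    }
```

For a lower-is-better metric, a gap that starts near zero or positive and turns increasingly negative late in a career would indicate a pitcher pulling away from the typical arc as he ages.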

Unfortunately, our statistical analysis cannot pinpoint the precise cause of this unusual pattern. But it is clear that the Clemens report stretches credibility in arguing that his late career was typical. His late-career performance was certainly exceptional given the trajectory he was on in the first half of his career, suggesting that close scrutiny is warranted.
