Here’s one of those phrases that The New Yorker would label as “sentences we never read past”:
"I was skimming the program for the annual meeting of the American Statistical Association . . ."
But really, where else can you find not only research on “Modeling Sparse Generalized Longitudinal Observations with Latent Gaussian Processes” but also on managerial strategies in baseball, parity in the NFL and the accuracy of sports predictions?
It’s striking how many statisticians who study weighty matters—how to
tell if a cancer drug works or a compound is dangerous—got their start
studying sports statistics.
“A lot of us really enjoyed baseball statistics when we were growing
up, and that’s how we got into the field,” biostatistician Michael
Schell of the Moffitt Cancer Center in Tampa told me.
So I got in touch with Jack O’Gara, who wrote the book on using
statistical techniques to spot chicanery in business (that would be the
2004 “Corporate Fraud: Case Studies in Detection and Analysis”). Now retired, O’Gara has put his statistical skills to use analyzing baseball, especially cheating.
In the business world, he focused on what he calls inflection points:
sudden discontinuities in data. That is what he saw, galore, when he analyzed the career stats of pitcher Roger Clemens.
Clemens, of course, was named in the Mitchell Report,
which last December reported that an alarming number of baseball
players had taken performance-enhancing drugs such as steroids.
(Clemens' section starts on p. 215.) Clemens and his camp deny it.
O’Gara decided to see if stats could tell us anything.
One of the most telling is ERA Margin, which compares a pitcher’s
earned run average in a given year to the league average. It’s more
informative than ERA alone because it controls for weird things like
hitters league-wide being in a slump (which would reduce every pitcher’s
ERA but not ERA Margin), or the use of a juiced ball that year, which
would raise pitchers’ ERAs but, again, not the margin. The ERA Margin
tells you how one hurler is doing compared to his peers.
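To make the idea concrete, here is a minimal sketch in Python. It assumes ERA Margin is simply the league-average ERA minus the pitcher's ERA, so that a positive number means better than the league; that sign convention is inferred from the figures quoted in this article, not taken from O’Gara's own formulas. The function names and the sample numbers are hypothetical.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings pitched."""
    return 9.0 * earned_runs / innings_pitched


def era_margin(league_era: float, pitcher_era: float) -> float:
    """Assumed definition: league ERA minus pitcher ERA (positive = better than league)."""
    return league_era - pitcher_era


# Hypothetical numbers, not taken from any real season.
normal_year = era_margin(league_era=4.50, pitcher_era=3.50)

# A "juiced ball" year that adds half a run to everyone's ERA, league and pitcher alike.
juiced_year = era_margin(league_era=4.50 + 0.50, pitcher_era=3.50 + 0.50)

# The raw ERA rose from 3.50 to 4.00, but the margin did not move.
print(normal_year, juiced_year)
```

Under that assumption, a league-wide swing shifts both terms by the same amount and cancels out, which is why the margin isolates how one pitcher did against his peers.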
O’Gara compared Clemens’ ERA Margins to those of the 20 post-World
War II pitchers with the most wins, turned in by legends such as Warren
Spahn, Tom Seaver and Bob Gibson. Through age 34, Clemens’ margin was
1.09, notably better than the others’ 0.6. Fine, the guy was an ace.
But from age 35 to 40, when most pitchers fade, Clemens’ margin was
1.18, compared to 0.43 for the other greats. Here's where it gets weird:
from age 41 to 45, it was 1.30, while the others’ was a negative 0.01.
That is, the other great pitchers’ margin shrank as they got older,
falling more in line with the league average and normal aging patterns,
but Clemens’ soared. As O’Gara put it, “Clemens is the only pitcher who
gets progressively better as he ages into the post-40 category.”
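The widening gap is easy to check by subtracting the figures quoted above. This short Python sketch uses only the numbers in this article; the bracket labels and variable names are illustrative.

```python
# ERA Margins by age bracket, as quoted in the article.
clemens = {"through 34": 1.09, "35-40": 1.18, "41-45": 1.30}
others = {"through 34": 0.60, "35-40": 0.43, "41-45": -0.01}

# Gap between Clemens and the other 20 winningest post-war pitchers.
# Rounded to two decimals to hide floating-point noise.
gaps = {bracket: round(clemens[bracket] - others[bracket], 2) for bracket in clemens}

for bracket, gap in gaps.items():
    print(f"{bracket}: Clemens ahead by {gap:.2f}")
```

The gap goes from 0.49 to 0.75 to 1.31, growing in every bracket as the comparison group drifts back toward the league average.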
When the ERA Margins for baseball’s top 10 or top 20 pitchers each
year are graphed, Clemens is better than the rest at ages 29 and 30,
then twice more—three performance peaks while none of the top 20 had
more than two. “More significantly, the second two peaks were higher
than the initial peak, which occurred in the presumed prime of his life,
contrary to normal aging patterns,” O’Gara says. “At age 43, Clemens
had the seventh-best season [measured by ERA Margin] since World War II.”
Of the 20 best ERA Margins since 1945, all
came when the pitcher was 34 or younger (average age: 28), with the
exception of Clemens, who did it when he was 35 and again when he was
43. The best two-year average ERA Margins cluster when pitchers were in
their late 20s (Sandy Koufax: 29-30; Greg Maddux:
28-29), and again Clemens stands out: his best came when he was 43-44.
Clemens’ ERA Margin at age 43 was the best in the majors that year and
the best ever for a 43-year-old.
Testimony taken for the Mitchell Report and
given to Congress this spring included accusations from a trainer that
he injected Clemens, which the pitcher denies. As it happens, the three
periods when the trainer said he administered shots “correspond to
performance bursts by Clemens,” says O’Gara. “The ERA for these three
periods totaled 1.92 over 183 innings, significantly better than his
career average ERA of 3.12.”
As has been widely reported, in 1996 Clemens,
then 34, was coming off a sub-par 1995 season and struggling through
the first months of the '96 season, his last of 14 with Boston. “Then he
suddenly went from being mired in the worst multiple year performance
of his career (the preceding one and 2/3 years) to his best
two-year-plus performance of his career,” says O’Gara. “He averaged a
2.91 ERA margin for the remainder of 1996, better than for any single
One baseball statistician I asked about this
analysis warned me against “guilt by graph”—that is, concluding that
someone was juiced based on stats alone. “Stats can tell you if
someone’s performance is unusual, but by definition a great player has
an unusual performance,” he said. See, for instance, this post by another stats guru.