Management by Baseball

Thursday, December 18, 2003

The Average is Not the Guy
Malcolm un-Gilds the Lilly

In the last entry, I explained that the average is not the guy, and how seductive it is, especially for time-constrained managers, to jump on a TOGN (that may be completely valid and useful in its way) and mis-apply it by using it as a stand-alone Truth. That is, if you fall in love with the average, you're at risk of missing key details that provide context, and that may lead you to mis-appraise performance.

A baseball case in point looks to be the Toronto Blue Jays' trade of outfielder Bobby Kielty to the Oakland A's for their #4 starting pitcher, Ted Lilly. Kielty disappointed in Toronto last year, and had what many term a "sophomore slump", a performance downturn in the second year that follows a strong first year. (Some commentators like to point to this as the common case, the norm. It does happen a lot, but it's overhyped: once the pattern has a name and a place in the mind, it's more noticeable whenever it pops up. The words "sophomore slump" then take the place of analysis; commentators just invoke the phrase and don't have to try to understand why the downturn happened.)

The Jays and A's are both run by general managers who are adherents of sabermetric analysis (the Jays' G.M., J.P. Ricciardi, worked for Billy Beane and the A's), so this is a real inside deal.

On the surface, you have Kielty coming off a year that surprised many with how ordinary it was, and Lilly coming off a year that was quite good for a #4/#5 starter, with a good stretch performance and a top-10 performance in strikeouts per 9 innings pitched. At first glance, you'd have to say the Jays got the benefit of probably-good, and the A's look like they're taking a chance.

According to Dave's swell Baseball Graphs site, where he presents Win Share numbers for 2003 players, Lilly is a 10 Win Shares guy and Kielty a 12 Win Shares man. If you use Win Shares (a TOGN with lots of intelligent research behind it) as a measure, the deal looks pretty balanced.

Unless you read Don Malcolm.

Don, co-headman of the (formerly published in book form) Big Bad Baseball Annual, is a natural skeptic and contrarian. In analysis, that means he always tries to look at things from an angle others overlook, and his natural thinking/analysis pattern is to dispute what others believe, especially if he suspects they've just latched onto an idea without examining & questioning it seriously. His writing is as punchy as any on the web, and his analysis systems and observations are interesting and context-sensitive. As I've mentioned before, he's the only analyst outside of Florida who suggested before the season that the Florida Marlins (now we can say World Champion Florida Marlins) would be a very competitive team this year.

One of his latest entries (you have to be dedicated to read his blog...sometimes entries are frequent, sometimes sparse) is on the Kielty-Lilly deal, and his analysis, again, goes to a zone others haven't. It's a beautiful illustration of how the average is not the guy.

He parses 2003 data for each player between stats accumulated against "good" teams (.500+ records) and "bad" teams (below .500).

He first looks at Kielty's performance:

When we do that for 2003, we receive what is certainly a curious finding:
Opp    AB   R   H   D  T  HR  RBI  BB  SO   BA   OBP   SLG
.500+ 210  40  56  15  0   8   36  45  46 .267  .396  .452
.500- 217  31  48  11  1   5   21  26  46 .221  .305  .350
In 2003, Bobby Kielty suffered most of what Brock Hanke would call his “sophomore slump” due to the fact that, for whatever reason, he couldn’t hit against pitchers on sub-.500 teams.

This is an unusual split, of course, because hitters as a whole hit worse against good teams than they do against bad ones. That trend can be seen in Kielty’s GvB splits for 2002:
Opp    AB   R   H   D  T  HR  RBI  BB  SO   BA   OBP   SLG
.500+  71   8  19   3  0   0    4   7  17 .268  .333  .310
.500- 218  41  65  11  3  12   42  45  49 .298  .418  .541
<snip>We focus on major league performance and introduce something missing from all other stat splits: quality of opposition.

Perhaps the risk is not so high. If you look at his record against only the better teams and suggest he would hit close to that against the poorer ones, you could believe his potential is higher than his 2003 stats. If it's common for batters to hit at least as well against below-average competition as they do against good competition, then it wouldn't be far-fetched for Beane to think Kielty might recover normal performance against lesser teams, which would raise his overall performance to something more like the .267/.396/.452 he put up against the better teams. It isn't a guarantee, but nothing is guaranteed with human performance in dynamic systems.
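
If you want to check Malcolm's splits yourself, here's a minimal Python sketch that rebuilds the rate stats from the counting stats in his 2003 Kielty table and then pools the two rows into one full-season line. The BattingLine class and its method names are my own invention for illustration. The OBP here is a simplified (H + BB) / (AB + BB) that ignores hit-by-pitches and sacrifice flies, because the table doesn't list them, so the pooled OBP will differ a little from an official figure.

```python
from dataclasses import dataclass, astuple


@dataclass
class BattingLine:
    """Counting stats for one split (hypothetical helper, not Malcolm's code)."""
    ab: int   # at-bats
    h: int    # hits
    d: int    # doubles
    t: int    # triples
    hr: int   # home runs
    bb: int   # walks

    def __add__(self, other):
        # Pool two splits by summing each counting stat.
        return BattingLine(*(a + b for a, b in zip(astuple(self), astuple(other))))

    def ba(self):
        return self.h / self.ab

    def obp(self):
        # Simplified: no HBP or sacrifice flies, since the table omits them.
        return (self.h + self.bb) / (self.ab + self.bb)

    def slg(self):
        # Each hit is worth one base; extra-base hits add 1, 2, or 3 more.
        total_bases = self.h + self.d + 2 * self.t + 3 * self.hr
        return total_bases / self.ab

    def slash(self):
        def fmt(rate):
            return f"{rate:.3f}".lstrip("0")
        return "/".join(fmt(r) for r in (self.ba(), self.obp(), self.slg()))


# Rows copied from the 2003 table above.
good_2003 = BattingLine(ab=210, h=56, d=15, t=0, hr=8, bb=45)
bad_2003 = BattingLine(ab=217, h=48, d=11, t=1, hr=5, bb=26)
overall_2003 = good_2003 + bad_2003

print("vs .500+ :", good_2003.slash())     # .267/.396/.452
print("vs .500- :", bad_2003.slash())      # .221/.305/.350
print("pooled   :", overall_2003.slash())  # .244/.351/.400
```

The pooled line sits well below what he did against the good teams, and that gap is the room for recovery the paragraph above is talking about.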

How about Lilly, though? Wasn't he really useful last year? Back to Malcolm:

Following the current conventional wisdom for evaluating pitchers, (Rob) Neyer focused only on Lilly’s walks, strikeouts and ERA and suggested that he will be a good #3 pitcher for the Blue Jays.

Of course, we have more tools at our disposal for analyzing starting pitcher performance, and this is as good a time as any to use ‘em on Lilly. The most interesting breakout is, again, one that shows his GvB splits:
Opp   IP    H   R HR BB SO  ERA    S    C  W  L 
.500+ 90  110  66 16 38 85 6.28  4.8  3.6  3  9 
.500- 87   68  26  8 20 61 2.38  3.0  2.6  9  1

There are some Big Bad Baseball-specific measures in here I won't explain. If you're interested, go to his site where he explains them. I find them perceptive and stimulating. But to give you the nickel tour, Lilly pitched 90 innings against teams that played .500 or better ball, and 87 against teams that didn't. He gave up almost twice as many homers per inning to the better teams, with an ERA over twice as high. His record was a scary 3-9 against good teams, 9-1 against the lesser teams.
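
The "almost twice" and "over twice" claims are easy to check from the two rows of Malcolm's table. This is only a sketch: the dictionary layout and the function name hr_per_9 are mine, and the innings are taken as the whole numbers shown in the table, which look rounded.

```python
# Lilly's 2003 rows, copied from the table above (IP as printed there).
vs_good = {"ip": 90, "hr": 16, "era": 6.28, "w": 3, "l": 9}
vs_bad = {"ip": 87, "hr": 8, "era": 2.38, "w": 9, "l": 1}


def hr_per_9(split):
    """Home runs allowed per nine innings for one split."""
    return 9 * split["hr"] / split["ip"]


# How many times worse he was against the .500+ teams.
hr_ratio = hr_per_9(vs_good) / hr_per_9(vs_bad)
era_ratio = vs_good["era"] / vs_bad["era"]

print(f"HR/9 vs good: {hr_per_9(vs_good):.2f}, vs bad: {hr_per_9(vs_bad):.2f}")
print(f"HR rate ratio: {hr_ratio:.2f}")   # 1.93
print(f"ERA ratio:     {era_ratio:.2f}")  # 2.64
```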

This doesn't mean Lilly is trash; you have to beat the bad teams as well as the good. A creative manager could optimize Lilly's value in the regular season by giving him an extra day of rest here and a day less there, so he faces a higher proportion of weak teams. Of course, once you get to the playoffs, he's not going to be very useful, since playoff teams are almost certainly .500+ teams. And since Toronto is in the AL East with both the Yankees and Red Sox, the Jays are going to see a higher-than-average proportion of good teams they really need to beat to make the playoffs.
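
For a rough sense of how much the schedule mix could matter, here's a sketch that blends Lilly's two 2003 split ERAs by the share of his innings thrown against sub-.500 teams. The function name blended_era and the 40% and 60% shares are hypothetical. The big assumption is that his split ERAs hold steady from one year to the next, which nobody can promise.

```python
def blended_era(share_vs_weak, era_vs_weak=2.38, era_vs_good=6.28):
    """Innings-weighted ERA, assuming the 2003 split ERAs stay fixed.

    share_vs_weak is the fraction of innings pitched against sub-.500 teams.
    """
    if not 0.0 <= share_vs_weak <= 1.0:
        raise ValueError("share_vs_weak must be between 0 and 1")
    return share_vs_weak * era_vs_weak + (1.0 - share_vs_weak) * era_vs_good


# 2003 mix from Malcolm's table: 87 innings vs weak teams, 90 vs good ones.
actual_share = 87 / (87 + 90)

print(f"2003 mix ({actual_share:.0%} vs weak): {blended_era(actual_share):.2f}")  # 4.36
print(f"tougher mix (40% vs weak): {blended_era(0.40):.2f}")  # 4.72
print(f"softer mix (60% vs weak):  {blended_era(0.60):.2f}")  # 3.94
```

Under that assumption, shifting about a tenth of his innings one way or the other moves the blended ERA by roughly four-tenths of a run.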

A little more Malcolm on Lilly and the A's:

In Lilly’s case, we are clearly looking at what Sandy Koufax called “the Jekyll-and-Hyde” syndrome. The term he used in his autobiography, in fact, was the “.500 pitcher.” This pitcher always seems better than he really is, because his good games are so good. In Lilly’s case, all of his really good games—and this includes all of the games he pitched last September—occurred against sub-.500 teams.

For some additional context, let’s take a look at the GvB performance records of the starters for the Oakland A’s in 2003:
Pitcher  Opp   IP    H    R  HR   BB  SO   ERA   W   L  TW  TL
Harden .500+   19   22   17   4   12  20  8.20   0   3   1   3
Hudson .500+  135  107   47   7   30  85  2.40   9   4  15   4
Lilly  .500+   90  110   66  16   38  85  6.28   3   9   6  11
Mulder .500+  118  123   46   7   30  80  3.43   9   6  10   7
Zito   .500+  114   98   60  13   47  78  4.20   7   7   8  10
Others .500+   52   82   47   9   20  38  7.31   2   7   4   8
OAK    .500+  527  542  283  56  177 386  4.37  30  36  44  43
As you can see, Tim Hudson had a fabulous record against good teams in 2003, and as a result was even more important to the A’s ability to overtake Seattle than what has already been recognized. Overall, however, the A’s starters were sub-.500 against good teams (30-36); their bullpen (14-7) is what got them over the .500 mark against them. And much of that was due to their ability to win all of Hudson’s non-decisions: the A’s were 6-0 against good teams in those games, and just 8-7 in all of the others.
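
An aside from me on the arithmetic: the bullpen record and the non-decision records quoted above can be recovered from the table by subtraction. This sketch assumes my reading of the columns, that W and L are the starter's own decisions and TW and TL are the team's record in the games he started. The table doesn't define them, but the numbers come out matching the text. The variable names are mine.

```python
# (W, L, TW, TL) against .500+ teams, copied from the table above.
starters = {
    "Harden": (0, 3, 1, 3),
    "Hudson": (9, 4, 15, 4),
    "Lilly": (3, 9, 6, 11),
    "Mulder": (9, 6, 10, 7),
    "Zito": (7, 7, 8, 10),
    "Others": (2, 7, 4, 8),
}

# Team record in each starter's no-decision games: team result minus his own.
no_decisions = {
    name: (tw - w, tl - l) for name, (w, l, tw, tl) in starters.items()
}

# Every game the starter didn't get the decision in went to the bullpen.
bullpen = (
    sum(w for w, _ in no_decisions.values()),
    sum(l for _, l in no_decisions.values()),
)

hudson_nd = no_decisions["Hudson"]
other_nd = (bullpen[0] - hudson_nd[0], bullpen[1] - hudson_nd[1])

print("Bullpen record vs good teams:", bullpen)     # (14, 7)
print("Team in Hudson's no-decisions:", hudson_nd)  # (6, 0)
print("Team in everyone else's:", other_nd)         # (8, 7)
```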

Lilly was not hit quite as hard as the various pitchers used to populate the #5 slot in the rotation (John Halama, Rich Harden, et al), but he is far off the performance level of the A’s big three.

Pitchers who consistently pitch this poorly against good teams rarely develop into anything other than back-end rotation fillers. The Blue Jays found this out last year with another ex-Oakland pitcher (Cory Lidle), and they’ll almost certainly find out the same thing in 2004 with Lilly.

<snip>Lilly was so good against bad teams that it tends to color the overall perception of his performance, especially those crafted by limited (and, frankly, downright evasive) analysis such as the one provided by Neyer in his recent column.

It may not happen that Lilly folds up like a cheap card table against better teams in 2004. He has a better defense behind him in the outfield now. And predicting how a player will adapt to a new park (Toronto's generally an offense-promoting park) is an inexact art. But the deal looks more balanced than many analysts have given Beane credit for.

How can you slice and dice your team players' performance to uncover useful insights about them? Does Malcolm give you some ideas?

posted by j @ 12/18/2003 08:46:00 PM