Wednesday, December 17, 2003
Part Four of The Seductions
(& Giant Sucking Sounds) of Metrics:
The Average is Not the Territory
In baseball, we use a lot of measures/metrics to evaluate players. One of the seductions I've written about before is a widespread lust for The One Great Number (TOGN). TOGN ranges from Bill James' Win Shares to Baseball Prospectus' EqA to Tom Boswell's Total Average.
The seduction is the idea that performance can, for the sake of simplicity, be boiled down into a single number so that value (in creating rankings, or setting value during negotiation, or in assessing trades, or in granting awards, virtuous reasons all) is easily described. The giant sucking sound, though, is that while this method might make sense to a stockbroker in buying or selling equities, or a property developer in sorting out possible projects -- inanimate objects -- it's a very flawed approach for evaluating human performance in any job that hasn't been fully Taylorized (stripped down to the most simple, uniform, repetitive task).
I'm not criticizing any of these baseball metric methods. Their math seems to work very well in the general case. They deliver many results that quickly add a small bit of knowledge to our understanding of the overall value of a player's season or, sometimes, career. But just as the map is not the territory, the average is not the value.
Hiding the Truth
The single number can hide many more values than it illustrates, because individuals who aren't in a Taylorized job have many attributes. Mariner first baseman John Olerud's career hitting is .297/.400/.470. This makes him a high-on-base guy with some, but not very good, power. That's his average career performance. And it's the average of his averages. If you boil him down to a single number, you'd think he was pretty good.
The average ignores him as a hitter. The challenge is, at any given point in his career, he's never his career average. The average is not the guy.
The average ignores his context as an individual on a team. Last season, he hit .270/.370/.390, not terrible, but awful for a first baseman (a position at which teams usually hide guys without a lot of mobility but who can add a lot of juice to a team's offense). He brings a very effective glove to his position, and this isn't reflected in his average (which is another reason TOGN strategies hold attraction to people who like metrics).
The average ignores him as an individual with an evolving context in each moment. If you break down his performances facing left-handed and right-handed pitchers, he's not near his averages against either. He's a completely different guy against righties than he is against lefties.
Rather than his overall average .270/.370/.390, against righties he's .280/.390/.420. Now he's so-so for a first baseman. Against lefties, he's .240/.320/.310, not so-so or even awful, but sub-big-leaguer. Barbara Bush. With a Wiffle Bat. In Elton John Sunglasses.
Makes one wonder why a professional organization with big bucks on the line and some chance of making the playoffs would roll him out there for 170+ plate appearances against left-handed pitchers and (it appears after 20 minutes looking through box scores) never once pinch-hit for him against a lefty. It's not like he has a recent history of hitting left-handed pitching. His 2001-2003 consolidated numbers against lefties are .250/.350/.360, essentially sub-mediocre offense from a first baseman on a team that plans on winning its way to the playoffs. (Against right-handers during that time, he's a sweet .305/.410/.485.)
The average ignores him in his specific role on his team. If Olerud were a shortstop or center fielder with a great glove and those averages, he might be a positive factor for a team. But again, it would depend on the context of his team. Even if we pretend the average embodies the person, its meaning changes radically based on context. What might be an acceptable level of performance for a re-building team that's conserving resources would be wholly lacking for a team whose plan is to try to squeak into the playoffs.
It seems his team thinks of Olerud as his career average (not even his recent-performance consolidated average), and overlooks the facts that (1) he is a usually-solid batter against righties, and (2) he consistently -- with a few exceptions -- hits like a minor-league back-up catcher against left-handers. The average seems to overwhelm the person it refers to. As a result, they misuse him and it undermines the team.
SMALL NOTE, WITH IRRITATION: The M's had one, lone guy on their roster who hit left-handed pitching and played first base, and therefore, would have held some hope as a platoon-partner so Olerud could sit against lefties instead of being regularly humiliated and dragging down the team. The M's traded that guy this week for a back-up outfielder whose salary is twice as high, and they shipped money to the other team for the privilege of acquiring this dude. So right now, there's no obvious option to help Olerud out.
Outside of baseball, this happens all the time.
Sales groups will develop metrics to evaluate sales-folk, TOGN schemes used to set commissions. Customer service phone bank sweat shops trying to ratchet up their miserable service to hanging-on-by-their-fingernails-acceptable service will devise a TOGN metric based on calls-processed (speed) and lack of complaints. Big military equipment manufacturers will create TOGN ranking systems for bonuses and layoff stacking.
And they all fail in assessing quality, because they look only at averages and ignore the context: the environment in which the performances that made up the average took place.
Look, if a saleswoman bribes a purchasing agent at one single big company, and closes a giant deal, while failing on every other big company, her average sale number is going to be high. She's not achieving anywhere near as much as her co-worker who sells one-eighth as much to each of eight different companies.
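To put rough numbers on that, here's a minimal sketch in Python. The dollar figures and the names (rep_a, rep_b, summarize) are made up for illustration; the only things taken from the example above are the shape of the two records, one giant deal with seven misses versus eight deals at one-eighth the size each.

```python
from statistics import mean

# Hypothetical figures: both reps call on the same eight big companies.
GIANT_DEAL = 800_000
rep_a = [GIANT_DEAL] + [0] * 7      # one giant deal, seven failures
rep_b = [GIANT_DEAL // 8] * 8       # eight deals, each one-eighth the size

def summarize(sales):
    """Return the single-number view alongside the context it hides."""
    closed = [s for s in sales if s > 0]
    return {
        "total": sum(sales),
        "avg_per_closed_deal": mean(closed),       # the TOGN-style number
        "accounts_won": len(closed),               # breadth of the customer base
        "largest_share": max(sales) / sum(sales),  # dependence on one account
    }

for name, sales in (("rep_a", rep_a), ("rep_b", rep_b)):
    print(name, summarize(sales))
```

Both reps bring in the same total, and rep_a's average per closed deal is eight times higher. Only the extra columns (accounts won, share of revenue riding on one customer) show that rep_b has built the healthier book of business.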
The phone service agent who rushes nine people off the phone politely, providing no actual service but not earning a complaint from anyone, is going to look better than the agent who does that with seven people but actually takes the time to help two of them who were easy to help, earning some repeat-business potential brownie points.
The weapons manufacturer will not be able to differentiate a steady performer who produces slightly above average every day from the burst-mode performer who has some great days and some terrible ones. And in manufacturing you need a blend of both to optimize production in varying demand profiles.
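The steady-versus-burst problem is easy to see with a few invented numbers. This is only a sketch: the daily output figures below are hypothetical, chosen so both workers land on exactly the same average.

```python
from statistics import mean, pstdev

# Hypothetical units produced per day over one week.
steady = [105, 104, 106, 105, 105]   # slightly above average every day
burst = [150, 60, 145, 65, 105]      # some great days, some terrible days

for name, days in (("steady", steady), ("burst", burst)):
    # The mean is the one-number view; the spread is what it throws away.
    print(name, "mean:", mean(days), "spread:", round(pstdev(days), 1))
```

The two means are identical, so a ranking built on the average alone can't tell these workers apart. The spread (here the population standard deviation) is the first extra number that separates them.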
In general: The bigger the organization, the more diseconomy of scale, so the less efficient; the less efficient, the greater the impetus for simpler, stripped-down systems to try to redistribute energy away from qualitative tasks towards the evaporating bottom line, and the more power the seduction of the seemingly simple TOGN. The trend has powerful allure to big organization execs, from CFOs at free-trade advocating multinationals to V.I. Lenin to Mussolini, all of whom are/were big advocates of Taylorism and the application of simple metrics, rigidly enforced, as a management technique.
Tip # 44: Don't be fooled. The seduction of a TOGN is understandable. And TOGN schemes have some value. But the TOGN is not the person; ignore that at your own risk.
In the next entry, I'll discuss a new example of the average-is-the-person fallacy, as exposed by one of baseball research's best de-bunkers and dialogue stimulators, Don "Il Postino del Destino" Malcolm.