Thursday, September 30, 2004

Predictive Playoff Metrics: Tom Tippett,
Components & the Craft of the Small Project  

In the last entry, I talked about a ridiculous attempt at cobbling together ill-considered measures to predict the outcome of playoff series. Any kind of prediction of a short series of games (five or seven) cannot be made, in itself, a science. You can, however, use measures to improve your chances of differentiating the very skilled and a little lucky from the somewhat skilled and very lucky.

Beyond baseball, managers have to do this all the time, in what I'll call "the Playoff Challenge". When you have a small, quick-turnaround project or effort, you're going to make almost as many decisions in a short period as you would in a fuller, longer project. Successful efforts and projects have a fixed overhead of planning and design and forethought and set-up and clean-up decisions to make, and this foundation is not going to be much different for a quick effort than it is for a larger one. A manager will make more decisions with fewer chances for mid-course corrections. Good decisions are just as necessary, but not as valuable, because the living, operating moments of that good decision are shorter-lived and have less time to accumulate advantages. On the plus side, though, the mediocre ones hurt you for less time, too.

Learning how to use historical information, but also to keep light on your feet to adapt to the changes whizzing by on a short endeavor is a challenge much like a team trying to win a short series. A lesser team has a better chance of upsetting a superior opponent over a 7-game series by beating them 4 times than they do in a 162-game series by beating them 82 times. So measuring overall quality will certainly be a factor, but not the factor.
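A rough way to see the size of that effect is to treat every game as an independent, weighted coin flip. The independence and the 45% single-game chance for the lesser team are both assumptions made for illustration, not measurements of any real club.

```python
from math import comb


def prob_at_least(wins_needed: int, games: int, p: float) -> float:
    """Chance of winning at least `wins_needed` of `games` independent games,
    each won with probability p (a plain binomial model)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins_needed, games + 1)
    )


# Assumed single-game win probability for the lesser team (illustrative only).
p_underdog = 0.45

# Winning a best-of-7 is the same as winning at least 4 of 7
# if all seven games were played out.
short_series = prob_at_least(4, 7, p_underdog)
long_series = prob_at_least(82, 162, p_underdog)

print(f"4 or more of 7:     {short_series:.3f}")
print(f"82 or more of 162:  {long_series:.3f}")
```

Under these assumptions the underdog takes the short series roughly four times in ten, and comes out ahead over 162 games far less often. Quality still shows in the short series, just less reliably.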

So how do you isolate out smaller factors that might have disproportionate impact on a short series that will take place soon?


Tom Tippett, the creator of Diamond Mind Baseball, the most sophisticated stat-based simulation of baseball on the field, posted on his weblog in a 9/21 entry called "And down the stretch they come..." some data to help people try to think about which teams might have the best chances for success in the playoffs and World Series. Tippett's work is figuring out how to create mathematical models that reflect reality as closely as possible, so it's never the clean-room Ivory Tower math that is always so interesting to read but usually lacking in context and applied torque. His stuff, when he presents it, is among the most interesting and the most practical sabermetric research we can get our hands on.

Here's his intro:

Two weeks from today, the first pitch of the postseason will be sent plateward, and while we don't yet know who'll be on the mound or in the batter's box at that moment, it's almost certain that the series in question could go either way. That's always true of any short series, but it seems especially so this year, with all of the leading playoff contenders riding a wave of success since the trade deadline.

In an effort to gauge the quality of the teams most likely to survive the regular season, I decided to take a look at how they've performed since the trade deadline. Some of the contenders made important changes to their team at that time, so their records since the deadline might be a better indicator of the quality of those teams than their overall records.

Like other sabermetricians, Tippett applies a momentum model. A general test many use is Win-Loss records in the second half of the season. Based on his own number-crunching, he tweaked his measure of recent-record from second half to the 7 weeks from the July 31st trading deadline to when he wrote the piece. ¿Why July 31st instead of the beginning of the second half, around July 1? Because contending teams tend to be active in trading for complementary pieces in the weeks leading up to that July 31 "deadline" (it's not such a rigid barrier; creative general managers can still make significant moves after that, but it's administratively simpler before, and because many teams act as though it's a hard barrier, many make the moves before then anyway, reducing options for patient G.M.s, making even the patient ones have to pander some to the panic). So the roster a team takes into the playoffs will be closer to the post-July 31 roster than the roster it carried in July, and these moves can affect teams for good and ill.

So Tippett's approach, measuring the win-loss record of the team once it is as close as it can be to the team it will take into the playoffs, is a good way to trim data points and focus on the more critical ones. But win-loss records, while they measure actual success, may hide some part of a team's quality because luck is a factor, and the smaller the run of games that makes up the record, the more likely there will be some drift, some bit of hiccup between the actual win-loss record during that stretch and the components of winning.

Tippett, like many sabermetricians, quantifies measures of winning, which can also indicate relative strength and potential for future winning. Tippett's measure is Net Runs, the runs a team scored minus the number they allowed. Runs, of course, are results themselves, results of other components, in this case bases earned through hits and walks. So he uses a third metric, TBW (total bases plus walks), and creates another measure for each team, Net TBW (what they created minus what they yielded).
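The arithmetic behind the two "net" measures is simple enough to sketch. Total bases uses the standard weighting (1 for a single, 2 for a double, 3 for a triple, 4 for a home run). The class name, function name and every number below are hypothetical; this is not Tippett's code or data.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one side over a stretch of games (hypothetical)."""
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int

    def tbw(self) -> int:
        """Total bases plus walks."""
        total_bases = (
            self.singles
            + 2 * self.doubles
            + 3 * self.triples
            + 4 * self.home_runs
        )
        return total_bases + self.walks


def net(created: int, yielded: int) -> int:
    """What a team created minus what it yielded."""
    return created - yielded


# Made-up seven-week totals for one team's hitters and for its opponents.
offense = BattingLine(singles=280, doubles=90, triples=8, home_runs=60, walks=170)
allowed = BattingLine(singles=260, doubles=75, triples=6, home_runs=45, walks=130)

net_tbw = net(offense.tbw(), allowed.tbw())
net_runs = net(250, 200)  # runs scored minus runs allowed, also made up

print(f"Net Runs {net_runs:+d}, Net TBW {net_tbw:+d}")
```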

By breaking building blocks into smaller sub-assemblies, he builds a foundation for measuring overall quality using multiple tests. His table for the contenders as of September 21 looks like this:

American        W   L   Pct   GBL  Runs   TBW
Boston         33  12  .733         +86  +222
New York       28  17  .622   5.0   +39  +130
Minnesota      28  17  .622   5.0   +41  +129
Oakland        28  17  .622   5.0   +18   +91
Anaheim        27  17  .614   5.5   +41    +3

National        W   L   Pct   GBL  Runs   TBW
St. Louis      30  14  .682         +53  +165
Houston        30  15  .667   0.5   +50   +82
Atlanta        30  16  .652   1.0   +55  +114
San Francisco  27  16  .628   2.5   +62  +197
Florida        26  17  .605   3.5   +41   +26
Chicago        25  17  .595   4.0   +36  +119
Los Angeles    25  20  .556   5.5   +29   +53

The Tippett analysis is flat-out interesting. The won-lost records over the seven weeks show that none of the teams with a chance at the playoffs is lugging -- all have good records. The Cards had the best record in this time slice, just as they would the rest of the season if you removed this set of games from the sample.

In the American League, the won-lost record lines up really smoothly with the TBW, and reasonably well with Net Runs, with Anaheim clearly overproducing runs relative to their components. In 2002, the Angels who won the World Series had underperformed most of their component measures. Here, for the Angels and A's to compete with the others, they'll need more than their share of good bounces and breaks, and while in a short series that can easily happen, chance tends to even out for most teams.

In the National League, there are more seams, some bigger differences between actual records and components. The Giants have the best Net TBW by quite a bit, the best Net Runs by a noticeable step, and a good record, but not quite as good as three other very good teams. The senior circuit's match-ups, as measured by momentum and components of recent efforts, are pretty even, though the Cards and Giants are the leaders.
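One way to spot those seams is to rank each league three ways and see where the orderings disagree. The figures are the ones in Tippett's table; the ranking helper is my own sketch, and ties are broken only by table order.

```python
# (team, wins, losses, net runs, net TBW) since the trade deadline.
AL = [
    ("Boston", 33, 12, 86, 222),
    ("New York", 28, 17, 39, 130),
    ("Minnesota", 28, 17, 41, 129),
    ("Oakland", 28, 17, 18, 91),
    ("Anaheim", 27, 17, 41, 3),
]
NL = [
    ("St. Louis", 30, 14, 53, 165),
    ("Houston", 30, 15, 50, 82),
    ("Atlanta", 30, 16, 55, 114),
    ("San Francisco", 27, 16, 62, 197),
    ("Florida", 26, 17, 41, 26),
    ("Chicago", 25, 17, 36, 119),
    ("Los Angeles", 25, 20, 29, 53),
]


def win_pct(row):
    return row[1] / (row[1] + row[2])


def net_runs(row):
    return row[3]


def net_tbw(row):
    return row[4]


def ranked(league, key):
    """Team names, best first. sorted() is stable, so ties keep table order."""
    return [row[0] for row in sorted(league, key=key, reverse=True)]


for label, league in (("AL", AL), ("NL", NL)):
    print(label, "by record:  ", ranked(league, win_pct))
    print(label, "by Net Runs:", ranked(league, net_runs))
    print(label, "by Net TBW: ", ranked(league, net_tbw))
```

The American League orderings by record and by Net TBW come out identical. In the National League the Giants jump from fourth by record to first on both component measures.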

Analysis can't end here. There are a lot of other factors you want to stitch into your judgement of which is the better team in any individual match-up.

Team tendencies: Some teams have a strong ability to beat up on pitching of one kind or another (left-handed, or sinker-slider, or power pitching, or ambidextrous). Others look feeble against that. Since most human systems are self-amplifying, organizations tend to produce (and sometimes even go to the trouble of acquiring) players of similar aptitudes over and over, sometimes adhering to the pattern over decades (Dodgers at 3rd base, Athletics at shortstop from about 1901 until Bert Campaneris). Boston's home park favors hitters who bat from the right side, so the team will load up with right-handed pull hitters, and you don't want to put up too many lefty pitchers against them. The patterns are well-known even when they aren't true; it's an automatic thought that you throw left-handed pitchers against the Yankees because they build their team around batters who hit from the left side to take advantage of the reachable right-field fence, so lefty pitchers have the potential to do a better job of suppressing their left-handed hitters. This year, that isn't true; the team is very balanced, with a microscopic statistical edge against left-handed pitchers (their acquisition of Gary Sheffield means the Bronxians' marquee offensive player is right-handed this year), tweaking the chemistry. But it goes beyond "teams" because at key junctures, the smaller components, the players, will decide outcomes.

Individual tendencies: Each game itself is broken down into a series of pitcher versus batter & batter versus pitcher matchups. At many ordinary moments, and at a few of the critical ones, one side or the other will get a match-up that favors them strongly. You can examine historical pitcher-to-batter match-up info, and sometimes you get the roughly 20 plate appearance history that crosses the line for a pretty good indicator. It beats the heck out of the model I pointed to in the last entry -- matching up, for example, 3rd basemen head-to-head, because rarely will one team's 3rd sacker interact in a focused way with the opposition's.

If you add these to the momentum factors like Tippett's, you can better gauge the probabilities of one playoff team winning over the other. The shortness of the series makes the random factors more likely to weigh in than they would over a whole season, but better teams usually win more series, even the short ones.


In non-baseball organizations, you can apply these models to your own "Playoff Challenges".

Your short, intense projects or efforts or campaigns will be more highly affected by random factors. High-quality achievement will be more diluted by a smaller series of events because "Quality will out in the end" is a long-term, glacial, process.

That doesn't mean you ignore quality, but it does mean as a manager you have to attend to more small details and respond more quickly, being more aggressive about matching people up in ad hoc teams to complement skills, being more willing to accept certain kinds of imperfections in exchange for harvesting the unforeseen advantages that pop up.

It's a partially different skill set from being successful over the long haul where long-term probabilities as known by "The Book" are likeliest to yield the highest returns. The short, intense effort requires knowledge of the book, but a greater willingness to run against it at the "right" time, and those "right" times will dominate more of the outcome in the shorter series of events.

This difference is why some managers, even ones with long records of mediocrity, seem to have a knack for helping their teams win playoff series, while others who manage well over a long season have a knack for lower yields in playoff series.

To succeed at your own Playoff Challenges, embrace metrics, but embrace the knowledge that randomness will have a proportionately higher effect on the outcomes than on a longer effort. Grab the edges the environment offers you, even the ones you wouldn't normally pursue; make the most of every event, every choice, every affordance. The playoffs can be fickle, but you increase your chances if you are flexible and bold.

Tuesday, September 28, 2004

Lessons From Playoff Predictions  

Too many managers don't know how to handle the metrics their organizations provide them. Given that absence of understanding, they are probably not in a position to ask for help creating better ones, and it's a long reach to the best situation: A manager designing metrics for the specific group and situation.

It requires numeracy, yes, but a lot of very numerate people don't have enough craft knowledge to be able to carve out measures that differentiate the relevant from the less-critical. Just being numerate is not enough -- understanding context is a mandatory foundation for success.

There are some great examples of context, both present and missing, to learn through baseball. With the playoffs coming up, I thought it'd be particularly cogent to use some examples of analysis people use to attempt to predict who will win playoff series.


My very least favorite form of baseball playoff analysis is the position-by-position compare and contrast. Because the match-ups aren't yet jelled, I have no example to point you at, but if you read daily papers or weekly sports pubs, you're almost certain to see at least one set of this form that never fails to subtract from human understanding. The model works off the assumption that even though the game does not involve individual duels between each team's player at each position (the way basketball almost does), if you match up each team's player at a position, compare and contrast them, decide which team is better off at that position, and sort of add up the "in favor of" count for each side, it will reveal some basic trend.

It's a basic 7th grader world view: Sheldon had to choose who to invite to the Supermall on Saturday; Britney is popular with his peer group, she already has some breast development, and she knows the name of the Brewers' utility infielder and his batting average; Kaysie on the other hand doesn't have pimples, is reputed to make out, gets good grades, and her family has a well-stocked refrigerator. Britney 3, Kaysie 4. Choice decided. (By the way, there are lots of adults who make decisions based on counting, sometimes by weighting and adding-up, more of a 10th grade world view. Most of these adults are male, but not all. Weight and add-up systems like this can be useful for examination and discussion and sometimes for winnowing a big pile of choices to a manageable number, but they're rarely useful for making a final decision).

Not to pick on the Milwaukee Journal specifically, I present one of theirs from three years ago (because it was the best example I could find in ten minutes of searching). Here are snippets from their analysis, just enough so that if you're not familiar with the model, you'll get the drift:

World Series: Position-by-position matchups

Last Updated: Oct. 26, 2001


New York's Tino Martinez vs. Arizona's Mark Grace: Martinez hasn't hit much in the post-season but is still counted on to be a primary run-producer for the Yankees. After years of playing with the foundering Cubs, Grace is finally getting his World Series spotlight. He doesn't have the same pop as Martinez but is a solid .300 hitter. Grace had to leave the decisive Game 5 of the NLCS with a strained hamstring, which could still be a problem. Edge: Even.


New York's Alfonso Soriano vs. Arizona's Craig Counsell: Soriano is a special rookie who broke the backs of the Seattle Mariners with his ninth-inning home run in Game 4 of the ALCS. He also makes mental mistakes at times. Counsell is one of those players whose destiny is to shine in the post-season. He appears to have ordinary skills but produces extraordinary results in October. Edge: Even.


New York's Derek Jeter vs. Arizona's Tony Womack: Jeter had a miserable ALCS, batting .118 with no extra-base hits. That should make the Diamondbacks very nervous, because Jeter is a fabulous player who usually does something to win a game or two each series. Womack has speed but does not get on base enough to fully utilize it. He has not been much of a factor in the post-season. Edge: Yankees.


New York's Chuck Knoblauch vs. Arizona's Luis Gonzalez: Knoblauch plays left field at times as if he is a converted infielder, which he is. He still does some nice things, however, such as batting .333 out of the leadoff spot in the ALCS. Gonzalez had an uncharacteristically poor showing vs. Atlanta (.211) and must carry a bigger load against the Yankees. Edge: Diamondbacks.


New York's Bernie Williams vs. Arizona's Steve Finley: Williams hit home runs in the last three games of the ALCS and now has 16 post-season blasts, fourth on the all-time list. Finley doesn't have the same flair for dramatic blows but is still a productive player and fielder. Williams just seems to make his hits count the most. Edge: Yankees.


New York's David Justice vs. Arizona's Erubiel Durazo: Justice is an experienced DH who knows how to handle the role. Obviously, no Arizona players have that background, but Durazo and David Dellucci have been so productive off the bench that this lineup addition could actually help the D-Backs in New York. Edge: Even.


In effect, the Diamondbacks will try to beat the Yankees with two pitchers: Curt Schilling and Randy Johnson. Schilling might even go with three days of rest and be available for Games 1, 4 and 7. No pitchers have thrown as many pitches this season as that duo, so you have to wonder how much gas is left in the tank. The Yankees don't have to resort to such tactics because they go four-deep with Mike Mussina, Andy Pettitte, Roger Clemens and Orlando "El Duque" Hernandez. All four have come through in the post-season in the past. Four arms usually beat two, even two outstanding arms. Edge: Yankees.


New York's Joe Torre vs. Arizona's Bob Brenly: Brenly did a nice job this year in getting the most from a veteran bunch, and has a smart bench coach in Bob Melvin. But Torre has four World Series rings in the last five years, almost certainly clinching a spot in the Hall of Fame. He usually pushes the right buttons, mainly because he has the most weapons. Edge: Yankees.


The Diamondbacks are to be commended for getting to the World Series in their fourth season, the fastest ascent ever. And they are an experienced group that knows what it takes to win. But this World Series stuff is old hat for the Yankees, who always seem to do the right things in October. And, with the city recovering from the horror of Sept. 11, New York has more incentive than ever to become only the third team with four consecutive crowns. Edge: Yankees.


It's going to be interesting to see exactly how far Schilling and Johnson can take the Diamondbacks. Arizona better go on top while they are in the game because the bullpen is a disaster waiting for a place to happen. New York already has beaten the two teams with the best records in the majors this year. The Yankees' record in the last three World Series is 12-1. This one won't make it back to Phoenix.

Yankees in five.

When you read the individual head-to-heads, you'll see the writer is capable of good insight. For example, at DH, he doesn't assume the well-known Series-tested Dave Justice is automatically better than the young pair Erubiel "The Hermosillo Hammer" Durazo and David Dellucci because his name is more recognizable. He's open to the youngsters' strengths, too. On each comparison, he makes reasonably-informed & reasonable assumptions. A problem is, they are all presented as roughly equal, and presumed to be additive, as if an Intangible equalled a 3rd baseman equalled a Bullpen.
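Stripped down, the counting model amounts to the tally below. It covers only the comparisons excerpted above, and the labels for the last three entries (pitching, managers, intangibles) are mine, since the excerpt doesn't name those categories.

```python
from collections import Counter

# Edge awarded in each excerpted comparison. Every entry counts as one "vote",
# which is exactly the additive assumption the model makes.
edges = {
    "Martinez vs. Grace": "Even",
    "Soriano vs. Counsell": "Even",
    "Jeter vs. Womack": "Yankees",
    "Knoblauch vs. Gonzalez": "Diamondbacks",
    "Williams vs. Finley": "Yankees",
    "Justice vs. Durazo": "Even",
    "Starting pitching": "Yankees",   # label assumed
    "Torre vs. Brenly": "Yankees",    # managers
    "Intangibles": "Yankees",         # label assumed
}

tally = Counter(edges.values())

for side, count in tally.most_common():
    print(f"{side}: {count}")
```

Nothing in the tally knows that a starting rotation and a DH platoon are different-sized things, or that they interact.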

There are ten thousand other reasons this doesn't work as a prescient predictive model, one of the most important being it compartmentalizes things that should be viewed as complex, interactive systems. Some things do lend themselves a little to this kind of breakdown (schoolground basketball for one, where the head-to-head matchups are likely pretty consistent during a game, and where one of the matched players can usually play better in that match-up than the other and if one team has 3 or 4 edges and the other 2 or 1, the outcome will be the indicated one fairly often).

The subset of politicians who are primarily poll-driven instead of ideologically-based are prone to this counting-as-measuring kind of analysis. This position stands to win X voters in region Y, while the opposite might win N voters in region Z.

Again, it's not necessarily destructive, just a model that doesn't work in complex systems (and most important decisions one makes are about complex systems, not simple ones). Tools that make us reduce the variables we are going to marinate in and distill down the number of options for each we will consider are necessary steps on the way to conclusions. When we reduce and oversimplify prematurely, though, we are counting on chance to give us a boost. People who work this way can be veritable gushers of incompetence.

I think Commissioner Bud Selig forms many of his opinions based on a hyper-simplistic model, and I think that's why it was "inconceivable" to him that an All-Star game could go into extra innings and the teams would run out of players.

It's easy to make decisions this way. It's not generally useful. In fact, I consider it The Curse of the Budbino.

In the next entry, I'll point to a clever, fairly simple set of metrics one of the brightest sabermetricians uses to try to predict playoff success.

Saturday, September 25, 2004

A Red Sox Lesson on Managing "Otherness":
The Vicarious Wakefield  

"Ninety percent of this game is half mental"
- Jim "The White Tony Oliva" Wohlford

There's a management bromide that declares a good manager can manage anything, a belief that management is content-independent. To some degree, it has a basis in fact; a good and experienced manager can manage related but different endeavors based on existing tools and a determination to learn new information and an openness to temporarily dispose of one's own habits and embrace others. It drives a manager to bend to a new environment, and perhaps bend the environment a little, too.

But there are limits to the capacity of managers to bend, and the Boston Red Sox coaching staff are facing an exemplary case of that right now that shows the limits of managers to bend to "otherness". That otherness is Tim Wakefield, on the surface, merely a cotter pin on the vehicle of their playoff hopes but in reality, a pretty important small element that might prevent a wheel (or more) falling off at a critical juncture.

"Otherness" comes in different flavors. Insofar as a manager actually can manage "anything", it helps if that thing has some common points of reference. Someone who can handle a warehouse, its staff and operations and paperwork, and inventory objects can generally manage those if you change, for example, the objects from electronics to furniture, or from toys to drugs. Hire new staff and move to a new facility, and she can accommodate that otherness. Change the support technology (forklifts, wagons, software, racks), and a good manager will learn the new (at least) well enough to get by. But take the greatest CFO in the world, and thrust him into this role, and he might or might not succeed. The otherness of people-supervised and systems one uses as tools and the rhythm of the work might just overwhelm even the best financial manager. Worse, under stress, people tend to revert to the tried-and-true methods that have brought them success in the past, reducing how well they can bend to accommodate to the otherness.

But the key element that limits the ability of a good manager to manage "anything" is coaching. Coaching is a mandatory element in any managerial assignment that has reporting staff. And while you can coach what you can't do all that well yourself, succeeding at it takes a lifetime of effort or the kind of luck that wins lottery grand prizes.

That limit is proving a hard nut for the Red Sox in their pursuit of success in the playoffs.


The limit is the recent swoon of the Red Sox' putative #3 starting pitcher, knuckleballer Tim Wakefield.

In playoffs, one competitive edge is to have a "stopper", a monster of a starter who both has great stuff & nerves of steel. Someone like Randy Johnson or Kevin Brown or Curt Schilling, a pitcher who even if beaten on a specific day, you know plays pretty consistently at a high level and who you can be confident gave you their best on that day.

But the best situation to have is three pitchers who are all reliable and get you into the 6th or 7th inning while either dominating or keeping it close enough that a rally can get you back into the game.

If you have that troika, it means you can be creative and aggressive with bullpen matchups because you have a little slack in preserving your bullpen for the next game. It also means that in a seven-game series you won't be rolling out as a starter some designated victim who is likely to be overmatched, because your troika can usually start all the games. The Red Sox have two outstanding starters in Curt Schilling (playoff record in 87 innings is 5-1, 1.86 ERA, 91 K and a minuscule 73 baserunners allowed), and Pedro Martínez (53 innings, 4-1, 3.10 ERA, 54 strikeouts, 53 baserunners allowed).

But their third "best", by tradition, would be Wakefield, a veteran who has "been there before". But Wakefield, after having a perfectly fine #3 starter season through the end of August (11-7 record, team record of 14-11 in games he started), turned into a pumpkin in September, a bit premature for Halloween one might note. His gruesome September (in reverse chronological order) looks like this:

Date    Opponent  Score  Dec   IP  H  R  ER  HR  BB  K    W   L     IP   ERA   BAA
Sep 20  BAL       L 6-9  L    4.1  5  8   7   1   5  7   11  10  176.0  4.96  .251
Sep 15  TAM       W 8-6  -    5.0  6  4   4   0   3  2   11   9  171.2  4.72  .251
Sep 9   @ SEA     L 1-7  L    4.2  7  7   2   1   3  2   11   9  166.2  4.64  .251
Sep 4   TEX       L 6-8  L    6.0  8  8   8   2   2  4   11   8  162.0  4.67  .250
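To put one number on that month, the four starts can be rolled up into a September-only ERA. The sketch assumes the usual box-score convention that ".1" and ".2" innings mean one-third and two-thirds of an inning; the helper name is mine.

```python
def innings(ip: str) -> float:
    """Convert box-score innings ('4.1' = 4 1/3, '4.2' = 4 2/3) to a number."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


# (date, innings pitched, runs, earned runs) from the game log above.
starts = [
    ("Sep 20", "4.1", 8, 7),
    ("Sep 15", "5.0", 4, 4),
    ("Sep 9", "4.2", 7, 2),
    ("Sep 4", "6.0", 8, 8),
]

total_ip = sum(innings(ip) for _, ip, _, _ in starts)
total_runs = sum(runs for _, _, runs, _ in starts)
total_er = sum(er for _, _, _, er in starts)

era = 9 * total_er / total_ip          # earned runs per nine innings
runs_per_nine = 9 * total_runs / total_ip

print(f"{total_ip:.1f} IP, ERA {era:.2f}, runs per nine {runs_per_nine:.2f}")
```

That works out to 21 earned runs in 20 innings, an ERA of 9.45, and more than 12 runs per nine innings once the unearned ones are counted.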

Except for Texas, these teams are not offensive powerhouses. The team needs Wakefield to snap out of it. It's not necessarily fatal if he doesn't, but to have the best chance in the month ahead, they need all the help they can get. Benching Wakefield means putting (probably) their #3 hopes on Bronson Arroyo, a promising young pitcher with a good 2004 but with some extreme splits a manager would need to cover for (he's a much lesser talent when pitching in Fenway, and while he's very good against right-handed hitters, he's C- against lefties, so most opposing managers can stack a lineup with lefties against him).

Why is Wakefield struggling? ¿Is it because, as Jim Wohlford might say, 90% of pitching is half-mental? Nobody on staff can point a finger with any confidence, because Wakefield's "otherness" is too powerful. As a knuckleballer, he's a staffer who's hard to coach: pitching coaches usually have no experience successfully throwing a knuckleball, an unusual pitch that the vast majority of pitchers don't throw because it's so different from other pitches.


The first tack most pitching coaches and other advisors take with solving a pitcher's bad-stuff problems is the dreaded "mechanics". I think a player messing up mechanics is a frequent cause, but I also believe "mechanics" is a placeholder word many use when they have no idea what the problem is.

But in baseball it's the first place a coach will look because it's something there are tools to analyze and, as a rule, something a coach could have an effect on. A brief search engine attack turned up Johan Santana, Sir Sidney Ponson, Matt Morris, Victor "Not The Entertaining" Zambrano, Jeff Weaver, and Bartolo "Are You Gonna Finish That?" Colón as pitchers whose mechanics problems were addressed this year when things started to turn sour for them over an extended period. In all those cases, though, the pitching coaches who worked with them had themselves thrown those pitchers' main pitches. Each, therefore, had personal insight into the physical and mental sequence required to be successful with that pitch.

The knuckler, though, is Other. Sure, about every pitcher (and non-pitcher) in the game has messed with one occasionally in warm-ups, but it is mastered by a few, and a small percentage of the people who master it get to use it in the bigs (for the change management and internal political issues around the game's prevention of knuckleballers, I recommend Jim Bouton's fun & informative memoir, Ball Four). Furthermore, the variation in the way individual knuckleballers throw the pitch when they're being successful is much higher than it is with the more common pitch choices. And because virtually all knuckleballers count on that pitch as their main or only pitch, they can't fall back on another when it's not going well.

But the worst constraint on a struggling knuckleballer is that the pitching coach and other team advisors are unlikely to help the butterfly artist in the way the coach could help Ponson, Morris, Colón, et al., because the pitching coach never mastered its feel himself. And it's logical that a pitching coach who was a career knuckleballer would almost never be able to cut the mustard helping more standard hurlers, because if the coach had mastered a normal three-pitch toolbox, the coach wouldn't have fallen back on the knuckleball, a last resort.


So that makes any prospective coach or other helper fall back on the mental approach -- positive thinking, a pat on the fanny, a few words of confidence. Phil Niekro once said knuckleballers don't think like other pitchers, and that makes some sense, because the central organizing principle of a knuckleballer's work is the opposite of any other pitcher's. The standard pitcher's success depends on spin/rotation -- even most 98 mph fast balls that don't have a tail or cut or some x or y axis movement beyond that applied by gravity are hittable once timed by the hitter. Obversely, the successful knuckleballer relies on the absence of what all others rely on the presence of (spin). The standard pitcher is trying to throw predictably, the knuckleballer unpredictably (well, not trying...there's just no way to know which movement the good pitch is going to have once it leaves the hand).

It's not surprising, then, we haven't seen news stories about some Boston organization coach looking at Wakefield's mechanics. Even a consigliere from outside the organization, the indomitable Charlie Hough, a long-lived knuckleball specialist himself, doesn't try to counsel Wakefield on his mechanics. In this recent Boston Herald article (courtesy Baseball Think Factory's News Blog discussion), about Hough's view, the essence was:

``Remember, it's hard to win with this (knuckleball). He's just fallen into some bad habits, but they really aren't much of anything.

``It's just confidence. That's all he needs to get going.''


Red Sox pitching coach Dave Wallace, who was in the Los Angeles Dodgers organization with Hough during the 1980s, asked the knuckleballer, who lives in Anaheim, to stop by and meet with Wakefield at Fenway.

``More than anything I just came by to offer some support for the things he knows,'' Hough said. ``I've known him since he was in Double-A or Triple-A with the Pirates and I know he has a lot of ability. And I know he has the ability to repeat (his success).''

Even one of the more successful practitioners doesn't want to try to analyze and coach what he can't understand.


Have you known a manager who could coach what he doesn't do well? It's possible, but when an experienced employee is stumbling or in a funk or a fallow period, the less the manager knows about the content, the fewer tools she'll have to choose from and the more likely she'll need to rely on confidence-building or "mental" approaches. For the recipient, that technique is undermined if the report knows she knows more than the manager (and trust me, when these situations occur, the reports usually know they know more). And if that mental aspect is not the primary cause of the slump, it most likely won't work. It doesn't make it impossible for that manager to do a decent job if other things fall very well -- just a challenge that should be attempted only in the most dire of situations.

If a manager's job has direct reports, even the manager who can manage "anything" is going to face a big challenge without decent craft knowledge or a passionate commitment to learning it.

If he doesn't, it's like hitting a knuckleball. And as Charlie Lau, the Royals' legendary hitting coach once said: "There are two theories on hitting the knuckleball. Unfortunately, neither of them works".

Wednesday, September 22, 2004

A Minnesota Twins Knowledge Management Lesson:
Johan Santana Meets Supernatural Carlos Santana  

All organizations, baseball or not, get virtuoso performances out of contributors. Successful ones nurture the achiever, but to continue benefiting from the virtuosity, to really make the most out of it, they need to come to understand how the contributor achieved that performance so they can help her and others recreate those achievements.

That process, analysing the basis for success (and failure) and then cloning it is the essence of knowledge management (long definition here, with some background here).

A lot of metrics and analysis is flat. Let me use a baseball example based on the American League's best pitcher this year, the Minnesota Twins' Johan Santana. His virtuosity is remarkable, almost like that of his American namesake Carlos Santana, he of the transcendent, always recognizable riffs (Isn't it amazing that even if he's just backing up someone who plays a completely different kind of music, you can always pick him out? I've been trying to master just one of his leads for over two years now, and I'm still trying to figure out a few of the transitions which seem to defy the limits of five fingers and a Fender).

You can get a ton of data on Johan Santana. There's the really simple, flat metric presentation (courtesy of MLB.COM):

2004 15 6 3.06 28 28 1 1 0 0 188.0 136 68 64 24 47 213
Career 38 18 3.68 145 69 1 1 1 1 584.1 499 254 239 65 213 611

This tells you what he's done on a seasonal level, like the usual accounting statements non-profits and business put together. You can tell he's having a very successful year within a short but successful career. But it isn't actionable. You can't expect success by telling Johan or his fellow Twins hurlers to please go out and earn an ERA of 3.06 & strike out a lot more guys than they walk and expect to infuse success into the group.

Better, there's the sophisticated annual historical record (courtesy of Bigleagueplayers):

Last 3 years Team G GS W L IP H R ER HR BB K ERA WHIP BAA
2002 MIN 27 14 8 6 108.1 84 41 36 7 49 137 2.99 1.23 .212
2003 MIN 45 18 12 3 158.1 127 56 54 17 47 169 3.07 1.10 .216
2004 MIN 29 29 16 6 195.0 137 68 64 24 48 224 2.95 0.95 .196
Career 146 70 39 18 591.1 500 254 239 65 214 622 3.64 1.21 .228

This adds some interesting rate (quality) measures at the end: WHIP (walks plus hits allowed per inning; anything under 1.3 is good, and anything under 1.0 is totally superb) and BAA (the composite batting average achieved against him). More illuminating, more specific (you now have a glimmering into why he's won so many games: batters don't hit much against him, he strikes out more than a batter per inning, and his walks must be relatively low because his WHIP is under 1). Again, not actionable in a significant way.
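For readers who want to see where those rate numbers come from, here is a minimal Python sketch that rebuilds ERA, WHIP and strikeouts per inning from the counting stats in the table above. The function names are my own, and it assumes the usual box-score convention that the digit after the decimal in IP counts thirds of an inning (108.1 means 108 and 1/3).

```python
def innings_to_float(ip: str) -> float:
    """Convert box-score innings such as '108.1' (108 and 1/3) to a float."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + int(thirds or 0) / 3


def whip(hits: int, walks: int, ip: str) -> float:
    """Walks plus hits allowed per inning pitched."""
    return (hits + walks) / innings_to_float(ip)


def era(earned_runs: int, ip: str) -> float:
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_to_float(ip)


# (IP, H, ER, BB, K) copied from the three season rows in the table above
seasons = {
    2002: ("108.1", 84, 36, 49, 137),
    2003: ("158.1", 127, 54, 47, 169),
    2004: ("195.0", 137, 64, 48, 224),
}

for year, (ip, h, er, bb, k) in seasons.items():
    k_per_inning = k / innings_to_float(ip)
    print(year,
          f"ERA {era(er, ip):.2f}",
          f"WHIP {whip(h, bb, ip):.2f}",
          f"K/IP {k_per_inning:.2f}")
```

Run against the 2004 row, this should land on the same 2.95 ERA and 0.95 WHIP the table shows, which is the point: the rate stats are just arithmetic on the counting stats, so they describe the result without explaining how he got it.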

Better yet, there are the really fine "split stats" (here courtesy of Bigleaguers again) that lump performances into various components to see if there are specific strengths and weaknesses in the overall performance.

Total 32 32 19 6 0 1 1 217.0 151 68 64 24 49 254 2.65 0.92 .195
vs. Left 32 0 0 0 0 0 0 53.1 37 - - 5 8 52 - 0.84 .195
vs. Right 32 0 0 0 0 0 0 163.2 114 - - 19 41 202 - 0.95 .195
Home 20 20 11 4 0 1 1 137.1 95 42 40 14 31 167 2.62 0.92 .194
Away 12 12 8 2 0 0 0 79.2 56 26 24 10 18 87 2.71 0.93 .196
Day 17 17 9 5 0 0 0 117.1 83 38 36 13 34 138 2.76 1.00 .199
Night 15 15 10 1 0 1 1 99.2 68 30 28 11 15 116 2.53 0.83 .189
Grass 10 10 7 1 0 0 0 66.1 45 19 18 9 16 74 2.44 0.92 .190
Turf 22 22 12 5 0 1 1 150.2 106 49 46 15 33 180 2.75 0.92 .197
Indoors 21 21 12 4 0 1 1 145.1 98 44 42 15 31 174 2.60 0.89 .190
Outdoors 11 11 7 2 0 0 0 71.2 53 24 22 9 18 80 2.76 0.99 .204
April 5 5 1 0 0 0 0 28.1 30 18 17 5 8 24 5.40 1.34 .273
May 6 6 1 3 0 0 0 32.2 42 22 21 6 11 30 5.79 1.62 .313
June 5 5 4 1 0 0 0 37.2 21 10 10 5 6 46 2.39 0.72 .160
July 6 6 3 2 0 1 1 46.0 14 7 6 4 15 61 1.17 0.63 .095
August 6 6 6 0 0 0 0 43.1 29 11 10 4 7 52 2.08 0.83 .188
September 4 4 4 0 0 0 0 29.0 15 0 0 0 2 41 0.00 0.59 .150

I highlighted in Pepto-Bismol pink some junk results in here -- numbers that should not have been presented because they can't tell you anything. There are almost no turf parks left...the Twins' park is one of them, so all the turf split can tell you is how he pitched at home (already presented) along with a few crumbs of how he pitched elsewhere.

But here is some actionable information if you were looking to deconstruct Santana's strengths & weaknesses to try to help him amplify the good and dampen the not-so-good. I'm going to use BAA (batting average hitters achieve against him) as the bellwether for this. If you can believe the splits, his performance is extraordinary. BAA by left-handed hitters is .195, and by right-handers .195. He doesn't just crush lefties (many left-handed pitchers can do this); he is equally transcendent against righties. Okay. ¿What else?

Home and away. Many pitchers learn to take advantage of their home park. But again, he's essentially the same home and on the road, allowing .194 and .196 BAA. This tells you he's either so overwhelming that it doesn't matter where he's pitching or alternately, he's figured out how to use almost any park he's in to his advantage (or both at the same time).

You can also see, if you look at the last splits, by month, that his season started with a mediocre April and an Ipecac-grade May, with consistent excellence since. Was he injured? Did he learn something? Does he do poorly in cold weather? Probably not the latter, since his smallish-sample September is a cool-weather month, and he looks veritably Nazgulish (uh, 41 strikeouts & 2 free passes in 29 innings). But this is a data set that leads us to a better understanding of the components of the success, as well as a set of questions with which to follow up.

But again, this is more about Santana's excellence than what he does to be excellent. How does he achieve that kind of performance, and more importantly, what is he doing that others might emulate?


To find that out, you have to break it down to its unit level. If you're an analyst, you need to examine what he does in a game, inning-by-inning. You probably can go down as far as individual pitches and no farther. In non-baseball work, you'll have to pick your own level, but start small and get your hands dirty in data instead of just looking at bigger pictures, because you might see patterns in individual-event data you might otherwise miss.

I found Seth Stohs' website thanks to Aaron's Baseball Blog. Stohs has masterfully analysed Johan Santana's most recent pitching performance, and although he's blended in some gushy star-eyed fannish superlatives, I'll snip those puppies out in the interest of the stodgy, academic flatness of prose for which I'm known. He's analysed it by looking at the game the way a competitor would, pitch by pitch. He answers the questions:

  • What kinds of pitches did Santana throw?
  • At what counts did he throw which pitches?
  • What effect did he get from each?

Here's some Seth showing how really good analysts present data and information both (rasty formatting inherited...I'll try to clean some of it up, but be warned, Seth's intelligence is a lot higher than his HTML skill) with my comments in bold:

One of the things that people define "Ace" with is a guy who, when the team needs a win, gets the win. Sunday, the Twins didn't really need the win, but obviously the team would prefer to go into Chicago on a positive note. I can't imagine a more positive note... Yesterday was as dominant a pitching performance as I have seen. Here are the basic numbers:

Pitcher         IP   H   R  ER  BB  SO
Johan Santana  8.0   7   0   0   0  14

Incredible. Impressive. Amazing... Enough superlatives? The 14 strikeouts was his career high. He completely shut down and baffled a hot Orioles team. His scoreless inning streak increased to 30 (snip). It was his fourth straight start in which he didn't allow a run. It was his 12th consecutive start where he got a win. (snip)

Let's dive into Johan's pitching performance yesterday. (snip) I charted each of his 103 pitches and noted the type of pitch, whether it was a ball or strike and what the speed of the pitch was. Again, the speed of the pitch comes from what showed on Fox Sports Net. Here are just some interesting things to note from the game.

Of the 103 pitches that Santana threw, 80 of them (78%) were strikes. 67% is generally considered good. I don't know if I've seen this impressive a percentage before. [he looks at a basic indicator and then puts it into context for others]

Here is a breakdown of the type of pitch that Santana threw, (snip).

Fastball ........ 57 (55.3%)
Change Up ....... 21 (20.4%)
Curveball ....... 25 (24.3%)

[Note, he lumped two smaller categories together, Curve and slider, either because he couldn't tell the difference, or because alone either was too small to take into consideration.]

Here are the number of pitches he threw each inning and the type of pitch:
1st inning - 13 pitches (7 fastball, 3 curveball, 3 changeup)
2nd inning - 13 pitches (7 fastball, 4 curveball, 2 changeup)
3rd inning - 14 pitches (8 fastball, 2 curveball, 4 changeup)
4th inning - 12 pitches (5 fastball, 4 curveball, 3 changeup)
5th inning - 15 pitches (10 fastball, 2 curveball, 3 changeup)
6th inning - 11 pitches (6 fastball, 2 curveball, 3 changeup)
7th inning - 15 pitches (9 fastball, 4 curveball, 2 changeup)
8th inning - 10 pitches (5 fastball, 4 curveball, 1 changeup)
Total ......103 pitches (57 fastball, 25 curveball, 21 changeup)

[Good breakdown here. The average # of pitches it takes a pitcher to get through an inning is around 15-16, and as you look through this table, you can see the inning he labored the most was 15 throws. You can also see he kept mixing his pitches up. More detail needed, but Seth will give that to us; this is a good foundation.]
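If you want to check a chart like Seth's (or build your own for a non-baseball process), a few lines of Python will do it. This is only a sketch: the per-inning counts are copied from his table, the 80-strike figure is his, and the variable names are mine.

```python
# (fastball, curveball, changeup) per inning, copied from Stohs' chart
by_inning = [
    (7, 3, 3), (7, 4, 2), (8, 2, 4), (5, 4, 3),
    (10, 2, 3), (6, 2, 3), (9, 4, 2), (5, 4, 1),
]
names = ("fastball", "curveball", "changeup")

per_inning = [sum(row) for row in by_inning]       # pitches thrown each inning
totals = [sum(col) for col in zip(*by_inning)]     # game total per pitch type
total_pitches = sum(per_inning)

# Share of each pitch type, in percent, rounded to one decimal
mix = {name: round(100 * n / total_pitches, 1) for name, n in zip(names, totals)}

# Strike percentage, using the 80 strikes Stohs reports
strike_pct = round(100 * 80 / total_pitches)

print("total pitches:", total_pitches)
print("heaviest inning:", max(per_inning), "pitches")
print("mix:", mix)
print("strike %:", strike_pct)
```

The totals reconcile with his summary line (103 pitches; 57, 25 and 21 by type), which is a quick sanity check worth running on any hand-charted data before you start drawing conclusions from it.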

It was interesting to me that Santana seemed to be stronger as the game went on. [Good analysts insert their opinions, presented as opinions and not as capital T Truth, as well as just crunching numbers, as Seth does here] His best, most dominant inning may have been when he struck out the side on 10 pitches in the 8th inning. [Good analysts point out highlights] Santana was consistent with his fastball throughout, but check out the velocity of the pitches he threw by inning (Note - please recall that I did not differentiate between a curveball and a slider. I think Santana threw more sliders late in the game):

Inning       Fastball  Curveball  Changeup
1st inning     91.7      81.3       77.3
2nd inning     90.6      78.5       77.0
3rd inning     91.6      80.0       79.3
4th inning     92.4      82.8       76.7
5th inning     92.6      82.0       77.3
6th inning     92.5      85.0       77.7
7th inning     92.9      80.5       83.5
8th inning     93.0      84.3       79.0

[Seth doesn't give you conclusions on this, but it's presented so well that other analysts, in this case, Yours Turley, can provide some insight. Santana's fastball and changeup were both faster at the end of the game than at the beginning. He was throwing easy, not his hardest, through the early going. As the game wore on, he either started tiring and throwing harder to compensate, or intentionally sped things up to put the hitters' timing off.]

Did Santana alter the pitches he threw each time through the batting order? The O's had four hits the first time through the order. They had just one the second time through and two hits the third time through the order. Santana struck out the top two hitters in the Orioles lineup, Brian Roberts and Tim Raines, in their 4th plate appearances. [Seth knows but isn't telling, because he believes most readers will already know, that hitters tend to perform better against pitchers the 3rd and 4th times they face them in a game: the batter has already seen what the pitcher has to throw today and can better guess which pitch is coming as well as time it better. If a pitcher Ks a batter in a 4th appearance, the hurler either has so much mojo that day, or is so unpredictable, that the hitter is helpless in the face of that quality.]

Time through      FB   FB%    CB   CB%    CU   CU%   Total
1st               19  57.6%    8  24.2%    6  18.2%    33
2nd               18  52.9%    7  20.6%    9  26.5%    34
3rd               16  55.2%    8  27.6%    5  17.2%    29
4th (2 batters)    4  57.1%    2  28.6%    1  14.3%     7

So what does this show? (snip) Putting all of this together, it really just verifies the fact that Santana is willing to throw any pitch at any time. He will throw the fastball a little more than half the time, and the rest of the time, he picks between the curveball and changeup, and all three pitches are incredible.

Santana was so dominant yesterday. He only had one 3 ball count. Actually, he had just three 2 ball counts the whole game. Just again to illustrate how unpredictable Santana is, take a look at the pitches he threw on each count: [FB = fastball, CB = curve or slider, CU = change up]

Count   FB  CB  CU
0-0     17  10   2
0-1     12   5   2
0-2      7   2   6
1-0      4   1   2
1-1      8   2   1
1-2      8   2   7
2-1      1   0   0
2-2      0   2   0
3-2      0   1   0

It is interesting to me to see that Santana does seem to throw more breaking balls with two strikes. Of the 14 strikeouts, seven came with the changeup, four with a curveball and three with the fastball. So, what is his strikeout pitch? Any of the three. [Great stuff. Conclusion supported by data. Note, on the one 3-2 count he got to all game, Santana had the unmitigated gall to throw a curve...an outside-the-Bachs move if there ever was one] [Seth didn't tell you one of the most important things, which perhaps he overlooked, but again, his presentation was so thorough, no plaque, no junk, that it jumps right out for other analysts. I'll get to this in my next paragraph]

Stohs doesn't point out that there are two missing counts here, 2-0 and 3-1. That's really important. Because those are THE two Red Meat hitter's counts, the counts on which batters have statistically the best chances for success (intermediate and beginning fans, if your team has runners on base and the count gets to either 2-0 or 3-1, this is a time to start rhythmic clapping, even if the P.A. system operator doesn't know to put on the claptrack). Hitters try to work a pitcher to get to a 2-0 or 3-1 count because it means the pitcher is overwhelmingly likely to throw a fastball (for most hitters, the easiest, relatively, to hit).
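Spotting what is absent from a table is easy to automate. Here is a small Python sketch that lists every count a plate appearance can pass through and subtracts the ones that showed up in the chart. One loud assumption: the set of observed counts below is my reading of the nine rows (everything except 2-0, 3-1 and, since Santana reached only one three-ball count all game, 3-0); it is not something Seth states outright.

```python
from itertools import product

# Every count a plate appearance can pass through: 0-3 balls, 0-2 strikes.
all_counts = {f"{balls}-{strikes}" for balls, strikes in product(range(4), range(3))}

# ASSUMPTION: my reading of the nine counts that appear in the chart.
observed = {"0-0", "0-1", "0-2", "1-0", "1-1", "1-2", "2-1", "2-2", "3-2"}

# The two classic hitter's counts discussed above.
hitters_counts = {"2-0", "3-1"}

never_reached = sorted(all_counts - observed)
print("counts never reached:", never_reached)
print("both hitter's counts avoided:", hitters_counts <= set(never_reached))
```

The same trick works off the field: enumerate the states a process could be in, subtract the ones your data shows, and look hard at what is left over. Sometimes the absence is the finding.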


This is a classic model of analysis and of presenting the data behind it. Regardless of what line of work you're in, if you want to communicate both to dedicated peer analysts and to more raw outside observers, slipstreaming Stohs' style here is of great value. Some of the key elements:

  • Present data, starting with the big picture
  • Drill down and present what's significant
  • Provide conclusions clearly labeled as such

This is actionable. This is the kind of detail that makes knowledge management possible, and that, in turn, makes future success, both for the individual contributor and the team as a whole, more likely.

In the next entry, I'll show you a recent illustration of the Anti-Stohs, a big ugly failure of metrics, and how to avoid such fact-plaque.
