Thursday, September 25, 2008

Corporate Cargo Cults:
Bruce Jenkins, The Duke of Moral Hazard &
His Young Pitcher Slaughterhouse  

These are the dark ages of pitching. It is a time of cowardice and fear, oblivious to the lessons of history. If there's a bond among starting pitchers of the pitch-count era, it's that they were born too late. One of life's great truisms is to finish what you start.

It's what you tell your kids, your surgeon, your contractor. This once applied to baseball, with precision, but now there's a new law: Just quit. Let somebody else finish the job. You did your part, now go be a cheerleader.

Pitch counts have destroyed not only the elements of pride and accomplishment among starting pitchers, but the art of winning. If one thing characterized the great pitchers of the past, from Bob Feller to Warren Spahn to Tom Seaver, it's that they learned how to win.

You don't get that from a "quality start" and a nice, early shower. It's when you understand the difference between a breezy sixth inning and a stressful ninth, when you brought that victory home, and can't wait to do it again. – Bruce Jenkins

Every line of work has its own cargo cults, a set of energetically-pursued methods based on a mass delusion. When they get institutionalised as S.O.P., they can be really scary. When San Francisco Chronicle sports columnist Bruce Jenkins campaigns for Baseball’s most resilient Cargo Cult, it’s downright Resident Evil: Apocalypse scary (truly heart-pounding but at the same time so patently ridiculous you can’t believe anyone would bother wasting their craft on it).

We’ll get back to Jenkins and baseball in a minute, but first a little more about Cargo Cults in general.


The best-known anthropological example of the Cargo Cult is a set of post-WWII Pacific islander religions.

During the war, outsiders had built airfields to land or transfer supplies to support the war, and some of the material goods got skimmed off for the residents. The war ended, the shipments stopped coming through.

That ended the goodies. The residents wanted the goodies back. Having noticed that air transports followed airfields, they presumed that airfields generated cargo planes, so they built rudimentary landing strips they assumed would generate the traffic that provided the material goods.

Like native cults, Corporate Cargo Cults are rarely fabricated out of nothing – they’re usually based on an historical success or failure of remarkable proportion.

    * “Less regulation is always better than the amount we currently have”

    * “No one ever lost their job buying Microsoft system software”

    * “Real estate is the one investment you can never lose money in”

“Leadership” tends to institutionalize whatever response seemed to work in the crisis moment even if the causality was questionable.

Later, when the crisis has passed, people forget the context of the response and merely repeat their past behavior.

As a species, we generally benefit from automatic responses (not rethinking from scratch every response to every stimulus). So, it’s easy to take a general tendency and assume that it’s going to work in all cases, regardless of context. That’s the easiest path to walk hard.

Worse than the absurdity of the Cargo Cult, one of its critical attributes is that it actually undermines the chances for organizational success.


There aren’t many Cargo Cults in Baseball – there are superstitions and irrelevant rituals, to be sure (a pitcher not stepping on the line when walking from mound to dugout after a half-inning, or a player not changing undershirt during a hot streak). But such quirks rarely affect performance for better or worse.

Baseball, unlike Banking or Real Estate or most finance or service endeavors, is relentlessly focused on performance right now and how actions affect performance in the future, and Baseball management relentlessly focuses on facts to fine-tune behaviors. In Baseball, unlike corporate or academic or non-profit arenas, decisions are almost always measured, and decision-makers almost always held accountable. So in Baseball, Cargo Cult adherence tends to take one out of the pool with pretty good certainty and usually with some whoop-axe alacrity, too.

Baseball’s worst Cargo Cult has been on the wane for about a decade, the religion built around the value/virtue of starting pitchers throwing complete games (sometimes also the value of having a pitcher be a 20-game winner, or throwing 250 innings for a season).

The practice was changed because of a belief, substantiated heavily as the general case, that heavy use of a pitcher leads to more injuries and, in many cases, prematurely ends more careers than more regulated use does. More scientific front offices and pitching coaches followed the vanguard of data-equipped, fact-based leaders like the Oakland Athletics’ and New York Mets’ Rick Peterson and started rebuilding the protocols of when to pull a starter, especially a young one (the effects of wear vary by many factors, but most markedly by the experience, or functional age, of the pitcher).

And as the adherence to the practice has been on the wane, the Cargo Cult’s diminishing number of adherents seem, as the hangers-on to any shrinking ideology or religion do, more strident and entrenched, more sure they are right. Instead of being about team wins, the argument sounds more like an exhortation to some moral imperative, as though pitching 7 strong innings and yielding the mound to a fresh arm was un-manly (cue The Four Feathers).

Bruce Jenkins, one of the more insightful and readable columnists around, has been on a Starters Should Pitch Workloads Like They Did Back In The Good Old Days (BITGOD) crusade for a while, and his recent masterwork was a pair o' big features (August 26 - 27) about the lost art of the complete game.

Fun, interesting, well-written, and completely eviscerated…not by examples he omitted, but by the very examples he damningly included. His citations generally make the case AGAINST using pitchers more often and for longer outings.

I recommend you read the whole piece, but here are what I consider the essential points (after the exhortationalistic epigramme at the top of this entry).

    * Pitchers from previous eras consider regulating starters’ outing length by pitch count absurd.

    * By definition, most relievers are generally not as good as starters (if they were as good, they’d be starters).

    * Pulling a starter and replacing him in the late innings with a lesser pitcher undermines team accomplishments.

    * Back in the 1950’s, 1960’s and 1970’s, pitchers could go complete games because it was expected of them and they were conditioned to go the distance. The reason they don’t anymore is that they’re neither trained nor expected to. Jenkins: “If your job security depends on finishing a game - with 160 pitches, if that's what it takes - then you don't think twice about it, nor does your manager, general manager or owner. The act becomes as mundane as covering first base or laying down a bunt.”

    * Contemporary marquee starters such as Tim Lincecum, Carlos Zambrano, Dan Haren and Scott Kazmir should be notching complete games at the historical, not contemporary rate, but the new practice is depriving them of that opportunity, and perhaps they enjoy the game less, and perhaps should be respected less for their accomplishments as a result.

    * If teams not only expected more complete games, but moved from a five-man to a four-man rotation as Jenkins suggests (loading the expectation of 25% more outings per season for each starting pitcher), middle relievers would pitch less and those relievers are “pretty much a joke on many teams”.

    * Teams should launch a Counter-Reformation against the pitch count reforms and start training pitchers to go the distance. It will benefit the pitchers physically and emotionally and the teams with better potential performance, and injury rates won’t be much different from today’s pitch count-driven results.
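The rotation arithmetic behind the four-man proposal is easy to check. Here is a minimal sketch, assuming a 162-game schedule shared evenly among the starters with no off days, skipped turns or injuries (real rotations are messier). The function name is mine, purely for illustration.

```python
def starts_per_pitcher(rotation_size, games=162):
    """Starts each pitcher gets if turns are shared evenly (an idealization)."""
    return games / rotation_size


five_man = starts_per_pitcher(5)
four_man = starts_per_pitcher(4)

# Relative increase in outings for each starter when one rotation slot is dropped.
increase = (four_man - five_man) / five_man

print(f"five-man: {five_man:.1f} starts, four-man: {four_man:.1f} starts")
print(f"extra outings per starter: {increase:.0%}")
```

Going from one start in five to one start in four is a quarter more work per starter, before counting any extra innings per outing.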

There were some great examples of BITGOD workhorses: Robin Roberts, Warren Spahn, Juan Marichal, Tom Seaver, Bob Feller. And Nolan Ryan and Jim Kaat, both heavily-worked pitchers who pitched well into their 40s. A sidebar featured pitchers less known to fans who achieved great feats of game endurance, such as Joe Oeschger and Leon Cadore who both went the distance in a 26-inning game back in 1920, and Tom Cheney, who tossed a 16-inning, 228-pitch (thanks, 'Neck) complete game in 1962.

The reality of the pitchers he cites generally (not universally) makes a powerful counter-case to his religion.

Here’s why.

Some of those pitchers broke down because of their heroic workloads. All the others for whom we have Retrosheet data generally pitched late innings less effectively than the relievers who replaced them. That is, while a great starter’s composite performance was/is better than a merely decent reliever’s, a great starter’s late-game performance (the point at which a reliever would generally replace him) is inferior to that of a decent reliever from that team. We have, thanks to Retrosheet.Org and Baseball Reference, detailed game-by-game and seasonal statistics for teams’ starting pitcher outings from 1956 through the present. Where I have such statistical data, we’ll look at it.

In this entry, Part I, I’m going to offer some explanation of the pre-1956 careers of some of the workhorses Jenkins cites. In Part II, I’m going to go through all the pitchers from 1956 on whom Retrosheet and Baseball Reference document and whose efforts Jenkins uses to support his beliefs, and comment on each one.

METHOD NOTES: To compare starters with their relievers, I’m going to use ERA+ (pitcher’s ERA relative to the league average) as an indicator to determine a season that was “average” for that starter. So, for example, if Jim Kaat’s career composite ERA+ was 107 (seven percent better than the leagues he pitched in), I will find the full season he had with the ERA+ closest to 107.
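The season-picking rule can be sketched in a few lines of Python. The figures are Jack Chesbro's, copied from the Baseball-Reference table later in this entry and used here only to illustrate the rule. The 150-inning cutoff for what counts as a "full season" is my own assumption for illustration, not part of the method, and the function name is hypothetical.

```python
# (year, innings pitched, ERA+) for Jack Chesbro, from the table in this entry.
# Innings use Baseball-Reference notation, where .3 and .7 stand for thirds.
CHESBRO = [
    (1899, 149.0, 93), (1900, 215.7, 99), (1901, 287.7, 137),
    (1902, 286.3, 127), (1903, 324.7, 112), (1904, 454.7, 148),
    (1905, 303.3, 133), (1906, 325.0, 100), (1907, 206.0, 110),
    (1908, 288.7, 84), (1909, 55.7, 41),
]
CAREER_ERA_PLUS = 110


def representative_season(seasons, career_era_plus, min_innings=150.0):
    """Return the full season whose ERA+ is closest to the career mark.

    min_innings is an assumed threshold for a "full" season.
    """
    full = [s for s in seasons if s[1] >= min_innings]
    return min(full, key=lambda s: abs(s[2] - career_era_plus))


year, innings, era_plus = representative_season(CHESBRO, CAREER_ERA_PLUS)
print(year, era_plus)
```

Ties would go to the earlier season here, simply because `min` keeps the first match; a different tie-break is just as defensible.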

Then, I’m going to find the 2nd-best reliever on the starter’s team that year. I’ll choose based on a mix of science and art – I’ll generally look over a pair of relievers: the marquee reliever and then the best from the remainder of relievers who pitched serious innings. Of that pair, I’ll choose the second best performer.

I’m choosing the second best because unlike a Strat-O-Matic tussle, a real manager has imperfect knowledge of the statistical probabilities of each individual’s success or failure. The manager finds out about effectiveness in a range of relief situations through ongoing small-sample experiments over the season. So we need to saddle the manager with a good, not the best, choice to relieve the great starter, who is a more “known” contributor.

I’m going to use the stat OPS (On Base Plus Slugging Average) as an indicator of effectiveness, of what a pitcher gave up to opponents. I’m going to compare the 2nd-best reliever’s OPS to the OPS of the subject starter’s performance facing batters in batters’ 3rd and subsequent plate appearances.

To reiterate, it doesn’t matter to the team’s overall effectiveness whether a starter’s performance the first two times he faces batters is better than a reliever’s performance; a reliever almost never replaces a starter who’s cruising early in a game. But by the 3rd or 4th time in a game a starter is seeing the batter, the pitcher is more likely to have some fatigue, and the batter is more likely to be locked into the pitcher’s stuff.

I’m also going to refer to tOPS+, an index of what the pitcher surrendered in specific situations relative to his overall performance, where 100 is his own norm. As an example, if a pitcher overall gave up an OPS of .759, but gave up an OPS of .726 the first time he faced a batter in a game, he’d have a tOPS+ of 94 when facing batters the 1st time (better than his average).
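For the curious, here is a rough sketch of how a tOPS+ figure is put together. It assumes the formula as I understand it from Baseball-Reference's glossary: 100 times the sum of the split-to-total ratios for on-base percentage and slugging, minus one. Note that it needs the OBP and SLG components separately, so it can't be recovered exactly from OPS alone. The input numbers below are invented for illustration and belong to no real pitcher.

```python
def tops_plus(split_obp, split_slg, total_obp, total_slg):
    """Split performance relative to the pitcher's own overall line.

    Assumed formula: 100 * (split OBP / total OBP + split SLG / total SLG - 1).
    For a pitcher, under 100 means he allowed less than his norm in the split.
    """
    return 100 * (split_obp / total_obp + split_slg / total_slg - 1)


# Hypothetical pitcher: .330 OBP / .420 SLG allowed overall,
# .320 OBP / .400 SLG allowed the first time through the order.
first_time = tops_plus(0.320, 0.400, 0.330, 0.420)
print(round(first_time))
```

A split identical to the overall line comes out at exactly 100, which is what makes the index easy to read at a glance.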

The complexity of analyzing Jenkins’ proposed initiative is that his argument aims at one cause and then presumes a pair of different effects. The BITGODs aim for more complete games, and moral virtue issues aside, assume that pitcher injuries won’t increase. These, I’ll call the Injury and the Moral Rectitude issues.

He also believes that the teams would surrender fewer runs/win more games if good starters both worked longer and started more often (The BITGOD Prescription). Again, Jenkins suggests that if you replace reliever innings with incrementally more great starter innings, you get better composite team pitching performance. This, I’ll call the Team Performance issue.

I’m not going to try to take on the full scope of the injury question. I think it’s a major factor, but injury data is as hard to nail down as a gyrating sea cucumber addicted to crack and hosed down with extra virgin olive oil. And Retrosheet and (even the miraculous) Baseball Reference keep no stats on Moral Rectitude, so, for now at least, I’m only going to deal with it in this Part I, and only a little.

Let’s focus for now on Team Performance. ¿Would a team’s overall pitching benefit from The BITGOD Prescription? The question should be: “Is a fresh reliever worse than a great starter by batters' 3rd and later plate appearances against that starter?” The vast majority of starters are less effective in a batter’s 3rd and subsequent plate appearances than they were in that batter’s 1st.

  Times Facing Opp. in Game (Major League Composite, 2008)
  Split          OPS   tOPS+
  1st PA in G    .726   94
  2nd PA in G    .765  104
  3rd+ PA in G   .800  113

source: Baseball-Reference.Com

In 2008 to date, in general, starters as a whole are less effective in a batter’s 2nd plate appearance than in the batter’s 1st, and less effective still in the batter’s 3rd and subsequent appearances against them in the game. Okay, so that’s true IN GENERAL.

But the BITGODs aren’t suggesting human punching bags like the Orioles’ hapless Radicalhams Liz pitch complete games and go every fourth start; how about the great (mostly Hall of Fame) workhorses Jenkins cites? And how about the contemporary marquee starters he hopes will be the vanguard of the Counter-Reformation leading us all back to the Chef’s Salad Days of Infinite Virtue and World Series Hardware? We’ll take a quick look at Lincecum and his cohort, too, in Part II.

For the purpose of addressing the injury issues a little, let's look at the career trajectories of starters called out in a sidebar by Jenkins as heroic performers.

Jenkins writes:

In 1904, a 30-year-old Yankees pitcher named Jack Chesbro led the American League with 48 complete games.

Let’s take a quick look at Happy Jack Chesbro’s career.

  Year Ag Tm  Lg  W   L  G   GS CG   IP   ERA *lgERA *ERA+
  1899 25 PIT NL  6   9  19  17 15  149.0 4.11 3.81    93 
  1900 26 PIT NL 15  13  32  26 20  215.7 3.67 3.62    99 
  1901 27 PIT NL 21  10  36  28 26  287.7 2.38 3.25   137 
  1902 28 PIT NL 28   6  35  33 31  286.3 2.17 2.75   127 
  1903 29 NYY AL 21  15  40  36 33  324.7 2.77 3.11   112 
  1904 30 NYY AL 41  12  55  51 48  454.7 1.82 2.70   148 
  1905 31 NYY AL 19  15  41  38 24  303.3 2.20 2.91   133 
  1906 32 NYY AL 23  17  49  42 24  325.0 2.96 2.96   100 
  1907 33 NYY AL 10  10  30  25 17  206.0 2.53 2.79   110 
  1908 34 NYY AL 14  20  45  31 20  288.7 2.93 2.46    84 
  1909 35 TOT AL  0   5  10   5  2   55.7 6.14 2.52    41 
  11 Yr WL%.600  198-132 392 332 260 2896  2.68 2.96   110 
  162 Game Avg    18-12   36  31  24 272.0 2.68 2.96   110 
  source: Baseball-Reference.Com

Happy Jack was built up as a young pitcher to take increasing workloads, not unlike an early version of the BITGOD Prescription. Then, at age 30, he pitched 454 innings, going 41-12 for the Highlanders/Yankees. He had an ERA+ of 148 (an ERA 48% better than the league as a whole).

Don’t imagine for a second that even with the Manliness buildup, 454 innings and 48 complete games didn’t affect his career. The following year, he had 25% fewer starts (38) and while still an excellent pitcher, his superiority waned to an ERA+ of 133 (an ERA 33% better than the league). The year after that, increased starts and lesser performance yet, an ERA+ of 100 (that is, league average). The year after that, at age 33, his last useful year, he had significantly fewer starts and an ERA+ of 110. Then he became Crappy Jack for the next two years and was out of the majors.
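The ERA+ figures and the drop in starts can be checked against the table's own columns. This is a sketch assuming ERA+ is simply 100 times the listed league ERA divided by the pitcher's ERA. Because both inputs are rounded to two decimals, the result can land a point away from the listed value.

```python
# (year, games started, ERA, league ERA, listed ERA+) from the Chesbro table.
ROWS = [
    (1903, 36, 2.77, 3.11, 112),
    (1904, 51, 1.82, 2.70, 148),
    (1905, 38, 2.20, 2.91, 133),
    (1906, 42, 2.96, 2.96, 100),
    (1907, 25, 2.53, 2.79, 110),
]


def era_plus(era, league_era):
    """League ERA relative to the pitcher's ERA, scaled so 100 is average."""
    return 100 * league_era / era


for year, starts, era, lg_era, listed in ROWS:
    print(year, starts, round(era_plus(era, lg_era)), listed)

# Change in starts from the 1904 peak to 1905.
starts_1904, starts_1905 = ROWS[1][1], ROWS[2][1]
drop = (starts_1904 - starts_1905) / starts_1904
print(f"1904 to 1905 change in starts: -{drop:.0%}")
```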

Chesbro is in the Hall of Fame, by the way, and that supports Jenkins a tad, though to be fair, Chesbro’s installation is all about his durability over a medium length career and what he did in that single 1904 season.

It looks like his magnificent record year took a toll and he was never the same pitcher again. We can’t prove it was cause=workload and effect=diminished performance. But since Jenkins cited his heroism as something Lincecum should be admiring, it’s worth looking at what Chesbro’s career trajectory was and asking the question “Is that a good career trajectory for Lincecum?”

Jenkins writes:

In a 16-inning, complete-game win against Baltimore in 1962, Washington's Tom Cheney threw 228 pitches.

Here’s Cheney’s career (nice Jeff Merron narrative on him):

  Year Ag Tm Lg   W L  G  GS   IP   ERA *lgERA *ERA+ 
  1957 22 STL NL  0 1   4  3   9.0  5.00 4.00   80 
  1959 24 STL NL  0 1  11  2  11.7  6.94 4.22   61 
  1960 25 PIT NL  2 2  11  8  52.0  3.98 3.76   95 
  1961 26 TOT     1 3  11  7  29.7 10.01 3.90   39 
  1962 27 WSA AL  7 9  37 23 173.3  3.17 4.05  128 
  1963 28 WSA AL  8 9  23 21 136.3  2.71 3.74  138 
  1964 29 WSA AL  1 3  15  6  48.7  3.70 3.70  100 
  1966 31 WSA AL  0 1   3  1   5.3  5.06 3.47   69 
  8 Yr WL% .396 19 29 115 71 466.0  3.77 3.88  103 
  162 Game Avg   6 10  42 25 170.3  3.77 3.88  103 
  source: Baseball-Reference.Com

By the way, after that magnificent September 12 start, he came back six days later and couldn’t get out of the 4th inning, and then rested for twelve days, and pitched a very good start.

In Cheney’s next season, 1963, he started with four complete games, all great performances, on as little as three days rest. And at age 28, that was basically the end of his starting career. The rest of the season he had 17 starts, going 4-9 and performing around the league average. He broke down in August. While his season ERA+ was his best, if you subtract those first four starts, he was average.

And he was never a regular major league starter again.

Cheney (while he will always have a place in my heart for graciously giving me the second ballplayer autograph I ever got, a swell childhood memory) is not a support for the heroic complete game as a career-builder.

Jenkins writes:

-- New York Giants pitcher Joe McGinnity, known as "Iron Man," didn't start pitching in the major leagues until he was 28. Five times, he pitched both ends of a doubleheader. He worked an astounding 434 innings in the 1903 season, and over his 10-year career racked up 247 wins and 314 complete games. Get this, though: Wandering through the minors until he was 52, collected 204 more wins.


Nolan Ryan, known as much for his walks as his strikeouts, routinely surpassed 150 pitches as his career progressed (27 years, 222 complete games and 5,386 innings pitched). In 1974, according to beat writers in attendance, Ryan threw 259 pitches in a 12-inning win over Kansas City.

These two Hall of Fame pitchers support The BITGOD Prescription, though it’s also worth noting each was a once-in-a-generation freak of nature. Both pitched forever and had very good ERA+ marks. But note that Iron Joe didn’t have a particularly good major league season after the age of 35 for either quantity or quality. Ryan was very good even through age 44, though his game was predicated more on durability and pure strikeout power than on excellent ERA marks.

By the way, I don’t know if those beat writers Jenkins mentioned were remembering the wrong year, but Ryan didn’t go more than 10 innings in any game against the Royals in 1974. They may have been thinking about a 13 inning game against Boston where Ryan walked ten and struck out 19. An historical note -- my man Cecil Cooper, leading off for the Bosox, struck out six times in his eight at-bats that game against Ryan; what do we call that...a Golden Cust?

As freaks of nature, that pair of iron men are figures to be respected and (ideally) emulated. But because they are freaks of nature, we also need to recognize that their success, especially Ryan’s, might be significantly genetic, and not something coaching, nutrition or Pilates can reify.

Jenkins writes:

It's remarkable enough that on May 1, 1920, Brooklyn and Boston played a 1-1 tie that lasted 26 innings. Incredibly, pitchers Leon Cadore and Joe Oeschger each went the distance. Historians estimate that Cadore threw 345 pitches, Oeschger 319. MBB NOTE: The game ended as a tie.

Here’s Leon “The Caddy” Cadore’s career:

  Year Ag Tm  Lg   W  L   G GS CG    IP  ERA *lgERA *ERA+
  1915 24 BRO NL   0  2   7  2  1  21.0 5.57  2.77  50 
  1916 25 BRO NL   0  0   1  0  0   6.0 4.50  2.67  59 
  1917 26 BRO NL  13 13  37 30 21 264.0 2.45  2.78 113 
  1918 27 BRO NL   1  0   2  2  1  17.0 0.53  2.79 527 
  1919 28 BRO NL  14 12  35 27 16 250.7 2.37  2.97 125 
  1920 29 BRO NL  15 14  35 30 16 254.3 2.62  3.23 123 
  1921 30 BRO NL  13 14  35 30 12 211.7 4.17  3.89  93 
  1922 31 BRO NL   8 15  29 21 13 190.3 4.35  4.10  94 
  1923 32 TOT      4  2   9  5  3  38.3 4.46  3.88  87 
  1924 33 NYG NL   0  0   2  0  0   4.0 0.00  3.67  inf 
  10 Yr WL% .486  68 72 192 147 83 1257 3.14  3.33 106
  162 Game Avg    13 14  38  29 16 252  3.14  3.33 106

We don’t have game-by-game stats for 1920, but if you look at his ERA+ progression, The Caddy looked like a pretty fine young pitcher.

That is, through 1920, the year of his 300+ pitch outing. After that season, setting aside a four-inning cameo in 1924, he never again notched an ERA that equaled or bettered the league average.

Here’s Joe “The Big Ouch” Oeschger’s career:

  Year Ag Tm  Lg    W  L  G GS CG  IP     ERA *lgERA *ERA+ 
  1914 22 PHI NL    4  8  32 10  5 124.0  3.77  2.92    77 
  1915 23 PHI NL    1  0   6  1  1  23.7  3.42  2.74    80 
  1916 24 PHI NL    1  0  14  0  0  30.3  2.37  2.64   111 
  1917 25 PHI NL   15 14  42 30 18 262.0  2.75  2.81   102 
  1918 26 PHI NL    6 18  30 23 13 184.0  3.03  2.98    98 
  1919 27 TOT NL    4  4  17 12  6 102.7  3.94  2.98    75 
  1920 28 BSN NL   15 13  38 30 20 299.0  3.46  3.04    88 
  1921 29 BSN NL   20 14  46 36 19 299.0  3.52  3.67   104 
  1922 30 BSN NL    6 21  46 23 10 195.7  5.06  4.02    79 
  1923 31 BSN NL    5 15  44 19  6 166.3  5.68  3.99    70 
  1924 32 TOT NL    4  7  29 10  0  94.3  4.01  4.21   105 
  1925 33 BRO NL    1  2  21  3  1  37.0  6.08  4.18    69
  12 Yr WL% .414  82 116 365 197 99 1818  3.81  3.36    88 
  162 Game Avg     9 14   44  23 11 219.7 3.81  3.36    88

The Big Ouch must have been a pretty decent prospect. The Phils (not a doormat in that time) brought him up at age 22. He was middling in accomplishment up until the legendary 300+ pitch game, had his best season the following year, but was never more than a spot starter after that. I don’t see Oeschger as either supporting or undermining the Jenkins Protocol; he was middlingly useful and inconsistent before his Pyrrhic Tie and equally so after, the Mike Morgan of the Post-Great War Era.

Jenkins also wrote about Allie Reynolds in the hero sidebar, but while he was a great performer, he wasn’t particularly heroic. In an era where many starters pitched every 4th game, he started less frequently, but was available as an occasional reliever, so he pitched over 220 innings most years. I suspect the copydesk messed up Jenkins' assertion, but I can’t decode what that would be from what’s left.

I see no clear pattern within Jenkins’ cited heroes. A couple of freaks of nature (durability outliers), a couple of guys whose careers crapped out pretty soon after their heroics, and one guy who seemed as middling after as before. Jenkins certainly can’t use the data as a convincing support for his Prescription as a standard approach.

Look, what if you were paying Tim Lincecum, and you wanted the maximum value out of him? What if you could have 32 starts a year, each limited to 100-105 pitches, but with an alternative: what if you could have him for 40 starts a year, but there was a one-in-five chance he’d break down as a result and have a Leon Cadore career? Or what if it was a one-in-ten chance or a one-in-three chance?

Knowing that Lincecum is an outlier in many respects (his mechanics, his power-to-size ratios, his intellect, his past training), perhaps he’s an outlier in durability, as well. At what odds do you take the chance of having him be the next Kerry Wood or Mark Prior (outliers who didn’t transcend fatigue and whose starting careers imploded like neutron bombs, leaving no survivors to mourn)? Good question, and one I leave to you to answer.
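If you want to play with those odds yourself, here is a deliberately crude sketch. Everything in it beyond the 32 and 40 starts is my assumption for illustration: a six-season horizon, a single all-or-nothing breakdown that arrives after one heavy season, and a scorecard that counts starts only and ignores how good they are. None of these numbers come from Jenkins or from any injury data.

```python
SEASONS = 6            # hypothetical planning horizon
REGULATED_STARTS = 32  # starts per season on the pitch-count plan
HEAVY_STARTS = 40      # starts per season on the BITGOD plan


def expected_heavy_starts(p_breakdown, seasons_before_breakdown=1):
    """Expected total starts on the heavy plan.

    Assumes a single all-or-nothing breakdown: with probability p_breakdown
    he gives you only `seasons_before_breakdown` heavy seasons and nothing
    useful afterward; otherwise he stays healthy for the whole horizon.
    """
    healthy_total = HEAVY_STARTS * SEASONS
    broken_total = HEAVY_STARTS * seasons_before_breakdown
    return p_breakdown * broken_total + (1 - p_breakdown) * healthy_total


regulated_total = REGULATED_STARTS * SEASONS
for label, p in [("one in ten", 1 / 10), ("one in five", 1 / 5), ("one in three", 1 / 3)]:
    heavy = expected_heavy_starts(p)
    verdict = "more" if heavy > regulated_total else "fewer"
    print(f"{label}: {heavy:.0f} expected starts, {verdict} than the regulated {regulated_total}")
```

Under these made-up inputs the break-even sits somewhere between one in five and one in three. Change the horizon or what a broken-down pitcher is still worth and it moves, which is rather the point.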

But this just addresses Injury and Moral Rectitude issues. What about the Team Performance issue – how much better off is a team leaving in the great starter?

In the next entry, I’ll answer that by examining each of Jenkins’ cited workhorses, all of whom had lengthy careers that survived or thrived on BITGOD workloads, and compare their performances to the relievers their teams might have deployed in their stead.

