MLE Pitching Method: Full Detail

Last week, we shared with you a full articulation of our updated MLE method for position players, and today we’ll do the same for pitchers. 

Pitchers puzzle us much more than hitters, as parts of this post will show. Our latest MLEs eliminate only some of the puzzling aspects of Negro Leagues pitchers. If anything, you’ll probably find a wider gulf between the best pitchers and everyone else than you found between the best hitters and everyone else. You’ll also find more sizzle-fizzle careers among pitchers than hitters. Intrepid researcher Gary Ashwill told me recently that Negro Leagues teams worked their aces pretty hard. With players basically being free agents every year and a fair amount of uncertainty about a team’s roster for next year, clubs played to win now and used their pitchers that way. As a result, many careers flashed brightly for a moment then winked out of existence, quarks in the aether of the baseball universe.

If you’re following along at home, enjoy. If you’re not exactly following along at home…well, you might just want to stick to the short version I’ll lay out below and skip the insanely detailed articulation that follows.

Abbreviations and Assumptions

Abbreviations

Pitching

  • BIP: Balls in play
  • DEF: Pitcher’s defensive support per nine innings
  • G: Games
  • IP: Innings pitched
  • oppRA9: League RA9 adjusted for park and defensive support
  • QOP: Quality of play adjustment
  • qRA9: RA9 adjusted for quality of play
  • RA: Total runs allowed
  • RAA: Runs allowed above average
  • RA9: Runs allowed per nine innings
  • Rrep: Replacement runs (i.e. how many runs a pitcher earns between replacement and average)
  • SD: Standard deviation
  • tmBIP: Team balls in play
  • tmDRA: Team’s Defensive Regression Analysis runs
  • WAA%: Pitcher’s winning percentage versus an average set of opponents given average defensive and offensive support and his own performance versus the league average
  • WAR%: Pitcher’s winning percentage versus an average set of opponents given average defensive and offensive support and his own performance versus replacement level
  • Z: Z-score (how many standard deviations a pitcher’s RA9 is from the league’s)

Batting

  • PA: Plate appearances
  • PF: Park factor
  • Rbat: Batting runs above average
  • rPOS: Positional runs

Value

  • WAA: Wins above average
  • WAR: Wins above replacement

Assumptions

The process will be described from a single-season point of view. We will refer to the single season in question as n. We will also mention seasons:

  • n+1: The season immediately after n
  • n-1: The season immediately before n
  • n+2: The season two years after n
  • n-2: The season two years before n

Short Description

Pitching

  1. Use the pitcher’s Z to recontextualize his RA9 into an MLB context
  2. Adjust for quality of play
  3. Adjust the league’s RA9 for defensive support and park factor
  4. Figure how many runs saved above average he would generate per inning
  5. Estimate his innings pitched 
  6. Find his full-season RAA and RA

Batting

  1. Multiply the pitcher’s OPS+ as a hitter by the QoP adjustment
  2. Adjust for his home park
  3. Find the average adjusted OPS+ for all his seasons, weighted by his PAs
  4. Estimate his MLE PAs
  5. Figure his Rbat via a regression equation that matches OPS+ and Rbat, determine his Rpos, then add them to determine his RAA
  6. Find his batting WAR then adjust it downward using an equation that regresses Negro Leagues pitchers’ OPS+ against MLB pitchers’ OPS+

Value

  1. Use the WAR explainer at Baseball-Reference.com to determine the pitcher’s pitching WAR
  2. Add that to his batting WAR to find his total WAR.

Full Description

We’ll work through Satchel Paige’s 1934 season. That will appear in blue.

Pitching

1) Gather a player’s known statistics from the Negro Leagues, MLB, the minor leagues, and any foreign summer or winter leagues.

  • Only use seasons where league-wide data is available
    • Only use seasons where the following data is available for all pitchers: G, IP or pitching outs, RA; GS for all pitchers on his team is very helpful and so is his team’s DRA and BIP (which requires H, HR allowed, and strikeouts as well)
    • Do not include All-Star games, playoff/post-season games, nor short winter series that pit white teams against Black or Cuban teams against Black
      • It’s not possible to compare a player to a league-average player in a short series—to see why, imagine having five Toyotas and five Ferraris and trying to define the average car.

In 19 games and 16 starts, Paige went 145.67 innings, giving up 36 runs. He made the most starts on his team, and his team saved -9.5 runs defensively on 2031 BIP.

2) For season n, find the player’s RA9.

Paige allowed 36 runs in 145.67 innings, which works out to 2.224 RA9.

3)  Determine the pitcher’s RA9 Z

  • Find the league’s SD for RA9. I generally remove extreme pitchers with more than 10 or 20 RA9 because they badly skew the SD. The NNL’s SD was nonetheless 3.867.
  • Subtract the league’s mean RA9 from his RA9. This is the mean observed while calculating SD, not the league’s overall RA9. In Paige’s case that means 2.224 – 6.298 = -4.074
  • Divide by the league’s SD. -4.074 / 3.867 = -1.054
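
If you would rather script steps 2 and 3 than run them in a spreadsheet, here is a minimal Python sketch. The function names are invented for illustration, and the league mean and SD are simply the 1934 NNL figures quoted above.

```python
def ra9(runs_allowed: float, innings: float) -> float:
    """Runs allowed per nine innings."""
    return runs_allowed / innings * 9


def ra9_z_score(pitcher_ra9: float, league_mean: float, league_sd: float) -> float:
    """Standard deviations between a pitcher's RA9 and the observed league mean.

    A negative value means the pitcher allowed fewer runs than the mean.
    """
    return (pitcher_ra9 - league_mean) / league_sd


# Paige, 1934: 36 runs in 145.67 innings; NNL observed mean 6.298, SD 3.867
paige_ra9 = ra9(36, 145.67)
paige_z = ra9_z_score(paige_ra9, league_mean=6.298, league_sd=3.867)
print(round(paige_ra9, 3), round(paige_z, 3))
```

Because the sketch carries the unrounded RA9 forward, the last digit of the Z-score can differ slightly from the hand calculation.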

4)  Adjust the sample to reach a minimum of 45 innings (this is roughly the place where the key pitching stats stabilize)

  • If the player has more than 45 IP in season n, do not adjust the sample, and proceed. We won’t need to adjust for the sample this time.
  • If the player has fewer than 45 IP, combine the IP from season n with the IP from seasons n+1 and n-1 this way:

((45 - nIP) * (sum(nIP*nZscore, n+1IP*n+1Zscore, n-1IP*n-1Zscore) / sum(nIP, n+1IP, n-1IP)) + (nIP*nZscore)) / 45

  • If the sample is still under 45 IP, add the IP and Zscores from seasons n+2 and n-2, weighting them at 0.6
  • If the sample still doesn’t add to 45 IP, include the player’s career IP and career weighted average Zscore.
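
The first tier of that sample adjustment can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the example numbers are hypothetical, and the sketch stops at the n+1/n-1 blend. It does not implement the 0.6-weighted n+2/n-2 seasons or the career fallback.

```python
def stabilized_z(n_ip, n_z, neighbors, threshold=45.0):
    """Blend a short season's Z-score with neighboring seasons.

    n_ip, n_z  -- innings and Z-score for season n
    neighbors  -- list of (innings, z_score) pairs for seasons n+1 and n-1
    threshold  -- innings needed before the season is used as-is
    """
    if n_ip >= threshold:
        return n_z  # enough innings, no adjustment needed

    seasons = [(n_ip, n_z)] + list(neighbors)
    total_ip = sum(ip for ip, _ in seasons)
    # Innings-weighted Z-score across seasons n, n+1, and n-1
    pooled_z = sum(ip * z for ip, z in seasons) / total_ip
    # Fill the missing innings with the pooled Z, keep the real innings as-is
    return ((threshold - n_ip) * pooled_z + n_ip * n_z) / threshold


# Hypothetical pitcher: 30 IP at Z = -1.0, flanked by 60 IP at -0.5 and 30 IP at 0.0
example_z = stabilized_z(30, -1.0, [(60, -0.5), (30, 0.0)])
print(round(example_z, 3))
```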

5) Determine a player’s initial MLB RA9 by placing him into a major-league setting: Default to the NL, but for players who debuted in the AL, like Paige, start them in the AL. Multiply the pitcher’s Z (#3) by the MLB league SD, then add that product to the mean MLB RA9. Again, we’re talking about the mean observed when generating SDs for the league. Paige goes into the AL, which in 1934 had an SD of 2.859 and a mean of 6.120. Thus (-1.054 * 2.859) + 6.120 = 3.108. That’s what I’ll call Paige’s twRA9, or translated RA9.
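
As a quick sketch of step 5, assuming the Z-score from step 3 and the 1934 AL figures quoted above (the function name is made up for illustration):

```python
def translate_ra9(z_score: float, mlb_sd: float, mlb_mean: float) -> float:
    """Re-express a Z-score as an RA9 in the target MLB league."""
    return z_score * mlb_sd + mlb_mean


# Paige's Z of -1.054 dropped into the 1934 AL (SD 2.859, observed mean 6.120)
translated = translate_ra9(-1.054, mlb_sd=2.859, mlb_mean=6.120)
print(round(translated, 3))
```

Feeding in the rounded Z-score can move the third decimal place relative to a spreadsheet that keeps full precision.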

6) Adjust for quality of play (QoP) by dividing #5 by the originating league’s QoP adjustment (see table for the ones I use).

For Paige and the 1934 NNL the QoP is 0.8, so 3.108 / 0.80 = 3.885

7) Correct his QoP-adjusted RA9 to align with the league’s actual RA9. The observed mean we used to generate z-scores is not the same as the league RA9 we will judge the pitcher against, so we rescale them to match by multiplying #6 by the ratio of the league’s overall RA9 to the observed mean from step #5.

3.885 * ( 5.2 / 6.120) = 3.301
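
Steps 6 and 7 chain together naturally, so here is a small Python sketch of both. The function names are illustrative only, and the inputs are the Paige figures from the worked example.

```python
def qop_adjust(translated_ra9: float, qop: float) -> float:
    """Divide by the QoP factor: a weaker league (QoP < 1) raises the RA9."""
    return translated_ra9 / qop


def rescale_to_league(q_ra9: float, league_ra9: float, observed_mean: float) -> float:
    """Rescale from the observed mean used for Z-scores to the league's overall RA9."""
    return q_ra9 * (league_ra9 / observed_mean)


q_ra9 = qop_adjust(3.108, qop=0.80)
mle_ra9 = rescale_to_league(q_ra9, league_ra9=5.2, observed_mean=6.120)
print(round(q_ra9, 3), round(mle_ra9, 3))
```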

8) Adjust the league’s RA9 for the pitcher’s defensive support.

To do this, we’re going to figure how much of the team’s defensive support “belongs” to Paige and express that as a per-nine-inning figure. We will cap defensive support at +/- 0.50 runs per nine because those are, generally, limits that pitchers don’t exceed except in extreme circumstances. And because DRA has a higher variance than BBREF’s Rfield, we’ll divide by two just to be safe. Then we’ll subtract that per-nine total from the league’s RA9.

The Crawfords generated -9.5 defensive runs on 2031 BIP. Batters put Paige in play 380 times in 145.67 IP, thus

((((Paige’s BIP/tmBIP) * tmDRA) / IP) * 9) / 2

((((380/2031) * -9.5) / 145.67) * 9) / 2 = -0.055

Now: 5.200 – -0.055 = 5.255 RA9

9) Adjust the league’s RA9 for the pitcher’s park factor

The Craws’ home park, Greenlee Field, had a 0.978 park factor in 1934 (hat tip, Kevin Johnson) so we multiply the league’s defense-adjusted RA9 in step #8 by the square root of the park factor:

5.255 * SQRT(0.978) = 5.197 (significant digits may affect your calculation)
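
Here is a hedged Python sketch of steps 8 and 9 together. It uses the 2031 team BIP stated in the text and multiplies by the square root of the park factor, as the prose describes. The function names are invented, and applying the +/- 0.50 cap after halving is my assumption, since the text does not say which comes first.

```python
import math


def defensive_support_per_nine(pitcher_bip, team_bip, team_dra, innings, cap=0.50):
    """Pitcher's share of team defensive runs, per nine innings, halved and capped."""
    share = pitcher_bip / team_bip * team_dra  # team DRA runs credited to this pitcher
    per_nine = share / innings * 9 / 2         # per nine innings, then halved "to be safe"
    # Assumption: the cap is applied to the halved figure
    return max(-cap, min(cap, per_nine))


def opp_ra9(league_ra9, support_per_nine, park_factor):
    """League RA9 adjusted for defensive support and home park."""
    return (league_ra9 - support_per_nine) * math.sqrt(park_factor)


support = defensive_support_per_nine(380, 2031, -9.5, 145.67)
paige_opp_ra9 = opp_ra9(5.200, support, park_factor=0.978)
print(round(support, 3), round(paige_opp_ra9, 3))
```

With these inputs the adjusted league figure lands at about 5.197, the oppRA9 carried into the RAA step.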

10) Create an initial estimate of innings pitched by assigning him a rotation slot and matching it with the average innings thrown by MLB pitchers in that same slot that same year.

Paige was the Crawfords’ number-one starter. In MLB, the top 16 starters in the league averaged 270 innings. If the pitcher is assigned a rotation slot below 5th, credit him with his actual innings pitched.

11) Create a second estimate of innings using the age-based trajectories of MLB pitchers.

For this I found the median innings thrown by age for pitchers from either 1901-1919 or 1920-1948 who fell into one of four buckets: Long career (around 3,000+ IP), medium-long (around 2,250), medium (around 1,800), and short (around 1,300). Paige falls into the long-career bucket for live-ball pitchers. He was 27 in 1934, and the typical long-career pitcher at that age threw about 270 innings.

12) Create a final MLE innings estimate by taking an average of #10 and #11 weighted 30/70.

Conveniently, both estimates give us 270 innings. The math goes ((0.3 * 270) + (0.7 * 270)) = 270

If you assign the pitcher a slot below 5th, use his actual innings pitched for your final estimate.

13) Estimate his MLE games.

Divide his initial innings estimate by the average number of games pitched by pitchers in the same rotation slot that year to get innings per game. Then divide the final innings estimate in #12 by that innings-per-game figure. Round to the nearest whole number. For pitchers below 5th, use their actual games.

For Paige, that means 270 / (270 /45) = 45 games
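
Steps 10 through 13 boil down to a weighted average and a ratio. A minimal Python sketch follows, with invented function names. It assumes the rotation-slot and age-curve estimates have already been looked up elsewhere, and it reads "below 5th" as a slot number greater than 5.

```python
def mle_innings(slot_estimate, age_estimate, actual_ip, rotation_slot):
    """Blend the two innings estimates 30/70, or use actual IP below the 5th slot."""
    if rotation_slot > 5:
        return actual_ip
    return 0.3 * slot_estimate + 0.7 * age_estimate


def mle_games(final_ip, slot_ip, slot_games, actual_games, rotation_slot):
    """Convert innings to games using the slot's average innings per game."""
    if rotation_slot > 5:
        return actual_games
    ip_per_game = slot_ip / slot_games
    return round(final_ip / ip_per_game)


# Paige, 1934: slot 1, both estimates at 270 IP, slot pitchers averaging 45 games
paige_ip = mle_innings(270, 270, actual_ip=145.67, rotation_slot=1)
paige_games = mle_games(paige_ip, slot_ip=270, slot_games=45,
                        actual_games=19, rotation_slot=1)
print(round(paige_ip, 1), paige_games)
```

Note that Python's `round` sends exact halves to the nearest even integer, which may differ from a spreadsheet's rounding on rare ties.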

14) Calculate his MLE RAA.

Subtract his estimated RA9 (#7) from the league’s adjusted RA9 (aka oppRA9; step #9 above), divide that difference by 9, and multiply by his final innings estimate (step #12). Thus…

(5.197 – 3.301) / 9 * 270 = 56.88 RAA
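
The same RAA arithmetic as a tiny Python sketch (the function name is illustrative, and the inputs are the figures from steps 7, 9, and 12):

```python
def mle_raa(opp_ra9: float, pitcher_ra9: float, innings: float) -> float:
    """Runs saved versus average: per-nine gap, scaled to the innings estimate."""
    return (opp_ra9 - pitcher_ra9) / 9 * innings


paige_raa = mle_raa(opp_ra9=5.197, pitcher_ra9=3.301, innings=270)
print(round(paige_raa, 2))
```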

Special note 1: For players like Paige who played in MLB, use their MLB RA9 z-score as necessary but don’t figure anything for them otherwise. We’ll incorporate their MLB value stats as-is when we get to the next stage.

Special note 2: As I do with Rbat estimates for batters, I cap a pitcher’s MLE RA9 at the league leader’s total for that year to stay within the scale of MLB.

Pitching Value

I try to stick as closely to Baseball-Reference.com’s value calculations as I can. That includes all the background calculations for WAA and WAR. Please refer to their explainer for all details.

Paige ends up with a nifty 8.051 pitching WAR by my calculations.

Batting

1) Find the pitcher’s PA and OPS+ for each season of his career.

For Paige I only used his pre-MLB years to figure his batting. By the time he reached MLB his batting skills had probably faded. Overall, I used 656 PA.

2) For every season in the pitcher’s career, adjust his OPS+ by the league’s QoP rating and then divide by his team’s park factor.

For example, in 1934, the NNL rates as a 0.800 league, and their batting park factor is 0.973. Satch’s OPS+ was 32. So we go like this: 32 * 0.800 / 0.973 = 26.310

3) Find the career average adjusted OPS+ from #2, weighted by PAs.

This is one of those things Excel does well. Use a formula like =SUM(G3:G25*H3:H25)/SUM(G3:G25), where G is the column in which PAs are stored and H is the column where adjusted OPS+ is stored. For Paige this works out to 33.940. He was not a good hitter, contrary to his own mythmaking. 
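
For anyone working outside Excel, here is a rough Python equivalent of batting steps 2 and 3. The function names are mine. Only the 1934 line uses real figures from this post; the second season in the example is hypothetical and exists just to show the weighting.

```python
def adjusted_ops_plus(ops_plus, qop, park_factor):
    """Scale OPS+ by league quality, then divide by the batting park factor."""
    return ops_plus * qop / park_factor


def career_weighted_ops_plus(seasons):
    """PA-weighted average of adjusted OPS+.

    seasons -- list of (plate_appearances, adjusted_ops_plus) pairs
    """
    total_pa = sum(pa for pa, _ in seasons)
    return sum(pa * adj for pa, adj in seasons) / total_pa


paige_1934 = adjusted_ops_plus(32, qop=0.800, park_factor=0.973)

# Hypothetical two-season career: 50 PA at the 1934 figure, 100 PA at 40.0
example_career = career_weighted_ops_plus([(50, paige_1934), (100, 40.0)])
print(round(paige_1934, 3), round(example_career, 3))
```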

4) Estimate the pitcher’s PAs.

We use a formula for this for all pitchers. As written, it works out to one PA for every three innings pitched. You’ll need to round the result. For Paige:

(8/3)*(270/8) = 90 PA (rounded to the nearest whole number)

5) Estimate the pitcher’s Rbat (batting runs).

This time we use a regression formula. I ran pitchers’ OPS+ against their Rbat to avoid going through the entire MLE process for their hitting. With so few PAs and so little gain as hitters, that made sense. (I did separate MLEs for Bullet Rogan, Martin Dihigo, and Lazaro Salazar as position players.) The equation goes like this:

((Career adjusted OPS+ * 0.0013) – 0.1307) * PA

Paige’s 1934 season works out this way: ((33.940 * 0.0013) – 0.1307) * 90 = -7.792 runs
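
Batting steps 4 and 5 can be sketched in Python like so. The function names are illustrative. The PA formula is copied as printed above, and the slope and intercept are the regression coefficients quoted in the equation.

```python
def mle_plate_appearances(innings: float) -> int:
    """PA estimate using the formula as printed: (8/3) * (IP/8), rounded."""
    return round((8 / 3) * (innings / 8))


def mle_rbat(career_adj_ops_plus: float, pa: int) -> float:
    """Batting runs from the OPS+-to-Rbat regression, scaled by PA."""
    runs_per_pa = career_adj_ops_plus * 0.0013 - 0.1307
    return runs_per_pa * pa


paige_pa = mle_plate_appearances(270)
paige_rbat = mle_rbat(33.940, paige_pa)
print(paige_pa, round(paige_rbat, 3))
```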

6) Use Baseball-Reference.com’s WAR explainer to figure the pitcher’s Rpos and RAA and then his WAR as a hitter. But don’t go anywhere.

Paige works out to 10.1 Rpos and 2.3 RAA (10.1 Rpos + -7.792 Rbat). That turns into 0.220 WAA, and for pitchers, WAA is essentially the same as WAR, so it’s also 0.220 WAR.

7) But we need to make a little correction here. Negro League pitchers were MUCH better hitters than their MLB colleagues, so we dock them for that.

When I researched the OPS+ of Negro Leagues pitchers versus MLB pitchers, Black pitchers outhit their white counterparts by a significant margin, which leads us to adjust their batting downward 35%. Paige’s final WAR batting figure is 0.220 * 0.65 = 0.143.

Total Value

Now we sum it all up. Add the pitching value we came up with two sections ago to the batting value we just arrived at, and you’ve got the pitcher’s total value for the year.

Paige’s 1934 season then works out to 8.051 pitching WAR plus 0.143 batting WAR for a total of 8.194 WAR. 
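
The final bookkeeping, as a short Python sketch. The names are invented, and the pitching and raw batting WAR inputs are taken as given from the Baseball-Reference-style calculations, which are not reproduced here.

```python
BATTING_DISCOUNT = 0.35  # the 35% downward adjustment for Negro Leagues pitchers' hitting


def total_war(pitching_war, raw_batting_war, discount=BATTING_DISCOUNT):
    """Discount the raw batting WAR, then add it to pitching WAR."""
    batting_war = raw_batting_war * (1 - discount)
    return pitching_war + batting_war


paige_total = total_war(pitching_war=8.051, raw_batting_war=0.220)
print(round(paige_total, 3))
```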

That’s two weeks in a row you’ll need your reading glasses to keep up. I promise it will get better next week.

Discussion

3 thoughts on “MLE Pitching Method: Full Detail”

  1. Noting a quick edit here. I had incorrect numbers for the 1934 AL’s SD and RA9 in point 5 under pitching. That’s been corrected, and I’m sorry for any confusion. Also, I corrected the spelling of Satchel’s last name in that same sentence. In our next edition of ScrooUps, I’ll eat crow with taco sauce on it!

    Thanks to friend of the HoME Kris Gardner for pointing that out. Good catch, Kris!

    Posted by eric | July 23, 2021, 4:33 pm
  2. I have always thought BREF does the defensive adjustment wrong. You should be adjusting the Lg Avg RA9 by the Lg Avg BIP/9, not by the pitcher’s BIP. The league-average pitcher would not have the same BIP as Paige. If Paige is a high-strikeout pitcher, you are suggesting the league-average pitcher is one also and depends less on his fielders.

    Posted by Darren | July 25, 2021, 4:17 pm

Trackbacks/Pingbacks

  1. Pingback: Discovering Japanese Pitchers, Introduction | Horsehide Dragnet - April 11, 2022
