Major League Equivalencies for Negro Leagues Hitters

Posted by eric ⋅ October 25, 2017 ⋅ 41 Comments

Filed Under Major League Equivalencies, MLEs, Oscar Charleston, Rube Goldberg Contraptions

About a month ago, we told you about our Major League Equivalency (MLE) protocol for Negro Leagues pitchers. That 26-step protocol, swelled as it is by subroutines of all sorts, will seem genuinely straightforward compared to what we’re about to unleash. But stick with it, the truth is out there, and we’re trying to use every tool we can to get at it. And, hey, we’d like to know if we can make it better, so your feedback is super helpful. On the other hand, this monster of a post is about 6,500 words, so if you want to just trust us, you can. But this sucker is here for your reference and ours if we ever need it.

As we go along, discovering more nuances, nooks, and crannies, we may have occasion to edit this methodology. When we do, edits will appear in red, and those elements affected will be shown in gray to indicate that they are no longer up to date.

The Big Picture

Our goal in creating MLEs is twofold. First, we want every Negro Leagues player’s records to be recontextualized onto a level, neutral platform. That’s because in the Negro Leagues, the different teams, leagues, parks, seasons, whathaveyou, were not of uniform quality and behavior. In fact, they varied far more than the majors did. So just among the Negro Leaguers themselves, we need this to help us make wise electoral decisions. On top of that, however, we want to get a sense of how these guys compare to MLB players so that we can place their achievements into a context that’s more familiar to us.

We don’t have a lot of interest in creating component stat lines: homers, RBI, strikeouts, HBPs. That stuff is all fascinating in its way, but we don’t place specific importance on those figures here at the Hall of Miller and Eric. We prefer to look holistically at what the combination of those recorded events meant in context (value, if you prefer). So we’re after WAR, just as we are with pitchers.

The process of locating a Negro Leaguer’s value in his collected statistics (at the Negro Leagues Database and elsewhere) follows a similar path to what we did with pitchers, inasmuch as we find his rate of production, recontextualize it with z-scores, adjust for Quality of Play and park effects, then apply it to an MLB playing-time estimate. Simple enough, right? But we also want to pay attention to things like what position the player would have played in the majors, his fielding value, his baserunning value, and, depending on what seasons in MLB history we’re talking about, his double-play-avoidance value.

The work we do initially as we translate performance and arrive at an initial playing-time estimate requires very little manual intervention on our part. It’s when we get into running and fielding that we are forced to make some decisions on our own. That’s where the human element comes in and careful judgment becomes our watch-phrase.

But when we get done, we have an estimate of what kind of value a fellow would rack up in the big leagues. It’s not a perfect estimate, though it’s the best we currently know how to do. We would also express the caveat that MLEs are likely best absorbed first at the career level then, with careful discernment, at the seasonal level. Due to shorter schedules and concomitantly increased volatility in stat lines, and despite some effort on our part to dampen that volatility, year-by-year MLEs simply won’t be as reliable as a career value. Which itself is an estimate, not the holy gospel of the Negro Leagues.

But other than that, they are perfect indicators of performance and value….

Before we get going, let’s define our terms

This is just going to go down easier if we use the same lingo. And since I can’t hear you to agree to a particular jargon, you’ll just have to use mine. These terms will pop up a lot.

Originating League/Team: The team he actually played for
Destination League: The league we are translating his stats into and creating an equivalency for
Quality of Play (QOP): Which assumes that MLB is 1.0, and everything else is discounted from it
Translated: Stats that have been transformed from the originating league into the destination league’s run and league-quality context; an intermediate step en route to the fuller equivalent performance
Equivalent: Stats whose basis is in translated figures but that include further adjustments to place the player into a broader MLB context and ensure that small samples don’t overly skew the results.

There’ll be the usual alphabet soup along the way, and we’ll define the acronyms as we go.

The Process…in Prose

So let’s begin by my explaining this protocol in English, then we’ll run through a real-life example from the career of the great Oscar Charleston.

Translating Actual Performance

1) Find the rate of player’s offensive performance.
2) Compare to his own league.
3) Place into MLB context.
4) Adjust for the quality of his league.
5) Adjust for his park.
6) Adjust for his strength of schedule.

Creating an Initial Estimate of Playing Time and Batting Runs Above Average (Rbat)

7) Express as RC, then figure his translated Rbat, then express the result per PA.
8) Use his translated Rbat/PA if he accumulated 200 or more PA, or supplement with the surrounding seasons this way:
8a) For season n, use the seasons n+1 and n-1 at full strength, combining their Rbat/PA with that of season n and finding their average weighted by PA. If the total PA of all three seasons is 200 or more, stop there.
8b) If they don’t total 200 or more PA, fold in seasons n+2 and n-2 at 60% strength and determine the average weighted by PA with n+2 and n-2’s PAs at 60%. If the total PA of all five seasons is 200 or more, stop there.
8c) If they don’t total 200 or more PA, determine the number of PAs shy of 200 and fold that in at full strength to the equation in 8b.
More information here.
9) Estimate the player’s games played into an MLB schedule based on his in-season and career durability records.
10) Apply the destination league’s PA/game to those estimated games.

So, now we have an initial estimate of the player’s MLB Rbat for the season. But before we go any further, we need to fine-tune our playing-time estimate because everything else after this depends upon it.

Fine-tuning Playing Time

11) First, we look at any batting seasons at the beginning or end of a player’s career. If they are well below average, we will not consider them part of his MLE—either he’d have been in the minors or, at the other end of things, aged out of the game. A general rule of thumb is that after age 38, two seasons well below average batting is probably enough to retire someone.
12) Next, we look at the player’s biographical material. If there are any injuries that would keep him out of the lineup, we see if our PA estimate reflects it appropriately. Because of the less stable nature of the Negro Leagues, we also look for league/team jumping, and other oddball movements that would affect playing time but didn’t occur in MLB. We adjust accordingly.
13) Now we want to give our man’s career a shape or trajectory that respects both his actual durability and major-league career norms. This requires several sub-steps (sorry!) that we do as background work to all MLEs and apply to each individual.

13a) Find all non-catcher position players with 4,000 or more plate appearances whose careers began before 1960. Include all catchers from the same period with 3,000 or more trips to bat.

13b) Adjust all seasons to 154-game notation.

13c) Separate catchers into those whose careers started before 1920 and those whose careers started after. We’ll call them deadball catchers and liveball catchers. We do this because norms for deadball and liveball catchers were quite different.

13d) Sort all non-catchers, deadball catchers, and liveball catchers into their own respective quartiles based on career plate appearances. This will result in what we’re calling the short, medium, medium-long, and long career groups.

13e) For each group determine its players’ median career length in seasons and plate appearances. For the two catcher groups, find the standard deviation of each group’s career plate appearances.

13f) Next list the plate appearances each member of our groups received at each age he played in MLB and find the median of everyone in the group who played in the league at that age.

13g) Now take all the non-catchers and group them by position. Repeat steps 13d through 13f for each non-catcher quartile (you will already have that info for both catcher populations in question). You’ll now have short, medium, medium-long, and long career groups at all positions, for all non-catchers, and for deadball and liveball catchers.

13h) Using the standard deviations of career plate appearances we’ve calculated, for each group at each position, including the catchers, determine a maximum and minimum boundary for career plate appearances by adding two standard deviations’ worth of plate appearances to the group’s mean career plate appearances and also subtracting two standard deviations from its mean.

You will now have all the background information necessary to find the career trajectory for MLE purposes.
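
For anyone following along at home, here is a minimal Python sketch of steps 13d and 13h. The career totals are made up for illustration, the function names are ours, and the choice of a population standard deviation is an assumption, since the steps above don't say which flavor to use.

```python
import statistics
from bisect import bisect_right

GROUPS = ["short", "medium", "medium-long", "long"]


def career_groups(career_pa):
    """Split career PA totals into quartile groups (step 13d)."""
    cuts = statistics.quantiles(career_pa, n=4)  # three quartile cut points
    groups = {name: [] for name in GROUPS}
    for pa in career_pa:
        # bisect_right returns 0..3, the index of the quartile the total falls in
        groups[GROUPS[bisect_right(cuts, pa)]].append(pa)
    return groups


def pa_bounds(group_pa, n_sd=2):
    """Mean career PA minus/plus n_sd standard deviations (step 13h)."""
    mean = statistics.mean(group_pa)
    sd = statistics.pstdev(group_pa)  # population SD: an assumption on our part
    return mean - n_sd * sd, mean + n_sd * sd


# Made-up career PA totals, for illustration only.
sample = [4100, 4500, 5200, 6000, 7100, 8300, 9500, 11000]
groups = career_groups(sample)
low, high = pa_bounds(groups["long"])
print(groups)
print(low, high)
```

In practice each group would hold dozens of real careers per position, and the same two functions would be run once per position and once for each catcher population.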

13i) Determine the position/positions of the player (you may use more than one if you wish) and whether his career fits into the short, medium, medium-long, or long groups. For the latter look at both number of seasons and his projected career plate appearances compared to the group maximum and minimums, calculated in 13h.

13j) For each age we project the player to have been in MLB, find the average plate appearances of all players in his group at his age and that of those at his position in his group.

13k) For every age he is projected into the majors, find the ratio of his plate appearances at that age to a reference age. Use age 30 unless the age-30 season of the player in question was unusually low or high in plate appearances for him.

13l) Now, referring to the player’s projected age-30 (or other reference age), for each age projected in the majors, multiply the result for that age from 13k by his age-30 season.

13m) Turn to previous calculations of the mean career plate appearances at his position in his group (short/medium/medium-long/long), subtract that mean from the maximal playing-time boundary found in 13h and divide by the median number of seasons for the player’s position-group then divide by two. That results in the seasonal standard deviation of plate appearances.

13n) For each age the player is projected into the majors, multiply 13m times two and add it to the result for the appropriate age calculated in 13l. That yields the maximum number of plate appearances we will give the player in that season.

13o) If the player’s projected plate appearances fall under the age-maximum in 13n, stop because you have your final number of plate appearances. If the player exceeds that figure, assign him the figure determined in 13l.

13p) Lastly, this all assumes a 154-game schedule. Adjust down to the appropriate schedule length for all seasons prior to 1904 as well as 1918 and 1919, or up to 162 games for any seasons after 1960 (AL) or 1961 (NL).

Don’t worry, it’ll make sense when we run through the example. The primary motivator of this very detailed and lengthy sub-routine is to avoid instances where the player accumulates 400–600 plate appearances as a teenager or very young player when no one in the majors would do so. It also avoids the situation where a declining player picks up 600 plate appearances into his forties and instead declines in a normal major-league fashion.

Now we look for players with similar careers and styles to our man. We look for players at his position (or a similar one), roughly during his time, with similar offensive and defensive profiles (OPS+ and Rfield are good barometers here), and who fall roughly within a similar number of seasons as a regular. Once we have a bunch of them identified, 10–12 is best when possible, we look at their playing time, especially for and across the ages that we are including in our MLE. We look for a rough number of plate appearances to shoot for.
14) Finally, we remove any seasons that at a very young or very old baseball age appear to be below replacement level. This step ensures that we don’t give playing time to fellows who would not have earned it but who in the Negro Leagues could have gotten the call earlier or stayed in the league longer thanks to differences in quality of play and the talent pool. A good rule of thumb is to eliminate any season off the end of a player’s career after two straight sub-replacement years.

Armed with the information in #11, we adjust very early and very late seasons first to fit both general trends among all players and specific trends among comps. This should get us pretty close to our goal. Beyond that, we can add or trim as necessary to get to a career with a reasonable playing-time approximation.

Now that we have an initial estimate of the player’s MLB Rbat for the season and a more realistic estimate of his playing time, we can move on…finally.

Estimating Baserunning

BBREF uses a regression-based formula derived from stolen-base information to estimate baserunning for seasons prior to 1930 (after which they have complete play-by-play data to rely on). But we don’t have quite the same level of information they do for our Negro Leagues players. We do, however, have enough to take a swing at it in a different way. Please see this explainer for details.

We can do something along the lines of the investigations of prewar baserunning we did for Sam Rice and others. Here’s how we’ll do it.

15) For each season with the necessary information, find a player’s stolen bases per opportunity.
16) Find the same for his teams.
17) Find the same for his originating leagues.
18) Adjust his rate for his own team’s tendency to run or not.
19) Compare the team-adjusted rate to his league and figure his percentage of steals above or below league average.
20) Now, find MLB players from the PBP era with similarly long careers and find similar percentages of SB above the league. Because Negro Leagues boxscores may not always have carried stolen base info, it’s OK to pad this by as much as double for players with very speedy reputations, so, for example, 125% of league becomes 150%. (This is kind of a pain, so we’ve only used four to six at a time, 10 would be better.)
21) Find the average Rbaser/PA of those MLB comps and apply it on a per-season basis to the player’s estimated PAs.
22) If the candidate has a pronounced decline in his net steals versus the league, sculpt the trajectory of his running runs appropriately.

Estimating Double-Play Avoidance

We only estimate runs from double-play avoidance (Rdp) from 1930 onward because this is when BBREF’s data kick in, and we do it indirectly.

23) Identify lots of MLB batters of the same handedness, similar Rbaser, and similar career length, and calculate their Rdp/PA.
24) Apply the group’s average in Step 23 to each season of the player’s career.

Estimating Fielding Runs

The samples in the Negro Leagues are pretty small, so we need to mix together the DRA information we see in the Negro Leagues Database with a good dose of real MLB careers. This will give us a value to plug into the column called Rfield on BBREF, though we aren’t using Total Zone or Defensive Runs Saved as they do because the Negro Leagues Database helpfully uses DRA (Defensive Regression Analysis) instead. But first, we need to determine our man’s position.

25) Determine the player’s position for a given season or career by examining where he played in the originating league and how well he played there. If he started at shortstop, was bad at it, then moved to another position and was average or good, we might consider putting him at the latter position all the time. This is a subjective judgment, and we should look at real big-league careers for examples.
26) Find the player’s DRA/G rate in his originating league at whatever position or positions he will be placed at in our MLE.

November 26th, 2017: Please note that we’ve created a more objective means to generate fielding estimates, so step 26 is now out of date. Here’s what step 26 looks like now:

26) Find the player’s career DRA/154 games.

26a) Find the fifty players with the highest number of appearances in the Negro Leagues at the player’s position (because the Negro Leagues Database lacks DRA for some seasons where it lacks games, only count the appearances in seasons that include DRA), and for each figure their career DRA/154 at that position.

26b) Find the standard deviation of DRA/154 among the 50 players in 26a.

26c) Repeat the process in steps 26a and 26b for the major leagues for the same period of time the Negro Leagues Database covers (1887–1948 as of this writing), substituting BBREF’s Rfield for DRA. (Technically, I used the BBREF Play Index, setting it to seek, say, shortstops with more than 50% of their games at the same position, and returning their career Rfield. This will have to be close enough because otherwise, we’d be at it for weeks.)

26d) Divide the player’s DRA/154 (step 26) by the standard deviation of the Negro Leagues players at his position (step 26b), then multiply by the standard deviation result for MLB players in step 26c. This is the player’s MLB career Rfield/154.

27) Apply that rate to the number of games in a season imputed by the estimated PAs we’ve assigned earlier in this process.
28) Check whether his defensive performance declined over time, and make any seasonal adjustments necessary to mirror that.
29) Double check against real MLB careers to see if the number of fielding runs generated are reasonable.
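
Steps 26d and 27 boil down to two lines of arithmetic, sketched here in Python. All of the input numbers are invented for illustration, and the function names are ours rather than anything from BBREF or the Negro Leagues Database.

```python
def mlb_rfield_per_154(dra_per_154, nel_sd, mlb_sd):
    """Step 26d: rescale a DRA/154 rate by the ratio of positional spreads."""
    return dra_per_154 / nel_sd * mlb_sd


def season_rfield(rfield_per_154, estimated_games):
    """Step 27: apply the career rate to one season's estimated games."""
    return rfield_per_154 * estimated_games / 154


# Hypothetical inputs: +6.0 DRA/154, a Negro Leagues positional SD of 8.0,
# an MLB positional SD of 5.0, and a 146.51-game season estimate.
rate = mlb_rfield_per_154(6.0, nel_sd=8.0, mlb_sd=5.0)
print(rate)
print(season_rfield(rate, 146.51))
```

Steps 28 and 29 stay manual: the decline shaping and the sanity check against real careers are judgment calls that a formula like this doesn't capture.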

Determining Positional Runs

30) We do this exactly the same way that BBREF does here, based on the position we have assigned the player.

Calculating Runs and Wins Above Average

31) Now for each season we add the player’s Rbat, Rbaser, Rdp, Rfield, and Rpos to get his Runs Above Average (RAA).
32) To convert that to Wins Above Average, we follow BBREF’s instructions here.

Calculating Replacement Runs and Wins Above Replacement

33) We calculate replacement runs (Rrep) just as BBREF instructs us here.
34) Next we turn those Rrep into the player’s replacement-level wins per BBREF’s instructions here.
35) Finally, we add the WAA and replacement-level wins to get WAR.

The last thing we need to do is simply make sure that the MLE isn’t out of whack with real players and leagues. If we’ve estimated Josh Gibson for 1000 Rbat, we’re making a mistake. We are also making a mistake if we estimate him at 200. It’s always good to double check our work.

We’re now topping 1600 words. Are we there yet? No. Now we’ll get into the dirty-finger-nail details. Let’s take Oscar Charleston for a spin, and see how this all plays out in reality. There’s going to be a lot of moving parts, and if you’re following along at home, you’ll want to get another beer now.

A real example: Oscar Charleston, 1921

Now, we’ll run through this with Oscar Charleston’s 1921 season. This will reveal some nitty gritty details about what performance measures we use, and how we place players into an MLB run-context. I’ve only given you a framework, but you can use any old measurements or transformations you want to.

OSCAR CHARLESTON 1921
Originating league: Negro National League (NNL)
Originating team: St. Louis Giants
Destination league: 1921 NL

We have chosen our default destination league as the National League. We use the AL only when a player’s first appearance is in it.

1) Find the rate of player’s offensive performance

I’ve chosen to use Bill James’ Runs Created (the 2002 version) due to its relative simplicity. Not that any run estimators are all that simple, and we’re going to turn it into RC/PA. But first I need to address three small things: strikeouts, grounded into double plays (GIDP), and reached on error (ROE). BBREF creates estimates for these and/or simply includes them in its batting-runs estimate for players. Although we want to maintain some degree of compatibility with them, very little of the data we will work with includes this information. To keep things as simple as we can, we will not be assigning a player an estimate of what the average hitter in his league would accumulate in those categories. By not including them, we simply assume the player in question and every other player in the league are average in these categories. By doing so, when we place the player into an MLB context, we won’t need to make any further adjustments for the lack of this information. We just assume once again that he is average in these regards in MLB. For most hitters, these are not a huge source of credits or debits, but it will help or hinder certain types of hitters. Sometimes you go to translation with the data you have. It’s worth noting, however, that we will estimate a player’s GIDP-avoidance value for seasons after 1948, as you’ll see below, so at least there they can recoup or de(?)coup some value.

We will only be using Charleston’s 1921 NNL season, and we will not include his play in five games against Major League players. That might sound odd, but with such small samples involving only two teams, and the MLB team not always comprised solely of MLB caliber players, it gets dicey fast.

Now we can run the RC2002 formula. BBREF does not include steals and caught stealing in batting runs (Rbat), nor intentional walks and sac bunts, so we don’t either. This has gone on plenty long, and you can find James’ equation at the link I shared. Charleston bashed his way to 93 runs created in 339 PA, or 0.273/PA. Hitting .433/.512/.736 (250 OPS+) will do that. [Note: All numbers cited from here on from the Negro Leagues Database are from July 25, 2020 and may no longer be current when you read them.]

2) Compare to his own league

As we did with pitchers, we’re using z-scores. Charleston’s 0.273 RC/PA came in a league with a 0.109 mean RC/PA. The league’s Standard Deviation in that department was 0.092:

( 0.2730 – 0.1045 ) / 0.0912 = 1.851

I’m going to carry four decimal places due to significant digits. Please give your eyes my apologies.

3) Place into MLB context

The NL of 1921 had a mean RC/PA of .09885 and STDEV of .0826, therefore:

( 1.851 * .0826 ) + .09885 = 0.2516 RC/PA
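
Steps 2 and 3 are a standard z-score round trip: measure the player against his own league, then re-express that distance in the destination league's units. Here is a small Python sketch. The function names are ours, and the only inputs used are the z-score and the 1921 NL figures quoted in step 3.

```python
def z_score(rate, league_mean, league_sd):
    """Standard deviations above (or below) the originating league's mean."""
    return (rate - league_mean) / league_sd


def to_destination_league(z, dest_mean, dest_sd):
    """Re-express a z-score as RC/PA in the destination league."""
    return z * dest_sd + dest_mean


# Step 3 inputs as printed above: z = 1.851, 1921 NL mean .09885, STDEV .0826.
nl_rc_per_pa = to_destination_league(1.851, dest_mean=0.09885, dest_sd=0.0826)
print(round(nl_rc_per_pa, 3))
```

The two functions are inverses of each other when fed the same league, which makes for a handy sanity check on any spreadsheet version of this step.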

4) Adjust for the quality of his league

We rate the NNL of 1921 at just above AAA level, 0.85 of MLB. If the player’s estimate is positive, we multiply his RC/PA by the QOP factor. If negative we divide by it. Charleston’s is, naturally, positive:

0.2516 RC/PA * 0.8500 = 0.2139 RC/PA

But we use it at two-thirds strength. I ran a wimpy little regression (says the untrained statistician) in excel that suggested the length of the schedule was perhaps responsible for a third or so of the variation observed in the Negro Leagues. Thus:

( ( ( 1 – 0.85 ) * 0.33 ) + 1 ) * 0.2140 RC/PA = 0.221 RC/PA
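
As a rough Python sketch of step 4: the first function applies the full-strength discount (multiply a positive rate, divide a negative one), and the second applies the give-back formula as it is printed above, with 0.33 as the give-back factor. The function and parameter names are ours.

```python
def apply_qop(rc_per_pa, qop):
    """Full-strength discount: multiply a positive rate, divide a negative one."""
    return rc_per_pa * qop if rc_per_pa >= 0 else rc_per_pa / qop


def soften_qop(discounted, qop, give_back=0.33):
    """Hand back a share of the discount, per the second formula above."""
    return ((1 - qop) * give_back + 1) * discounted


full_strength = apply_qop(0.2516, 0.85)
print(round(full_strength, 4))
```

Keeping the two pieces separate makes it easy to try a different give-back share if a sturdier regression ever suggests one.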

5) Adjust for his park

Using either the same park-factor calculations as we showed you in the article on our MLE process for pitching or, as in this case, park factors graciously supplied by the good folks at the Negro Leagues Database, we have a 1.0043 factor for the 1921 St. Louis Giants. We use it at half strength since teams typically play half their games at home.

0.2139 / ( ( ( 1.0043 – 1 ) / 2 ) + 1 ) = 0.21344 RC/PA
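
The half-strength park adjustment can be sketched in Python like so. The `home_share` parameter is our own name for the "half his games at home" assumption, not something the Negro Leagues Database supplies.

```python
def park_adjust(rc_per_pa, park_factor, home_share=0.5):
    """Divide out the park factor, diluted by the share of games played at home."""
    return rc_per_pa / ((park_factor - 1) * home_share + 1)


# 1921 St. Louis Giants: park factor 1.0043, used at half strength.
adjusted = park_adjust(0.2139, 1.0043)
print(round(adjusted, 5))
```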

6) Adjust for his strength of schedule

If this information is available, we multiply by the strength of schedule discount if the player’s RC/PA is positive, or we divide if it is negative. We don’t yet have that information available, so we needn’t take action here. Lucky for us, however, the NLDB combines park and schedule so the number we used in Step 5 above takes both into account.

We have now translated Oscar Charleston’s 1921 batting performance into a neutral 1921 NL context.

7) Express as RC, figure his translated Rbat, then express the result per PA

Charleston batted 339 times, so we multiply by his 0.2134 RC/PA to get 72.3426 RC. We turn that into Rbat by subtracting the number of runs an average 1921 NL hitter would accumulate in 339 PAs. The NL mean in step 3 was .09885 RC/PA, which in 339 PA is 33.5102. So Charleston picked up 38.8324 Rbat in those 339 PA. That boils down to .1145 Rbat/PA.
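
Here is the same step 7 arithmetic as a short Python sketch, using only the figures quoted above. The function name is ours.

```python
def translated_rbat(rc_per_pa, pa, league_rc_per_pa):
    """Return (Rbat, Rbat/PA): runs created beyond a league-average hitter."""
    rc = rc_per_pa * pa                # translated runs created
    league_rc = league_rc_per_pa * pa  # what an average hitter does in the same PA
    rbat = rc - league_rc
    return rbat, rbat / pa


rbat, rbat_per_pa = translated_rbat(0.2134, 339, 0.09885)
print(round(rbat, 4), round(rbat_per_pa, 4))
```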

8) Create a rolling average of Rbat to create the final MLE rate of batting performance.

Same thing we did with our pitchers’ performance. Charleston batted 339 times, exceeding our 200 PA threshold by plenty. Therefore, we will use .1145 Rbat/PA that we figured in Step 7 rather than a rolling average. Why 200? Because in the major leagues, 200 plate appearances is the threshold at which batting stats stabilize. Now, we do recognize that these are not the major leagues, but we decided to use this nonetheless because at least it had some research behind it.

5-year rolling average centered on the year in question.

( Year N * 0.60 ) + ( Year N+1 * 0.15 ) + ( Year N-1 * 0.15 ) + ( Year N+2 * 0.05 ) + ( Year N-2 * 0.05 )

In Charleston’s case:

( .109 * 0.60 ) + ( 0.120 * 0.15 ) + ( 0.072 * 0.15 ) + ( 0.048 * 0.05 ) + ( 0.06 * 0.05 ) = 0.100 Rbat/PA
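
The weighted rolling average above is easy to check in Python. This sketch uses the five rates and weights exactly as printed; the dictionary layout is just our way of keeping each season next to its weight.

```python
# Offsets are relative to the season in question (0 = Year N).
WEIGHTS = {0: 0.60, 1: 0.15, -1: 0.15, 2: 0.05, -2: 0.05}


def rolling_rbat_per_pa(rates_by_offset):
    """Weighted average of Rbat/PA over seasons N-2 through N+2."""
    return sum(WEIGHTS[offset] * rate for offset, rate in rates_by_offset.items())


charleston = {0: 0.109, 1: 0.120, -1: 0.072, 2: 0.048, -2: 0.06}
print(round(rolling_rbat_per_pa(charleston), 3))
```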

9) Estimate the player’s games played into an MLB schedule based on his in-season and/or career durability records.

We will credit the player all the games he actually played then apportion the rest of the destination league’s schedule based on his career games played per team-game.

( ( 154 – team games ) * career games / total team games ) + games

The St. Louis Giants played 79 games, and Charleston appeared in 77 of them. At the career level, we only count those games in seasons we will include in his final MLE. We are not counting 1915 and 1916, and we estimate that Charleston played 1,456 games among his team’s 1,571 contests, or 92.68%. Thus:

( ( 154 – 79 ) * 0.9268 ) + 77 = 147 (146.51 to be exact)

We cap this at 95% of the destination league’s schedule to avoid having too many seeming iron men.
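
A Python sketch of the step 9 formula follows. The optional `cap_share` argument is our own addition for the 95% ceiling; leaving it at `None` reproduces the uncapped arithmetic shown above.

```python
def estimate_games(games, team_games, career_rate, schedule=154, cap_share=None):
    """Actual games plus a durability-weighted share of the remaining schedule."""
    estimate = (schedule - team_games) * career_rate + games
    if cap_share is not None:
        # Optional ceiling as a share of the destination league's schedule.
        estimate = min(estimate, cap_share * schedule)
    return estimate


# Charleston 1921: 77 games of his team's 79, career rate 1456/1571 = 0.9268.
print(round(estimate_games(77, 79, 0.9268), 2))
```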

10) Apply the destination league’s PA/game to those estimated games

The 1921 NL had 4.26 PA per game per lineup slot, so 146.51 * 4.256 = 623.5466 PA

If we stopped here, Oscar would have 71.3961 Rbat, which is a healthy total. Now we’re going to fine-tune our playing time estimates and then work on running, DP avoidance, and fielding.
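
Step 10 and the running Rbat total are two multiplications, sketched here in Python with the figures quoted above. The function names are ours.

```python
def estimate_pa(estimated_games, pa_per_game):
    """Step 10: estimated games times the destination league's PA per game."""
    return estimated_games * pa_per_game


def season_rbat(pa, rbat_per_pa):
    """Initial MLE batting runs: estimated PA times the translated rate."""
    return pa * rbat_per_pa


pa_1921 = estimate_pa(146.51, 4.256)
rbat_1921 = season_rbat(pa_1921, 0.1145)
print(round(pa_1921, 4), round(rbat_1921, 4))
```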

Charleston got a lot of playing time at ages 18–20. As we’ve demonstrated previously, almost no one appears very often at that age. But players do appear. Charleston’s equivalent production for those years isn’t bad, so we’ll include them, but instead of as a full-time player, we’ll give him a very small number of PAs at 18 and 19, with an increasing number at age 20, then roughly full-time play from age 21 onward.

Charleston also played deep into his forties. Most players start to sputter badly around 37 or 38, and his last productive equivalent year is probably age 37. So we’ll wind up his career at age 39 with sharp decreases in his playing time.

12) Next, we look at the player’s biographical material. If there are any injuries that would keep him out of the lineup, we see if our PA estimate reflects it appropriately. Because of the less stable nature of the Negro Leagues, we also look for league/team jumping, and other oddball movements that would affect playing time but didn’t occur in MLB. We adjust accordingly.

Charleston rarely missed a game, and we could find no evidence of extended absence due to injury. His 1919 season does have a weird two-team element to it that sheds light on how careful we need to be. The amazing Gary Ashwill told us that in 1919, Charleston played in 24 of the Chicago American Giants’ 24 league games through August 3rd. However, in the last days of July, race riots broke out in the city, and the American Giants’ home field (Schorling Park) was occupied by the Illinois state militia, forcing the cancelation of a series with the Atlantic City Bacharachs. The Giants were in Detroit when the riots erupted and stayed put but had no games scheduled elsewhere. Meanwhile, the Detroit Stars and Hilldale Club were about to kick off a series. So the three clubs ended up playing a three-way doubleheader on August 3rd. Charleston played in the first game for his own team versus Detroit, then for Detroit in game two, while other Chicagoans played for Detroit as well. He played in three others for Detroit as well. Charleston rejoined his own club for a contest with the Stars on August 9th, and overall played 17 out of their final 18. If you just looked at the stats, it would appear that Charleston played 41 of 42 for Chicago and 5 of 41 for Detroit, or 46 out of a possible 83 games. Or perhaps you’d split the difference and call it 46 of 63 or something. In fact, he played 24 of 24 for Chicago, 5 of 5 for Detroit, and 17 of 18 for Chicago, for a grand total of 46 of 47. Only in the Negro Leagues.

It would be good for us to quickly discuss winter league play here as well. We include it for sure, and we ultimately combine it with summer play. First, however, we work it through all the steps for translating offensive performance. Right before step 8, however, we combine any same-season stints by taking an average weighted by PAs. For purposes like this, we consider Opening Day of the Negro Leagues season as day one of a given season year. So the 1927–1928 winter ball season, or any winter ball played in the first three or four months of 1928 all counts toward 1927’s batting record.
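
Combining same-season stints by a PA-weighted average looks like this in Python. The winter-ball line in the example is invented purely for illustration; only the 0.1145 Rbat/PA over 339 PA comes from Charleston's 1921 summer.

```python
def combine_stints(stints):
    """PA-weighted average of translated Rbat/PA across same-season stints.

    `stints` is a list of (rbat_per_pa, pa) pairs.
    """
    total_pa = sum(pa for _, pa in stints)
    return sum(rate * pa for rate, pa in stints) / total_pa


# Summer line from above plus a made-up winter stint of 120 PA at 0.080.
combined = combine_stints([(0.1145, 339), (0.080, 120)])
print(round(combined, 4))
```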

13) We’ll save you all of the lengthy discussion of the background stats and the age-by-age calculations. The big picture is that we used long-career center fielders, left fielders, and first basemen. Averaging them together, from ages 18–39, we get a career just shy of 12,000 plate appearances. When we take our due care with each season, that number decreases to 11,430 plate appearances. Thanks to the Great War, Charleston doesn’t project to a 600-plate-appearance season until age 24, but thanks to his durability and those in the majors with long careers, his last 600-plate-appearance season comes at age 37.

Specifically in 1921, the combination of long-career players at center field, left field, and first base at age 24 averaged 654 plate appearances. That figure was 1.0187 times larger than his reference-season at age thirty. The three positions average a 78.5648 standard deviation per season of plate appearances. We peg Charleston at 643 plate appearances at age 30, meaning that based on our long-career positional analysis, Oscar is in for 655.0241 plate appearances, and the maximum number of plate appearances actually exceeds the record for plate appearances in a season. Anyway, we needn’t worry because our estimate for 1921 is 624 plate appearances and well within our boundaries, so we need go no further.
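
Steps 13l, 13n, and 13o for the 1921 season can be sketched in Python using the numbers in the paragraph above. The function names are ours.

```python
def trajectory_and_ceiling(reference_pa, age_ratio, seasonal_sd, n_sd=2):
    """13l: scale the reference season by the age ratio.
    13n: add n_sd seasonal standard deviations to get the ceiling."""
    trajectory = reference_pa * age_ratio
    return trajectory, trajectory + n_sd * seasonal_sd


def final_pa(projected_pa, trajectory, ceiling):
    """13o: keep the projection unless it tops the ceiling; then use 13l."""
    return projected_pa if projected_pa <= ceiling else trajectory


# Age-30 reference of 643 PA, age-24 ratio of 1.0187, seasonal SD of 78.5648.
trajectory, ceiling = trajectory_and_ceiling(643, 1.0187, 78.5648)
print(round(trajectory, 4), round(ceiling, 4))
print(final_pa(624, trajectory, ceiling))
```

Because the 624 PA projection sits under the ceiling, the function hands it back unchanged, which is the "we need go no further" case.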

Charleston is a little difficult because the only centerfielders prior to the 1960s who hit like him are Cobb and Speaker. Charleston’s record doesn’t suggest that he was able to keep the high-octane performances of his twenties going into his thirties. More like Ken Griffey, Jr., in this regard than Cobb and Speaker. So centerfielders actually don’t provide a useable set of comps.

So we turned to heavy-hitting corner outfielders who aren’t on the Ruth level: Rbat greater than 300 but not above 600. Defensively, Charleston wasn’t great anywhere in the outfield, at least according to DRA. We take DRA’s arm runs out of the picture entirely because they aren’t all that trustworthy. But Charleston was a good first baseman. So, in general, we’re not overly worried about comping defensive performance, except we don’t want any slow-footed sluggers because Charleston was athletic, at least until he put on weight in his late twenties and thirties.

The list that BBREF’s Play Index spat back included: Al Simmons, Paul Waner, Sam Crawford, Jesse Burkett, Fred Clarke, Goose Goslin, Willie Keeler, Zack Wheat, Sherry Magee, Joe Kelley, Harry Heilmann, Joe Medwick, and Ed Delahanty. These guys had a median 9519 career PA. But we also need to account for differing schedules, especially prior to 1904. When we adjust everything to a 154-game slate, they come out at an average of 9622 and a median of 9992. So our target looks like 9600–10000 PA.
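One simple way to make that schedule adjustment, sketched here as an assumption rather than the exact spreadsheet formula we used, is to prorate each season’s PA by 154 over that season’s scheduled games. The season figures below are made-up placeholders, not any real player’s record.

```python
def prorate_to_154(pa, scheduled_games, target_games=154):
    """Scale one season's plate appearances to a 154-game schedule."""
    return pa * target_games / scheduled_games


# Hypothetical (PA, scheduled games) pairs, for illustration only.
seasons = [(520, 132), (600, 140), (650, 154)]

adjusted = [prorate_to_154(pa, games) for pa, games in seasons]
career_pa = sum(adjusted)

print([round(x) for x in adjusted])  # short-schedule seasons get scaled up
print(round(career_pa))
```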

14) Armed with the information in #11, we adjust very early and very young seasons first to fit both general trends among all players and specific trends among comps. This should get us pretty close to our goal. Beyond that, we can add or trim as necessary to get to a career with a reasonable playing-time approximation.

We already know that we won’t be counting Charleston’s seasons after age 40, and we also know that for his early seasons, we’ll cut way back to norms more like the major leagues, so we’re all set.

Among the comps we’ve selected, none played at age 18, four at age 19 (averaging 60 PA), seven at age 20 (average 145 PA, median 57). At age 21, they stepped up to about 350–400 PA, then from age 22 through 34 were full-time players. They began to sputter at 35. At age 36, as a group, their playing time is in full decline, and we can start to hear the death rattle at 37. Just three of them appeared after age 39. So this gives us a good idea of the shape of a career, and we’ll use Charleston’s actual production in concert with this information from comps and our info on very old and young players noted above to create estimates for his time. In 1921 specifically, Charleston was 24, and the 621 PA initial estimate we created is solid.

Now for baserunning.

When we follow the guidance in the explainer linked above, we find that across his career, Charleston’s speed translates to about 1.84 runs every 650 plate appearances. He stole 32 bases in 1921, a swipe about every fourth time he reached base, so it is possible we underrate his baserunning this season. We are relying on central tendency, however, in the absence of more complete data, and this process nets him a handsome 32 career runs on the basepaths. For 1921, at 624 plate appearances, there’s 1.78 runs in his shoes.

15) For each season with the necessary information, find a player’s stolen bases per opportunity.

With no play-by-play information, we calculate opportunities as times on base minus extra-base hits. So for Charleston in 1921 we get:

32 SB / ( ( 123 H – 44 XBH ) + 41 BB + 5 HBP ) = 0.256 SB/OPP
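Here is the same calculation as a short sketch, using the 1921 figures quoted above. The function name is ours.

```python
def sb_per_opportunity(sb, hits, xbh, bb, hbp):
    """Stolen bases per opportunity, where opportunities are
    times on base minus extra-base hits (no play-by-play needed)."""
    opportunities = (hits - xbh) + bb + hbp
    return sb / opportunities


rate = sb_per_opportunity(sb=32, hits=123, xbh=44, bb=41, hbp=5)
print(round(rate, 3))  # 0.256
```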

16) Find the same for his teams.

By the same formula, the St. Louis Giants stole 0.161 bases per opportunity.

17) Find the same for his originating leagues.

The league’s SB/OPP was 0.12.

18) Adjust his rate for his own team’s tendency to run or not.

( lgSB/OPP / tm SB/OPP ) * SB/OPP

For Charleston:

0.12 / 0.161 * 0.256 = 0.191 adjSB/OPP

19) Compare the team-adjusted rate to his league and figure his percentage of steals above or below league average.

0.191 adjSB/OPP / 0.12 lgSB/OPP = 159%

Charleston stole 59% more bases than his leagues.
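Steps 18 and 19 can be sketched together, starting from the 0.256 SB/OPP rate found in step 15 and the team and league rates quoted above. The function names are ours.

```python
def team_adjusted_rate(player_rate, team_rate, league_rate):
    """Step 18: scale the player's rate by how much his team ran
    relative to the league."""
    return league_rate / team_rate * player_rate


def pct_of_league(adjusted_rate, league_rate):
    """Step 19: team-adjusted rate as a fraction of the league rate."""
    return adjusted_rate / league_rate


adj = team_adjusted_rate(player_rate=0.256, team_rate=0.161, league_rate=0.12)
pct = pct_of_league(adj, league_rate=0.12)

print(round(adj, 3), round(pct * 100))  # 0.191 159
```

Note that the league rate cancels out when the two steps are chained, so the 159% is simply the player’s rate divided by his team’s rate.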

20) Now, find MLB players from the PBP era with similarly long careers and find similar percentages of SB above the league. Because Negro Leagues boxscores may not always have carried stolen base info, it’s OK to pad this by as much as double for players with very speedy reputations, so, for example, 125% of league becomes 150%. (This is kind of a pain, so we’ve only used four to six at a time; 10 would be better.)

Charleston was known as a fast player, at least early on, so we’ll give him a little padding. We located a few long-career players in MLB’s play-by-play era who stole 70% to 90% more than their leagues: Barry Bonds (+79%), Paul Molitor (+88%), Craig Biggio (+74%), and Omar Vizquel (+82%).

21) Find the average Rbaser/PA of those MLB comps and apply it on a per-season basis to the player’s estimated PAs.

We figured this and expressed it per 10,000 PA to give us some context. As it turns out, these guys averaged 35 Rbaser/10,000 PA. We decided to push up to +40 because these fellows’ lines include a lot of seasons after age 37 (when Charleston is going to get his last big hit of playing time) and Gant broke a leg in the middle of his career.

22) If the candidate has a pronounced decline in his net steals versus the league, sculpt the trajectory of his running runs appropriately.

That said, Oscar’s stolen base totals went into the toilet after age 31, so we’re going to need to stack up most of his baserunning value in the first half of his career then decrement him after age 31. When we did this, we got 37 total, and for 1921 he earns 3 runs on the bases.

Now for GIDP avoidance.

23) Identify lots of MLB batters of the same handedness, similar Rbaser, and similar career length, and calculate their Rdp/PA.

We only take this step for seasons after 1929. BBREF uses play-by-play data to determine how many double play opportunities a player had and then compares the player to the league’s average. We can’t do that, so we do our best. Handedness, speed, ground ball/fly ball rates, and strikeout rates are the main determinants for GIDP rates. Most of the time, we only have the first two, so we find comps with the same career length based on handedness and our Rbaser estimation. We see how many Rdp the comps had per PA. Charleston’s closest 47 comps netted .0018 Rdp/PA.

24) Apply the group’s average in Step 23 to each season of the player’s career.

We don’t need to do this for Charleston since 1921 is before our 1930 cutoff. But if BBREF should add more PBP and calculate Rdp for pre-1930 seasons, we’ll follow suit. For Charleston’s seasons after 1929, he’s getting about 1.2 runs per 650 plate appearances, adding up to 7 Rdp from 1930 to when we close his career in 1936.
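Applying the comps’ rate is just a multiplication, sketched below with the .0018 Rdp/PA figure from step 23. The function name is ours, and the 650 PA season is the round number used in the text rather than any specific Charleston season.

```python
def rdp_for_season(rdp_per_pa, pa):
    """Step 24: apply the comps' average Rdp/PA to one season's estimated PA."""
    return rdp_per_pa * pa


print(round(rdp_for_season(0.0018, 650), 2))  # 1.17, i.e. about 1.2 runs
```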

Now, for fielding, things are going to get really mushy.

Charleston was known to play a very shallow centerfield and was often compared to Tris Speaker. However, he put on weight and was essentially done as a top-flight outfielder by his late 20s. Seeing this, we’ve kept him in centerfield until age 28, shifted him to leftfield until age 32, then to first base for the rest of his career.

26) Find the player’s DRA/G rate in his originating league at whatever position or positions he will be placed at in our MLE.

For an outfielder, we probably need to disregard DRA’s arm value. In fact, Charleston’s arm value is negative, but he was known for having a rifle, so we need to build that into our estimate. In addition, Charleston’s career range value for centerfield is negative, but we are missing six seasons of defensive stats for him. When the numbers and the stories of a career don’t match, we can’t dismiss the narrative. So we’ve given Charleston an average of 0.05 fielding runs per game, which figures to around 6 or 7 runs in a full season. For a sense of scope, Andruw Jones and Kevin Kiermaier rack up 20+ runs in their best seasons. Let’s run through the rest of his career too.

We are going to place him in left field beginning at age 29. We learned earlier that Charleston was slowing down rapidly around this age due to putting on weight, so in leftfield, he’ll start out above average by about the same rate he was in center, then quickly drop down to below average in four years. Finally when he moves to first base, where he had decent DRA totals, we’ll make him a little above average. He ends up with +51 defensive runs.

November 26th, 2017: Please note that we’ve created a more objective means to generate fielding estimates, so step 26 is now out of date. Here’s what step 26 looks like now for Charleston:

26) Find the player’s career DRA/154 games.

Charleston’s career DRA/154 games in centerfield through age 29 (when we’ll move him to left field) was 44.3 runs.

26b) Find the standard deviation of DRA/154 among the 50 players in 26a.

That standard deviation is 17.01.

26c) Repeat the process in steps 26a and 26b for the major leagues for the same period of time the Negro Leagues Database covers (1887–1945 as of this writing), substituting BBREF’s Rfield for DRA. (Technically, I used the BBREF Play Index, setting it to seek shortstops with more than 50% of their games at the same position, and returning their career Rfield. This will have to be close enough because otherwise, we’d be at it for weeks.)

For MLB it is 3.14.

44.3 / 17.01 * 3.14 = 2.05 runs/154 games

27) Apply that rate to the number of games in a season imputed by the estimated PAs we’ve assigned earlier in this process.

We mentioned just now that we’re giving him 0.05 Rfield per game, which in 1921 will net him 7 Rfield in 146 games.

We project Charleston to 146 games, so he will get 2.05 runs / 154 games * 146 games = 1.9 Rfield. He ends up with 32 career fielding runs.
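The updated step 27 is a straight proration, sketched here with the 2.05 runs per 154 games rate and the 146 projected games. The function name is ours.

```python
def rfield_for_season(runs_per_154, games):
    """Step 27: prorate a per-154-game fielding rate to a season's games."""
    return runs_per_154 / 154 * games


print(round(rfield_for_season(2.05, 146), 1))  # 1.9
```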

28) Check whether his defensive performance declined over time, and make any seasonal adjustments necessary to mirror that.

We covered all of this a little earlier.

29) Double check against real MLB careers to see if the number of fielding runs generated are reasonable.

Yes, we think so. From 1871–1960, among players who played at least 40 percent of their games in centerfield, +29 runs would place him near Steve Brodie and Terry Moore. The true greats like Speaker, Carey, Ashburn have more than 80 Rfield, and many of the players with higher Rfield totals have many fewer PAs. Additionally, some 19th-century players, playing under shorter schedules, would rank higher than Oscar given a 154-game slate, or if they already rank higher, would put much more distance between them and him.

Now we’ve got all the difficult stuff out of the way, and we’re on the WAR express.

30) Figure positional runs. We do this exactly the same way that BBREF does here, based on the position we have assigned the player.

In 1921, Oscar accumulates -3.5 of these. Centerfield was a much more offense-oriented position then than now.

31) Now for each season we add the player’s Rbat, Rbaser, Rdp, Rfield, and Rpos to get his Runs Above Average (RAA).

In 1921, we estimate equivalent values of 71.1 Rbat, 1.8 Rbaser, 1.6 Rfield, and -3.5 Rpos, which total to 71 RAA. Recall that we didn’t calculate Rdp, but if we did, we’d include it here.
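As a quick sketch of step 31 with the 1921 figures just quoted. Rdp is entered as zero here only because we did not calculate it for this season; that placeholder is our choice for the sketch.

```python
# 1921 components quoted in the text; Rdp is a placeholder zero
# because it was not calculated for this season.
components = {
    "Rbat": 71.1,
    "Rbaser": 1.8,
    "Rdp": 0.0,
    "Rfield": 1.6,
    "Rpos": -3.5,
}

raa = sum(components.values())
print(round(raa, 1))  # 71.0
```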

32) To convert that to Wins Above Average, we follow BBREF’s instructions here.

That reckons to 7.0 WAA.

33) We calculate replacement runs (Rrep) just as BBREF instructs us here.

19.3 of those for Charleston in 1921.

34) Next we turn those Rrep into the player’s replacement-level wins per BBREF’s instructions here.

2.0 of those.

35) Finally, we add the WAA and replacement-level wins to get WAR.

And 17 hours later, we’ve got him at 9.0 WAR. That figure would have placed second in the 1921 NL and third in MLB among hitters. Here’s the leaderboard if we include pitchers:

Babe Ruth: 12.6
Red Faber: 11.0
Rogers Hornsby: 10.8
Oscar Charleston: 9.0
Urban Shocker: 8.5
Burleigh Grimes: 8.0
Carl Mays: 7.5
Dave Bancroft: 7.4
Sad Sam Jones: 7.3
Frankie Frisch: 6.9
Harry Heilmann: 6.8
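The final addition in step 35 and Charleston’s placement among hitters can be sketched like this. The WAR values come from the leaderboard above; the pitcher flags are our addition, since the list itself doesn’t mark positions.

```python
# Step 35: WAR is WAA plus replacement-level wins.
waa, replacement_wins = 7.0, 2.0
war = waa + replacement_wins

# (name, WAR, is_pitcher); the pitcher flags are added for this sketch.
board = [
    ("Babe Ruth", 12.6, False),
    ("Red Faber", 11.0, True),
    ("Rogers Hornsby", 10.8, False),
    ("Oscar Charleston", war, False),
    ("Urban Shocker", 8.5, True),
    ("Burleigh Grimes", 8.0, True),
    ("Carl Mays", 7.5, True),
    ("Dave Bancroft", 7.4, False),
    ("Sad Sam Jones", 7.3, True),
    ("Frankie Frisch", 6.9, False),
    ("Harry Heilmann", 6.8, False),
]

# Keep hitters only, sort by WAR descending, and find Charleston's rank.
hitters = sorted((p for p in board if not p[2]), key=lambda p: p[1], reverse=True)
rank = [name for name, _, _ in hitters].index("Oscar Charleston") + 1

print(war, rank)  # 9.0 3
```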

At the career level, among all hitters from 1871–1960, Charleston ends up with:

11430 PA (5th)
544 Rbat (18th)
53.3 WAA (18th)
91.0 WAR (14th)

Given Charleston’s reputation as among the very elite of Negro Leagues players, this MLE could be conservative. And that’s OK if it is because we’re still missing a little bit of data for him (some winter seasons), and because we’d always rather underpromise and overdeliver, at least metaphorically speaking. What I mean is that if we come in with numbers that are sky high, they aren’t going to be credible. We need to arrive at estimates that resemble real MLB players. It’s too easy to trumpet someone’s greatness on the basis of our figures then realize we’ve made an embarrassing error in logic or computation or data entry. When a player has less data attached to him than Charleston (for example, see our write-up on Bullet Rogan), we can’t go all-in on half a career. We need to temper our estimates in the absence of data that could just as easily deflate them as inflate them. In general, we’ve got to be careful. We want this to be about the players, not about us.

We’re exhausted from mansplaining all of this. We can only begin to imagine the depth of horror you’ve experienced reading along. If you have suggestions for improvements, we are all ears. We aren’t perfect at this, we’re just trying our best. Next, we’re going to evaluate position players whose Negro Leagues careers launched them into the Hall of Fame or Hall of Merit. We’ll begin with catchers, so fans of Josh Gibson, Biz Mackey, Louis Santop, and Quincy Trouppe should tune in for that one. And we’ll also give you a delightful bonus surprise.

Discussion

41 thoughts on “Major League Equivalencies for Negro Leagues Hitters”

“We use it at half strength since teams typically play half their games at home.” Not in the Negro Leagues. The better teams typically had more home games than away or neutral site games. In 1921, out of 79 games with Box Scores, St. Louis played 43 games at Giants Park.

Posted by Kevin | October 25, 2017, 12:07 pm

I understand your position on doing the individual elements of runs (singles, doubles, triples, homers). It works, so long as there isn’t an extreme difference in something like home runs. Guys who played in Japan and the majors hit homers at about half the rate in the majors they did in Japan. I don’t see how your approach could successfully deal with that kind of issue. That said, I’m not aware of any circumstances even as extreme as the Baker Bowl in that regard, much less as extreme as the difference in home run rates between the majors and Japan. It’s something I’d be aware of, though.

Posted by Jim Albright | October 25, 2017, 4:03 pm

- Jim – I think what you’re saying is: If we have Ichiro and Matsui both playing on the same team/same season in Japan, and both have 10.0 RC/G, but Ichiro has zero HR and Matsui has 50, then using Eric’s method their MLE’s are going to be identical in RC/G, at 8.5 RC/G or whatever it comes out to be. If you do ‘component based’ MLE then Ichiro will be close to 10.0 but Matsui might come out at 7.0 RC/G. I don’t think either method is ‘wrong’, it just depends on what your goal is in doing an MLE which determines what methodology would be best.
  
  Posted by KJOK | October 25, 2017, 5:35 pm
  
  - The objective in my mind is generally to put all the Negro Leaguers on a single plane, and then to find a general estimate of likely MLB value. We really can’t say with much certainty what any given park or statistical factor might have looked like, especially for players from before the 1940s, so the most defensible thing to do, in my opinion, is to translate their value from one context to another. I do not discount at all examples like Ichiro and Matsui, but then again, in MLB, the difference in offensive value persisted. Matsui: 14.2 Rbat per 600 PA. Ichiro was 13.1/600 through age 36 before he declined.
    
    Posted by eric | October 26, 2017, 9:05 am
    
  - Basically, if you don’t have a park that’s either hell on homers (Griffith Park) or, on the flip side, consistently turns little more than pop flies into round trippers, it shouldn’t be a worry. Maybe the fact it would be only for half the games would reduce the problem, and the mobility of Negro Leaguers would tend to minimize the problem by putting less extreme parks in the home record to further dilute the effect of an extreme park. In Japan, triples are a huge difference, but there’s so few of them it wouldn’t matter much, but that huge difference in homers (six or more times the difference in average) can really change how a player is viewed.
    
    What I would suggest is at least flagging anybody with a situation that can seriously bias things like that, such as raising or dropping homers by a net of twice as much as it affects batting average. It’s not even as terrible for a Josh Gibson, in that he should be so far over the HOF line even with Griffith Park that we’re only talking about underestimating how great he was. But if a guy is in a Griffith Park situation and falls just a little short of the HOF line with your method or is in a bandbox that raises his homers as much as Griffith depresses them but seems a little over the HOF line, your approach could wind up with seriously wrong answers on even a HO in/out call in those cases. At the very least we should know about the extreme home run park guys (over a career) whose park could be responsible for inflating their homers and thus leading to a rating just over the line, or whose park could be responsible for deflating their homers and thus leading to a rating just below the line. Maybe there are no such cases in the Negro Leagues. A very real case in Japan is a guy named Keishi Suzuki. He gave up quite a few homers in Japan, but limited the other stuff well enough to be a member of their HOF. But take those high homer numbers in Japan and translate them to the majors, and I wonder if he’d have been above average because he’d be expected to give up a ton of homers.
    
    Posted by Jim Albright | October 27, 2017, 9:09 pm
    
Actually, I do know of one as bad as the Baker Bowl–Griffith Park, which was a nightmare for hitting home runs. There may be other Negro League parks with influences that large. There, I’d be very cautious that your approach to the individual elements does not lead to serious distortions.

Posted by Jim Albright | October 25, 2017, 4:08 pm

You never lost me, well worth the 6500 words, like Jim, I wonder how many Baker Bowl/Griffith Park stadiums exist, or whether teams didn’t have an even home/road split, but the work here is fabulous, thanks doc!

Posted by Ryan | October 26, 2017, 10:44 pm

I presume you have a copy of Green Cathedrals. It has some Negro League park information on things like distances to center, etc.
v

Posted by verdun2 | October 27, 2017, 9:26 pm

HO midway through my second paragraph of today’s post should be HOF

I think if a park uniformly increases or decreases scoring in terms of average and home runs/XBH, your approach should work fine. But when those two elements have large disparities, it can seriously mess up how you evaluate a player. In Japan, you cite Matsui and Ichiro’s production in the majors. Well, in Japan, because so much of Matsui’s value was in homers, he produced in a quick and dirty calculation 113.7 runs per 600 PA. By contrast, Ichiro produced 87.0 runs per 600 PA, and Ichiro was the one in a DH league in Japan. In the majors, because of the large impact of the reduction of Matsui’s homers, they were far closer than they’d been in Japan, to the point where Ichiro’s defense and base running made him more valuable.

Posted by Jim Albright | October 27, 2017, 9:40 pm

This is great work and very exciting to see!

One minor suggestion: why not compute Rbat directly for each Negro Leagues season rather than going through RC? Computing Rbat directly would be more directly comparable to BBREF’s WAR, and I think it’s supposed to be a more accurate run estimator for individual players. While I think it’s a bit more involved than the RC formula, I do think I figured out how to figure Rbat (aka wRAA, aka Linear Weights) for different run environments, so I’ll go through the 1921 Charleston example below. The key references are BBREF’s explanation of wRAA, and an SQL code with linear weights as a function of run environment:
https://www.baseball-reference.com/about/war_explained_wraa.shtml
http://basql.wikidot.com/woba

The SQL code shows how to derive linear weights for any run environment (well really, within some reasonable range…):
RperOut = lgR/(lgIP*3) = 0.19 for 1921 NNL
rBB = RperOut + 0.14 = 0.33
rHBP = rBB + 0.025 = 0.355
r1B = rBB + 0.155 = 0.485
r2B = r1B + 0.3 = 0.785
r3B = r2B + 0.27 = 1.055
rHR = 1.4
rSB = 0.2
rCS = 2*RperOut + 0.075 = 0.455
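The weight derivation listed in this comment can be sketched as a small function. This is only a restatement of the arithmetic above, not the linked SQL; the function name and dictionary keys are hypothetical.

```python
def linear_weights(r_per_out):
    """Derive event run values from runs per out, following the
    increments listed in the comment above."""
    r_bb = r_per_out + 0.14
    r_hbp = r_bb + 0.025
    r_1b = r_bb + 0.155
    r_2b = r_1b + 0.3
    r_3b = r_2b + 0.27
    return {
        "BB": r_bb,
        "HBP": r_hbp,
        "1B": r_1b,
        "2B": r_2b,
        "3B": r_3b,
        "HR": 1.4,                       # fixed value in the comment
        "SB": 0.2,                       # fixed value in the comment
        "CS": 2 * r_per_out + 0.075,
    }


weights = linear_weights(0.19)  # 1921 NNL runs per out, as quoted
print({event: round(value, 3) for event, value in weights.items()})
```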

Then, one needs to reconcile these weights with the actual number of runs scored for the season in question, which you do by re-calculating the run value of the out. Here, also, BBREF calculates all averages with respect to the league averages with pitchers removed. Obviously, for the weights above, there is no sensible way to remove pitchers from the RperOut calculation, so we’ll just have to use the entire lgIP and lgR up there. But down here, we should be using league averages with pitchers removed.

RperOut_reconcile = R_lwts/(AB-H-SF) = (rBB*(BB-IBB) + rHBP*HBP + r1B*1B + r2B*2B + r3B*3B + rHR*HR + rSB*SB + rCS*CS)/(AB-H-SF) = 0.2676

The last step is to compute wRAA for the player in question. In this step we first exclude all IBB and SH, then scale wRAA by PA/(PA-SH-IBB), in other words assuming that the IBB/SH are just a managerial decision and that the player would have done just as well in his other opportunities if he had been asked to hit in those situations.

wRAA = (rBB*(BB-IBB) + rHBP*HBP + r1B*1B + r2B*2B + r3B*3B + rHR*HR + rSB*SB + rCS*CS – RperOut_reconcile*(AB-H))*PA/(PA-SH-IBB) = 59.115 for Charleston 1921

Then finally you can add back (lgR/lgPA)*PA to wRAA to get “wRC”, which is the lwts equivalent of runs created–I think this is what you would want to apply the league adjustment factor to, then subtract back (lgR/lgPA)*PA to get back to rBat. For Charleston 1921, wRC = 106.03.

The differences here are relatively minor: using RC to get runs above average gives you 56 for Charleston, compared to 59 using linear weights as above. Still, a difference of ~3 runs per 300 PA is 10 wins over a full career of 10,000 PA, which would be a pretty big difference. Plus, this method would be directly comparable to BBREF’s Rbat, except for the league adjustment and the standard deviation adjustment, so this would be, as much as possible, placing Negro Leaguers on an even footing with MLB players.

Hope this is helpful, and doesn’t come across as excessively nitpicky!

Posted by Alex K | November 26, 2017, 10:33 pm

- Alex, thanks so much for your comment! Not at all nitpicky. In fact, we’d rather use a custom linear weights measure, but we are not trained in SQL at all and are not database experts. We’re using Excel formulae to do it all, which means we need a simpler means, and RC will get us pretty darned close.
  
  That said, we recognize that
  a) there’s a great deal of imprecision in our MLEs anyway, which is par for this particular course
  b) RC will have far more precision than much of what we’re doing in the other WAR categories
  c) thanks to the smart design of the WAR framework (all due credit to Tango, the Seans, and others) our RC estimate can be easily swapped out for a linear-weights estimate at the season or career levels, so that everyone can enjoy playing along at home.
  
  Thanks again for this great comment!
  
  Eric
  
  Posted by eric | November 27, 2017, 9:09 am
  
  - Great! I agree that this is far from the dominant source of uncertainty on the MLEs…and it is certainly a big pain to compute wRAA for each season, versus something like RC. Anyway, if you’d like assistance on converting to wRAA for your MLE estimates I’m happy to help; I do have a spreadsheet for the Charleston example but of course would also need to re-compute all the weights for each year.
    
    Posted by Alex K | November 27, 2017, 1:58 pm
    
Sorry, late to the game here. When you used z-scores in the above process, did you have a minimum threshold for plate appearances and innings pitched? Michael J. Schell had a formula in his book “Baseball’s All-Time Best Sluggers” that allowed him to calculate such a minimum for batters. If I recall, it was a PA/game formula that required roughly 300 appearances in a normal MLB year for a batter’s data to be included when calculating the standard deviation. It kept low PA guys from messing up the data. Thanks for this hard work on the Negro League players.

Posted by brianpdowning | December 20, 2018, 2:12 pm
