Continuing our series of think-alouds about Negro Leaguers, I want to respond to feedback from our great friend Verdun2 who strongly encourages us to dive on in. So let’s focus on why we haven’t committed to that electoral process yet. The most basic reason is this: The Negro Leagues can be frustratingly complicated and hard to keep simple.
To draw an analogy, think about building a bridge. We want to build something that spans a river so we can drive from the town on one side to the town on the other side. Simple! Well, not so fast. First we need to do a traffic study to know how many vehicles would likely use the bridge. We need to know what kinds too because tractor-trailers can’t use every bridge. We also want to study the economic impact on the area. How much would this bridge improve commerce on both sides of the span? Oh, and noise pollution. Whosever house is near the termini of the bridge might need to invest in earplugs and will certainly consider bringing suit to stall the project if property values will take a whack. We probably also need to do an environmental impact study for the fauna and flora in the waterway itself. Of course, we have to find just the right site, one narrow enough to make the project viable, on ground that can handle the pilings, but also situated such that nearby existing roads can handle the increased traffic. Depending on the location, we might also have to determine whether we’re obliged to do an archaeological study. Do we need to make a drawbridge to accommodate river traffic? Earthquake proofing? Hurricane proofing? Big thing is, who pays for all of it? Do the towns on both sides split the bill? What if one will benefit more than the other? Worse yet, what if the towns are in different states? Now we’ve got to get the project through two legislatures plus two town councils and maybe even two county commissions. Is this going to be a toll bridge?
Likewise, the Negro Leagues start with the simple idea that we elect the 29 best Negro Leaguers ever. Things go cray-cray almost immediately thereafter. Before 1920, the Negro Leagues weren’t even leagues! They were independent touring teams who might loosely affiliate to play for larger gates on the weekend in between barnstorming runs. After 1920, teams still barnstormed even with an official slate of league games. In the late 1930s, the Mexican League started siphoning off talent, and those guys might have played in better leagues in Mexico than in the US. When baseball finally integrated, all bets were off. Players dispersed to the four corners of organized baseball, into every league and classification imaginable.
To evaluate these players well, we need to be able to put all these players onto a level footing. Each season needs to be evaluated separately and then brought into synch with all the other seasons. Worse yet, because some later Negro Leaguers entered MLB (not worse for them but for analyzing them), we have to put all the seasons of all the players we look at onto an MLB footing. We can’t compare Oscar Charleston to Willard Brown to Minnie Miñoso without putting them onto a major-league scale. Furthermore, we can’t then compare them with MLB players across history without leveling them up to MLB.
No worries, we have a simple solution. We’ll just translate their stats from their league of origin to the majors! All we need to do that is to compare them to the average of their destination league and then apply a quality of competition discount, and voila. Well, and we need to adjust their numbers for their home park. And possibly for standard deviation in the destination league because it was very top-heavy. Oh, and we’ll need to rescale their translation to the destination league’s run-scoring environment. Oh, oh, and we’ll need to figure out how to assign playing time, and when we do that determine a fair way to extrapolate the translated stats to full-season because the Negro Leagues and minor leagues often played shorter schedules than the big leagues. Defense…that’ll be tricky.
Each of these items I’m jokingly mentioning here is real, and they each require data and a protocol to figure them. For one single season, we need at least the following background data:
- The player’s stats in the originating league
- The player’s originating team’s games
- The team’s park factor
- The originating league’s totals
- A translation factor for the originating league
- The runs/game for the destination league.
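To make that list a little more concrete, here is a rough sketch of how those pieces might fit together for one hitter's season. This is not our actual protocol, just an illustration under some loud assumptions: the field names, the `translate_season` function, the use of a generic "runs produced" figure, and the 154-game destination schedule are all made up for the example, and the park factor is assumed to already be scaled for a full season of home and road games.

```python
from dataclasses import dataclass


@dataclass
class SeasonInputs:
    """Hypothetical container for the background data listed above."""
    player_runs: float          # runs the player produced (any run estimator)
    player_pa: int              # player's plate appearances in the originating league
    team_games: int             # games played by the player's originating team
    park_factor: float          # 1.0 = neutral; assumed to cover home and road games
    league_runs: float          # total runs scored in the originating league
    league_team_games: int      # sum of games played by every team in that league
    translation_factor: float   # quality-of-competition discount (assumed value)
    dest_runs_per_game: float   # runs per team per game in the destination league
    dest_schedule: int = 154    # assumed destination schedule length


def translate_season(s: SeasonInputs) -> dict:
    """Translate one season to the destination league (illustrative only)."""
    # 1. Express the player's output as a per-PA rate.
    rate = s.player_runs / s.player_pa
    # 2. Remove the effect of the home park.
    park_neutral = rate / s.park_factor
    # 3. Apply the quality-of-competition discount.
    discounted = park_neutral * s.translation_factor
    # 4. Rescale to the destination league's run-scoring environment.
    origin_rpg = s.league_runs / s.league_team_games
    rescaled = discounted * (s.dest_runs_per_game / origin_rpg)
    # 5. Extrapolate playing time to the destination's longer schedule.
    pa_per_team_game = s.player_pa / s.team_games
    full_season_pa = pa_per_team_game * s.dest_schedule
    return {"pa": full_season_pa, "runs": rescaled * full_season_pa}


# Entirely invented numbers, purely to show the call.
example = SeasonInputs(
    player_runs=60, player_pa=300, team_games=75, park_factor=1.0,
    league_runs=4000, league_team_games=800, translation_factor=0.8,
    dest_runs_per_game=4.5, dest_schedule=150,
)
result = translate_season(example)
```

Even this toy version skips defense, baserunning, standard deviation, and pitchers entirely, which is rather the point.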
Depending on how specific we get, things become more complex yet. We may need to run comps for things like baserunning or fielding. We’ll need that extrapolation protocol of course. We also need a way to scale up innings for pitchers. If we want to adjust for standard deviation, that’s a whole nother conversation and batch of data. We’ll want a protocol for double checking a translation against real MLB players.
That park factor? We’re going to have to calculate our own, and we don’t have home/road splits, so we’re going to have to make a less precise calculation based on RS/RA, especially because some of the Negro League and minor league parks were quirkier than Fenway by orders of magnitude.
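For the curious, one crude way to get at a park factor from nothing but RS/RA might look like the sketch below. It is an assumption on our part, not a settled method: the function name `crude_park_factor` and its parameters are hypothetical, and the approach simply compares the total runs in a team's games with the league norm. Because it has no home/road split, it cannot separate the park from the quality of the team's hitters and pitchers.

```python
def crude_park_factor(team_rs: float, team_ra: float, team_games: int,
                      league_runs: float, league_team_games: int) -> float:
    """Rough season-level park factor from runs scored and allowed.

    Assumes a closed league, so every run scored by one team is a run
    allowed by another. The result blends home and road games, so it is
    already the diluted factor you would apply to a full season line.
    """
    # Runs by both sides per game in this team's games.
    team_environment = (team_rs + team_ra) / team_games
    # League-wide runs by both sides per team game.
    league_environment = 2 * league_runs / league_team_games
    return team_environment / league_environment


# Invented numbers: 10.0 runs per game in this team's games vs. 9.0 league-wide.
factor = crude_park_factor(team_rs=400, team_ra=380, team_games=78,
                           league_runs=3600, league_team_games=800)
```

A great-hitting team with a terrible staff would look like it played in a bandbox by this measure, which is exactly why it counts as the less precise calculation.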
Are you getting the picture yet?
So this is why we haven’t yet decided whether we will proceed. We don’t yet have a sense of how long merely gathering up the data will take. Actually, let’s check in on that too. See, the thing is that while much of the data now exists, it’s not always complete, and sometimes it is simply not available. For some leagues we do not know league totals for important things like doubles, triples, and homers (the Mandak League) or anything at all (most sub-AAA minor leagues prior to 1950). Anyone out there got a lead on the Venezuelan or Dominican summer leagues of the 1950s?
Probably the most difficult information we’ll need to locate is another factor we’ll need to determine ourselves. The translation factor is the engine driving the bus here. There aren’t many guideposts to go by. While MLEs (minor league equivalencies) have been around since the mid-1980s (thank you, Bill James!), the calculations don’t go backward in time from there. We have no specifics on how good the Mexican League of 1940 and 1941 was, let alone the Negro Leagues. Working up studies for every league we’d be looking at would take years. We have some ideas for how this could work, which is a subject for the future, but for now, this is one of those issues we have to feel our way through before we undertake this process.
We’re thinking hard about how this could go. The ramping-up period could be a few months to gather all the data and test all the protocols. And with the Negro League Database growing by a few seasons each year, we have a strong incentive to take it slowly anyway. The more patience we have, the more data we’ll ultimately have at our disposal.
So you’ll have to keep waiting for an official announcement as we assess the viability of this part of the project for us and what kind of timeline we could accomplish it on. We warn you that it might roll out much more slowly than our other elections have, but if that’s the case, it’ll be because we are two complicated guys who want to get this very complicated task done as well as we are able. After all, our name is on each plaque too.