No, David Schoenfield of ESPN.com has not ensnared us in some controversy. I highly doubt that he even reads this site (though we’d be glad to have him!). Instead I’m giving a name to a phenomenon he pointed out way back in January. A phenomenon that, as Miller and I approach the latter stages of our project, we need to confirm or disconfirm.
In an article titled “What’s the Problem with the Hall of Fame,” David said this:
It was easier to elect guys such as Drysdale or Perez because they still managed to stand out among their peers; there were fewer great players simply because there were fewer teams. As the talent level in baseball gets more compacted (17 of the 31 players with 100 career WAR began their careers before World War II), it’s more difficult to put up numbers that separate you from your peers.
Check out the graph a little more than halfway down the page. It tells the story very nicely. There we see Hall of Famers’ career WAR plotted against their year of debut. You can clearly see the points narrow as you move from olden times toward our modern day.
Voters, then, encounter Schoenfield’s Paradox when they see great-looking statistics (including WAR) from old ballplayers who generated them against inferior competition, or when they overlook modern players whose less impressive-looking stats were generated in a time of stronger competition. Schoenfield’s Paradox is not necessarily a reason not to vote for a player, though it could be one. It is a reason, however, to be very, very careful when comparing the likes of, say, George Gore and Bernie Williams. Or Ned Williamson and Sal Bando.
At the Hall of Miller and Eric, we’ve struggled a bit to understand what it means to seek a chronological balance. We want to avoid falling into Schoenfield’s Paradox, over-electing and under-electing like the BBWAA and the Veterans Committee have. But it’s complicated. Very, very complicated.
I decided to take Schoenfield’s work a little further to see how much of an effect the Paradox might have. I made a few special modifications myself.
- I’m basing it on Wins Above Average (WAA) because it should be more sensitive to changes in the underlying quality of play. WAR’s replacement runs are essentially a constant applied to playing time, and will dampen the differences between players.
- I’m expressing WAA on a per-PA and per-inning basis to give our shorter-season elders a more level playing field.
- I’m analyzing a deeper player pool of nearly 600 retired batters and nearly 300 retired pitchers that Miller and I have considered.
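As a minimal sketch of the rate idea in the second bullet, assuming nothing more than career WAA and playing-time totals: the two careers below are invented purely for illustration (they are not real players), and the function name is hypothetical.

```python
def waa_rate(waa: float, playing_time: float) -> float:
    """WAA per unit of playing time (PA for batters, innings for pitchers)."""
    if playing_time <= 0:
        raise ValueError("playing time must be positive")
    return waa / playing_time


# Invented careers, NOT real data: one from a short-schedule era,
# one from a long-schedule era with a bigger career WAA total.
short_schedule = {"waa": 30.0, "pa": 6000}
long_schedule = {"waa": 40.0, "pa": 10000}

short_rate = waa_rate(short_schedule["waa"], short_schedule["pa"])
long_rate = waa_rate(long_schedule["waa"], long_schedule["pa"])

# The long-schedule player has more total WAA, but the short-schedule
# player was more valuable per plate appearance.
print(f"short schedule: {short_rate:.4f} WAA/PA")
print(f"long schedule:  {long_rate:.4f} WAA/PA")
```

The point of the rate is only that a raw career total rewards longer schedules, while a per-PA or per-inning figure does not.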
Let’s start with the big sticks. This chart plots year of debut against a batter’s WAA/PA.
While there remain a few outliers in our age, it’s clear that more and more players cluster around a certain level of performance, so that even the tier below the outliers is reined way in.
I also created a similar chart showing only the top 100 hitters by this measure.
With just the cream of the crop we see the compression I just mentioned more clearly. But there’s another observation we can make. The earliest days appear to have a fairly similar number of points to the contemporary era despite far fewer teams in the league. The chart below breaks these players into 25-year blocks by debut (and please note these are, indeed, arbitrary end points). It shows an estimate of how many of these elite batters we might expect to debut in each period given a proportionate distribution based on the number of team-seasons in that block of time. Then it compares that estimate to the actual number.
PERIOD     SEASONS  EXPECTED  ACTUAL   VAR  %VAR
================================================
1871-1895      304        14      20    +6  +45%
1896-1920      376        17      18    +1   +5%
1921-1945      400        18      17    -1   -7%
1946-1970      446        20      21    +1   +3%
1971-1996      672        31      24    -7  -22%
------------------------------------------------
              2198       100     100
That certainly looks like Schoenfield’s Paradox at work. Our modern era has seven fewer players representing it among these top 100 batters than we might estimate, and the earliest era is nearly its opposite with six more than expected. The middle years are all hovering around expectation.
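The expected counts are simple proportional arithmetic. Here is a minimal sketch that spreads the 100 elite batters across the eras in proportion to team-seasons, then compares with the actual counts. The team-season and actual figures are copied from the table above; the function and variable names are illustrative only.

```python
# Team-seasons and actual top-100 batter counts, copied from the table above.
ERAS = ["1871-1895", "1896-1920", "1921-1945", "1946-1970", "1971-1996"]
TEAM_SEASONS = [304, 376, 400, 446, 672]
ACTUAL = [20, 18, 17, 21, 24]


def expected_counts(team_seasons, pool_size):
    """Spread pool_size players across eras in proportion to team-seasons."""
    total = sum(team_seasons)
    return [pool_size * s / total for s in team_seasons]


expected = expected_counts(TEAM_SEASONS, sum(ACTUAL))

for era, exp, act in zip(ERAS, expected, ACTUAL):
    var = act - exp  # positive means the era is overrepresented
    print(f"{era}  expected {exp:5.1f}  actual {act:3d}  "
          f"var {var:+5.1f}  ({var / exp:+.0%})")
```

Rounded to whole players, the expected values come out to 14, 17, 18, 20, and 31, matching the EXPECTED column.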
What About Pitchers?
The data for pitchers appear to go in the exact opposite direction of Schoenfield’s Paradox. I used WAA/IP with the 290 or so starting pitchers in my dataset.
The plot shows nearly equal amounts of compression over time with the exception of the very early game, yet pitchers appear to be getting better and increasing their performance versus their historical peers. How is that possible?
It’s the crosscurrents of the history of pitching. Today’s hurler goes as hard as he can for as long as he can, then turns it over to bullpens full of power arms in the sixth or seventh inning, where his inherited runners are likely to be stranded by fresh arms. Plus he has much better defenses behind him than his historical peers. The olde tyme guys did the opposite. No matter how tired they were, they finished the game. Today we know that the third and fourth times through the order, every pitcher, no matter how good, gets creamed. There was no bullpen to bail out the starter. Yesteryear’s iron men went the distance, pacing themselves, allowing more balls in play (to shakier defenses), bearing down to get tougher outs. Their value was driven upward more by their bulk innings versus replacement than their performance versus the league average.
Do pitchers run counter to Schoenfield’s Paradox? Let’s see how balanced things are over time on the mound. Here’s a table of the top 50 of these pitchers by era, similar to the one above for our top 100 hitters. I’ve gone to an extra decimal in the variance column to avoid rounding error.
PERIOD     SEASONS  EXPECTED  ACTUAL   VAR  %VAR
================================================
1871-1895      304         7      11  +4.4  +68%
1896-1920      376         8       9  +0.9  +11%
1921-1945      400         9       5  -3.6  -42%
1946-1970      446        10      11  +1.4  +14%
1971-1996      672        17      14  -3.1  -28%
------------------------------------------------
              2198        50      50
The variance is smaller, but swings more wildly than among the hitters. This time, the early years are overrepresented by 4 and the modern age underrepresented by 3, consistent with the hitters and perhaps with Schoenfield’s Paradox. I don’t have a great explanation for the prewar era. The five pitchers who make the top 50 are Grove, Hubbell, Feller, Newhouser, and Spahn. There are 44 other hurlers from this era in my data set of 290 pitchers, for 49 in all. That’s nine short of the 58 we’d expect if all five eras were equal (290 divided by 5). I would guess we’re looking at the effect of World War II, which might well affect pitchers, with their much higher attrition rates, differently than batters.
As in many measurements, pitchers don’t exactly mirror hitters. It’s just not as black and white for moundsmen as it is for their opponents. We do see some evidence that the Paradox is in effect, though it isn’t quite as clear. On the other hand, though I won’t run the chart here, if you look at the Top 100 in career WAR, you’ll be shocked to see the number of second and even third tier pitchers from the early game among the greats of the game. Why? Because of the bulk innings. These innings and the piles of wins that come with them have caused voters in most Halls to either identify and choose inferior pitchers or to avoid choosing excellent modern pitchers by comparing them against these historical peers without taking context into account.
Adding It Up
Let’s combine the two tables I made for representation by era. FYI: There’s a little bit of rounding error in the Total Expected column.
PERIOD     SEASONS  EXPECTED  ACTUAL   VAR  %VAR
================================================
1871-1895      304        21      31   +10  +52%
1896-1920      376        25      27    +2   +7%
1921-1945      400        27      22    -5  -18%
1946-1970      446        30      32    +2   +7%
1971-1996      672        48      38   -10  -20%
------------------------------------------------
              2198       150     150
Here we see how much the early players are overrepresented and latter-day players underrepresented. What makes this more interesting to Hall watchers is that this zeroes in on the dearth of 1970s and 1980s players that many analysts have noted.
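For anyone who wants to check the combined table, here is a minimal sketch that adds the rounded EXPECTED and ACTUAL columns from the batter and pitcher tables era by era. The dictionary and function names are illustrative only; the numbers are copied from the two earlier tables.

```python
# EXPECTED and ACTUAL columns copied from the batter and pitcher tables,
# in era order: 1871-1895, 1896-1920, 1921-1945, 1946-1970, 1971-1996.
BATTERS = {"expected": [14, 17, 18, 20, 31], "actual": [20, 18, 17, 21, 24]}
PITCHERS = {"expected": [7, 8, 9, 10, 17], "actual": [11, 9, 5, 11, 14]}


def add_by_era(first, second):
    """Add two per-era columns element by element."""
    return [x + y for x, y in zip(first, second)]


combined_expected = add_by_era(BATTERS["expected"], PITCHERS["expected"])
combined_actual = add_by_era(BATTERS["actual"], PITCHERS["actual"])
combined_var = [a - e for a, e in zip(combined_actual, combined_expected)]

print("expected:", combined_expected, "total", sum(combined_expected))
print("actual:  ", combined_actual, "total", sum(combined_actual))
print("var:     ", combined_var)
```

Because this sums expectations that were already rounded, the combined expected column adds up to 151 rather than 150, which is the rounding error mentioned above.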
Getting Out of the Paradox
Identifying Schoenfield’s Paradox does not mean that Miller and I should avoid electing someone from the early days. I don’t believe in timelining. Instead, I think we might exercise even greater wariness when comparing these guys to players of a more recent vintage. We might view the equivalent of a 5.0 WAR season from 1880 differently than a 5.0 WAR season from 1980. For players on the borderline from the 1870s through the deadball era, Miller and I might want to move cautiously because there may be unavoidable illusions of context in all stats of the time that can easily color our thinking.
But believe it or not, that’s why this project is fun!