5 Reasons I Crave More Retrosheet Data

Great Scott, Marty! Why didn't we go back to the deadball era instead of the future???!!!

The Retrosheet folks ought to get a plaque in Cooperstown. Their work is to baseball analysis what words are to literary analysis. Unfortunately, because Retrosheet is an evolving project, it only has game-by-game, play-by-play (PBP) records back to 1945 and some types of data back to the early 1950s. Half of MLB history is still in relative darkness.

A lot of questions remain unanswerable without that data. In fact, these are the 5 that probably puzzle Miller and me the most as we sort through our many borderline candidates, and the 5 the data would be most crucial in helping us answer.

  1. How good (or mediocre) were the arms of Harry Hooper, Enos Slaughter, Sam Rice, and Bob Johnson?

The problem here is that without PBP data, we only have an outfielder’s assists totals to work with. We don’t know how many base runners he prevented from taking an extra base. In right field, assists totals are suppressed once a strong-armed outfielder’s reputation is made. In left, however, weak-armed outfielders rack up more assists than their arm strength ought to suggest because they are tested more often. PBP-based information on holding base runners would balance out that paradox. How does that apply to these four guys?

  • Johnson’s in, of course, but he’s borderline. He has nearly unparalleled assists totals for his position. PBP would help validate the likely awesomeness of his arm.
  • Hooper is just off the end for both of us. If his fabled arm is as good in the PBP metrics as it is by reputation and by the implication of his strong assists totals, the value added to his case could easily push him over.
  • Enos Slaughter’s case is much more mixed because the play-by-play we have for the last quarter of his career is at odds with his reputation and with his assists totals prior to the availability of the data. But he’s close enough for Eric that it could make a difference.
  • Rice would need every ounce of arm value he could get, but he’s close enough to be a factor too.
  2. Just how great a base runner was Max Carey? Just how bad a base runner was Ernie Lombardi?

Base running value is kinda-sorta the inverse of outfielders’ throwing value. With only stolen base data to work with, we don’t know how many extra bases players took. This is especially important for deadball players, who likely took more chances on the bases. Carey’s stolen base record stands out from what we know about his times not only in its volume but in its outstanding success rate. Were his ball-in-play advancements just as impressive? If so, he could be several wins more valuable than we currently think. Meanwhile, Ernie Lombardi is almost universally considered the slowest player ever to put on cleats. Yet his BBREF base running value is +5 runs. Does not compute. Speed-reliant aspects of his batting stats (double-play frequency, doubles and triples, stolen-base attempt rate) are completely at odds with this figure, as is every story ever written about him. He’s no slam dunk, so resolving this would be awfully helpful.

  3. How much value did ace starters like Miner Brown and Lefty Grove add by doubling as relief aces?

In Grove’s case it doesn’t matter much, but in Brown’s case, Eric’s vote was strongly influenced by an outstanding record of relief work. But what kind of situations did their own managers use them in? How high was the leverage? How much difference would it make to their cases? There are dozens of other candidates affected by the answer to this question, particularly Dizzy Dean or even Chief Bender.

  4. How good were Firpo Marberry and Eddie Rommel in relief, and how were they deployed?

Marberry may or may not have been the first “modern” relief ace, but he was a sensation during his career. How many wins did he contribute? Just how much leverage did he pitch in? Rommel, on the other hand, has a somewhat mysterious relief record. Grove was used as a relief ace, but Rommel seems to have been the first alternative to Grove. Lefty got 50 saves with the A’s, Rommel 30 (they overlapped for much of the 1920s and early 1930s). BBREF WAR thinks highly of Rommel, Fangraphs WAR not so much. But neither knows much about his value in the pinch.

  5. Could Ned Williamson really have been that bad at shortstop?

Reaching wayyyy back in time here. Williamson is in the mix at third base, where he spent most of his career as an amazing defender. He was moved to short later in his career and sucked. I mean, Cape Man level epic failure. At least according to the defensive stats we have now. I’d like to know, however, whether PBP agrees with their assessment. See, the thing is that in the first fifty to sixty years of the game, third basemen occupied the spot on the defensive spectrum that second basemen occupy now. Where today we think of second basemen as being nearly as nimble as shortstops, back in Williamson’s day, before double plays became frequent, third basemen were like second shortstops. This model is part of the reason why there are so few great third basemen between Williamson and Mathews. Point here being that while it makes sense that Williamson wouldn’t be as good at short as he was at third (moving to a tougher position when older), did he really go from an A fielder to an F? I could see A to C, after all. Or is there something in the PBP that we don’t yet know about?

Of course, there are much broader questions we could get answered and that would provide us with a whole new perspective on how the deadball game was played, what the pinball 1890s were like, and how the early game evolved. With deeper PBP data we might uncover a couple stars we didn’t quite have a handle on, and we might see some stars’ lights dim. Or maybe we’ll just get confirmation of our own biases. That happens a lot too.

I don’t know if getting more of this information is even possible. I do know this, however: Retrosheet will get it if it is possible to get.




2 thoughts on “5 Reasons I Crave More Retrosheet Data”

  1. Even I’m not old enough to remember Lombardi as a player, but I recall that in the ’50s TV would frequently show the Old-Timers’ game at Yankee Stadium before the regular game. One year Lombardi clobbered a pitch (at least clobbered for a guy in his 50s) and took forever to reach first. Dizzy Dean, doing the play-by-play, commented “He hasn’t lost a step.”
    I wonder if Lombardi was considered so slow that teams universally ignored him on the bases, allowing him to get an extra base here and there. Just a thought (maybe not a good one).
    BTW, like you, I love Retrosheet.

    Posted by verdun2 | June 16, 2014, 8:12 am
    • V2, great point about Schnozz being ignored. Even a blind squirrel will find an acorn, or whatever the saying is. His final two seasons do fall under PBP data, and he grades out positively on the bases in those years, so anything is possible!

      Posted by eric | June 17, 2014, 9:32 pm
