
Introducing Contemporary Similarity Scores for Batters

[Download the contemporary-sim-scores-batters Excel tool.]

[Editorial note: This article was written several weeks ago. In the time since, I stumbled upon something I’d read years ago and completely forgotten: Adam Darowski at The Hall of Stats has already created WAR-based sim scores. We decided to only gently edit the post, even though it makes it sound like we got there first, and to add this note up top. We apologize to Adam for unwittingly blundering into his territory. We also make haste to note that our implementation is different from his in its details and its presentation. Also different is the fact that we are providing you with a customizable sim-scores tool to make your own. And we’re all just stealing Bill James’ idea anyway…like usual. If you have any questions, concerns, or comments about this, please put them in the comments.]

I used to love similarity scores. I remember about fifteen years ago, excitedly anticipating the day when BBREF would run its annual sim-score update. Once it occurred, I’d pound names into the search blank for hours, hundreds of names, and scroll all the way to the bottom of all those players’ pages to see whether Bernie Williams’ numbers now resembled a Hall of Famer’s or not. Did Miguel Cabrera resemble a young Frank Robinson? We could dream Hall of Fame dreams onto ballplayers old and young.

Then Wins Above Replacement started to take hold, and I didn’t care about sim scores anymore. With WAR, the veneer of surface stats lost its shine because they conveyed far less meaning. For me they sank to something amusing but with a poor signal-to-noise ratio.

Today, at the Hall of Miller and Eric, we will reclaim sim scores for the WAR generation. We’ve created a new variation that we hope transforms them into something genuinely informative for discussions of great players. Before we get into our new take on it, let’s nail down why similarity scores need rescuing. We’ll ask Hack Wilson for assistance.

Scratching the Surface Stats

On BBREF, Hack Wilson’s ten most similar batters are Wally Berger, Ken Williams, Hal Trosky, Jeff Heath, Larry Doby, Mike Trout, Carlos Gonzalez, J.D. Drew, Chick Hafey, and Pedro Guerrero. Berger, Heath, Hafey, Guerrero, and Trosky make sense: big hitters with medium-short careers and maybe not the greatest defensive chops. Things get strange when we reach Doby and laughable with Trout. Let’s stick with Doby.

Larry Doby and Hack Wilson may have similar counting stats, but we know that they didn’t go about winning games in the same way. Doby had a high degree of athleticism, Wilson didn’t. Doby had some speed and ran the bases well, Wilson didn’t. Doby didn’t embarrass himself in centerfield, Wilson did. The difference of style leads to one of substance. Hack Wilson’s impressive .307/.395/.545 translates to 303 batting runs above average (Rbat) and eventually to 21 Wins Above Average and 38 WAR—good but hardly Hall worthy. With superficially similar counting stats and an inferior slash line of .283/.386/.490, Doby earned 267 Rbat and eventually 30 WAA and 49 WAR. In reality, Larry Doby isn’t comparable to Hack Wilson at all; in fact, Doby is a significantly better player. The sources of Doby’s extra wins over Wilson get at why similarity scores went from something useful to something merely fun:

  • They do not take fielding performance into account at all
  • They are not adjusted for the league’s offensive context
  • They are not park adjusted
  • They barely take baserunning and speed into account.

So, naturally we will get some pretty buggy results. That’s OK because they weren’t invented for precision.

Bill James introduced similarity scores in the 1986 edition of The Baseball Abstract. What Bill wanted to know was this: With the stats any fan has at hand, how do we find out who a player is similar to? He expanded on sim scores in the classic The Politics of Glory (also known as Whatever Happened to the Hall of Fame?). In an example, he explained that while Harmon Killebrew is the most statistically similar player to Willie McCovey, he thought that Willie Stargell was a better comp for “Stretch” on more stylistic/player-type terms. He never believed a comp list itself could answer the question of similarity without deep interpretation because he knew its limitations. With today’s value stats, a lot of the interpretative work is done for us.

Below the Surface

James’ system for identifying comps is ingenious in its simplicity. To compare player A to player B, start with a score of 1,000. Then adjust for position. Then subtract points from player B for the difference between his numbers and A’s in various baseball-card stats. And for a long time, it was a good enough way, and maybe the only accessible way thanks to BBREF, to look at similar players.

Similarity scores lost their usefulness in Hall of Fame arguments because WAR told a more complete story in shorter, more digestible chapters. But WAR is an end in itself. A player netted 58 WAR, that’s how much he’s worth. Kinda dry by itself, and we lose some of the texture of a man’s career. A player with 58 WAR who hit leadoff is a very different player than one who never stole a base and hit 500-foot home runs. Within the WAR framework, however, there are tools to tell players like these apart. In fact, to understand how a player contributed to his teams, we need to set aside WAR as a reference point and switch to wins above average. WAR has a lot of runs built into it that are kind of for showing-up and that don’t tell us much about a player compared to his league. WAA is where the action is.
The theory behind WAA is straightforward:

Batting above average + Baserunning above average + DP avoidance above average + Fielding above average + A positional adjustment expressed relative to average = Runs Above Average

Convert those runs to wins and you’ve got WAA.

The devil is in the details….

If you look at the “Player Value – Batting” area of any player page at BBREF, you’ll see exactly what I just described. There’s Rbat, Rbaser, Rdp, Rfield, Rpos, RAA, and WAA. Larry Doby created 267 more runs than a league-average hitter (Rbat). He created 15 more runs than an average baserunner (Rbaser). He avoided so many double plays that he saved his team 21 runs above an average hitter (Rdp). He saved 13 more runs on defense than an average centerfielder (Rfield), and during his career centerfielders had a positional value about 1 run below average per annum (Rpos). All that adds up to 303 Runs Above Average, which converts into 30.3 WAA. We now know exactly how Doby pushed his teams toward wins:

  • He was a very good hitter, though not great
  • He ran well but wasn’t a speed merchant
  • His speed, handedness, and probable fly-ball proclivities helped him avoid lots of double plays
  • He fielded his position well though not at a Gold Glove level.
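The arithmetic behind Doby's line can be sketched in a few lines of Python. This is only an illustration of the sum described above, not BBREF's actual calculation. Two values are assumptions: the Rpos of -13 is inferred so that the components reach the 303 Runs Above Average quoted in the text, and the flat 10 runs per win is simply what 303 RAA and 30.3 WAA imply. The real runs-to-wins conversion moves with the run environment.

```python
# Larry Doby's career components as quoted in the text (runs above average).
doby = {"Rbat": 267, "Rbaser": 15, "Rdp": 21, "Rfield": 13}

# Assumption: Rpos is inferred as -13 so the parts sum to the quoted 303 RAA
# ("about 1 run below average per annum").
doby["Rpos"] = -13

# Assumption: a flat 10 runs per win, implied by 303 RAA -> 30.3 WAA.
RUNS_PER_WIN = 10.0


def runs_above_average(components):
    """Add up batting, baserunning, DP, fielding, and positional runs."""
    return sum(components.values())


def wins_above_average(components, runs_per_win=RUNS_PER_WIN):
    """Convert runs above average to wins above average."""
    return runs_above_average(components) / runs_per_win


raa = runs_above_average(doby)
waa = wins_above_average(doby)
print(raa, round(waa, 1))
```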

Who else played the game like Larry Doby? If we used trad-stats sim scores, we wouldn’t think of him as a great player: J.D. Drew, Dolph Camilli, Bill Nicholson, Andrew McCutchen, Eric Davis, David Justice, Rudy York, Kirk Gibson, Ray Lankford, and Raul Mondesi. With the exceptions of McCutchen and Nicholson, these guys are not all that comparable to Doby’s overall level of excellence—he is a notch or three above them.

Here’s another list, this time based on what we know about how Doby’s skills translated to value: Earl Averill, Bill Nicholson, Augie Galan, Earle Combs, Raul Mondesi, J.D. Drew, Dixie Walker, Kirby Puckett, Tommy Henrich, and Duke Snider. Averill and Puckett are strong adds and occupy a similar place in the pantheon of centerfielders as Doby does (if you don’t include any Negro League credit for him). Snider shows the upside. Maybe you disagree, but the way these players built their value resembles the results of Doby’s game more closely than any other players in big league history. Earl Averill is highly comparable to Doby because they have the same running and DP value, ended up within 1,000 plate appearances of one another, and played the same position. They diverge a bit on batting: Averill’s Rbat is over 300, and Doby’s is about fifty runs less. However, the run contexts of their respective leagues shrink much of that gap, and the run contexts are accounted for in the wins above average calculation. Kirby Puckett is within about 30 runs of Doby in batting and very close in running, and, among all the comps, the overall value of his contributions is closest to Doby’s. It’s a lot easier to find the story in the numbers with WAA’s components than it is to go back through every category in the traditional similarity score and make mental adjustments for park, league, and position.

The comps resulting from WAA-component sim scores will tell us something a little different, and, I think, more helpful, than trad-stats sim scores do. Unlike the sim scores on BBREF, these component sim scores will express similarity in terms of the substance (or amount of value) and shape (relative importance of batting, running, DP-avoidance, and fielding to his game) of a player’s contribution to winning, not merely the style of batting he used to get there. Because WAA components are denominated in runs above average, we can use these sim scores to learn more about cross-generational players in very different contexts. With trad-stat sim scores, we’d never know that Sam Crawford and Reggie Jackson are similar players because their counting stats, influenced by the times when they played, were so different. No, Crawford didn’t hit 500 homers, nor did Reggie hit 300 triples, but their offensive exploits, as Rbat shows, added up to roughly the same value above average. They were both premier power hitters in their time with some defensive shortcomings. By the way, they are each other’s most similar player using runs-based components (962).

Contemporary Sim-Score How-To

For the remainder of this article, I’m going to differentiate between “traditional similarity scores” (the ones you find on BBREF that James created) and “contemporary similarity scores.” The contemporary sim score calculation is precisely like James’ original only with different inputs. In comparing player B to player A, we adjust for position first. We’re going to do it exactly like James does it rather than use WAA’s Rpos component. That way we can flip players between positions manually to get more distinct comp sets. This becomes operative for players who split their careers between two or more stations on the field: Rod Carew, Ernie Banks, Robin Yount, Reggie Smith, Paul Molitor, Andre Dawson, Pete Rose, Harmon Killebrew, et al. Next we’ll deduct points from B for differences in Rbat, Rbaser, Rdp, and Rfield, as well as for differences in plate appearances and WAA. Including WAA helps us avoid the trap of decontextualization. We want to ensure that we can, at some level, compare players in divergent run environments on more equal terms, and the WAA calculation takes into account whether runs were cheaper or more precious in a player’s day.

We start with a dataset comprising the 3,855 position players with at least 1,000 career plate appearances and a couple of pitchers who played a lot in the field. Pitching value is not included. Then we kick it off. To show you how we’re doing this, we’ll use an example of two players fans may find analogous: Eddie Murray and Rafael Palmeiro. Murray and Palmeiro were both first basemen. Both had very long careers of similar length. Both hit 500 homers. Both reached 3,000 hits. Both played good defense. Both played a long time for the Orioles. Though only one of them testified to Congress….

Positional adjustment: Just as James directed, we find the absolute value of the difference between the positional value of Murray’s primary position and Palmeiro’s (see below) then multiply by 12.
Position Values
C: 20
SS: 14
2B: 11
3B: 7
CF: 5
RF: 4
LF: 3
1B: 1
DH: 0

Murray and Palmeiro both played first base primarily, so their positional adjustment is ( 1 – 1 ) x 12 = 0

Palmeiro’s similarity score with respect to Murray remains 1000.
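Here is a minimal Python sketch of that positional step, using the position values and the multiplier of 12 from the table above. The function and variable names are my own for illustration and are not taken from the spreadsheet.

```python
# Position values from the table above.
POSITION_VALUES = {
    "C": 20, "SS": 14, "2B": 11, "3B": 7, "CF": 5,
    "RF": 4, "LF": 3, "1B": 1, "DH": 0,
}
POSITION_MULTIPLIER = 12


def positional_deduction(pos_a, pos_b):
    """Points taken off player B for playing a different position than A."""
    gap = abs(POSITION_VALUES[pos_a] - POSITION_VALUES[pos_b])
    return gap * POSITION_MULTIPLIER


# Murray and Palmeiro: both first basemen, so nothing comes off.
murray_palmeiro = positional_deduction("1B", "1B")

# Flipping a player from right field to center field costs 12 points
# against a comp who stays in right.
rf_vs_cf = positional_deduction("RF", "CF")

# The largest possible gap is catcher versus DH.
largest = positional_deduction("C", "DH")

print(murray_palmeiro, rf_vs_cf, largest)
```

Because the deduction depends only on the two position labels, rerunning a player at a different primary position is a matter of changing one argument.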

Scaling the categories: The maximum adjustment for position is 240 (catcher’s positional value minus a DH’s times 12), which leaves 760 points for the other categories. I’ve decided to weight them equally, and in the linked worksheet you can reweight them if you’d like. There are six other categories: batting runs, baserunning runs, double-play avoidance runs, fielding runs, wins above average, and plate appearances. Dividing 760 by six, we get about 127 points per category.

Scaling the deductions: James called the points removed from the B player in a comparison “penalties,” but I like the slightly more neutral-sounding “deductions.” To determine the deduction rate for each category, we’ll find the difference between the tenth highest total in that category and the tenth lowest and divide by the 127-point-per-category scale we determined a moment ago. Why use the tenth highest and lowest figures in each category? To avoid extremes. The very best and worst players skew the results. If you know how to use the LARGE and SMALL formulas in Excel, you can change the linked spreadsheet way up top to use the highest and lowest ranked players or any other pair of players you’d like.
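For readers who think better in code than in spreadsheets, the scaling steps above can be sketched roughly as follows. The sample Rbat totals are invented purely for illustration (the real dataset lives in the spreadsheet), and the function name and the `rank` parameter are hypothetical. The sketch uses `heapq.nlargest` and `heapq.nsmallest` to play the role of Excel's LARGE and SMALL.

```python
import heapq

# From the text: position can cost up to (20 - 0) * 12 = 240 points,
# and the remaining points are split evenly across six categories.
MAX_POSITION_POINTS = 240
NUM_CATEGORIES = 6
POINTS_PER_CATEGORY = (1000 - MAX_POSITION_POINTS) / NUM_CATEGORIES  # about 127


def deduction_rate(values, rank=10, points=POINTS_PER_CATEGORY):
    """Units of difference in a category that cost one similarity point."""
    if len(values) < rank:
        raise ValueError("need at least `rank` values in the category")
    high = heapq.nlargest(rank, values)[-1]   # like LARGE(range, 10)
    low = heapq.nsmallest(rank, values)[-1]   # like SMALL(range, 10)
    return (high - low) / points


# Hypothetical career Rbat totals, NOT real data: -200, -190, ..., 890.
sample_rbat = list(range(-200, 900, 10))
rate = deduction_rate(sample_rbat)
print(round(POINTS_PER_CATEGORY, 1), round(rate, 2))
```

Setting `rank=1` reproduces the option of scaling against the very best and worst players instead of the tenth best and worst.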

Here’s what the deduction rates end up looking like:

  • Rbat: Subtract one point from player B for every 8.5 runs of difference
  • Rbaser: Subtract one point for every 0.9 runs of difference
  • Rdp: Subtract one point for every 0.5 runs of difference
  • Rfield: Subtract one point for every 2.4 runs of difference
  • WAA: Subtract one point for every 0.8 wins of difference
  • PA: Subtract one point for every 91.5 PA of difference.

For the purposes of each category, which are worth 127 points each, 8.5 runs of batting is the same as 0.9 runs of baserunning. The categories are then all balanced, or you might say they all have a weight of 1.0. You can tinker with these weights and adjust them to your taste in the spreadsheet linked at the top of this post.

Applying the deductions: Find the absolute value of the difference between two players in a given category then divide it by the appropriate deduction rate.

Rbat: Murray created 392 runs above average with his bat; Palmeiro created 431. The difference is 39 runs, and the deduction rate is 1 point subtracted per every 8.5 runs of difference: 39 / 8.5 = 4.6 points. Knock that off of Palmeiro’s current 1,000 score for a new total of 995.

You get the idea, so I’ll cut to the chase. Murray and Palmeiro have one run of difference in baserunning, three runs of difference in DP avoidance, thirteen runs difference in fielding, three wins difference in wins above average, and a 771 difference in plate appearances. These are remarkably small differences for players with more than 12,000 plate appearances each. Putting it together, Palmeiro’s contemporary similarity score for Murray is 969. (Murray and Palmeiro are one another’s top comps in traditional sim score as well at 885.)
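The whole Murray and Palmeiro calculation fits in a short Python sketch. It uses the rounded differences and rounded deduction rates quoted in this article, so it lands at about 971 rather than exactly on the 969 the spreadsheet produces. I assume the small gap comes from rounding in the quoted figures. The function name is hypothetical.

```python
# Deduction rates quoted in the article: units of difference per point lost.
DEDUCTION_RATES = {
    "Rbat": 8.5, "Rbaser": 0.9, "Rdp": 0.5,
    "Rfield": 2.4, "WAA": 0.8, "PA": 91.5,
}


def score_from_differences(differences, position_deduction=0.0):
    """Start at 1,000, then subtract the positional and category deductions."""
    score = 1000.0 - position_deduction
    for category, diff in differences.items():
        score -= abs(diff) / DEDUCTION_RATES[category]
    return score


# Rounded Murray-vs-Palmeiro differences as given in the text.
murray_vs_palmeiro = {
    "Rbat": 39, "Rbaser": 1, "Rdp": 3,
    "Rfield": 13, "WAA": 3, "PA": 771,
}

# Both were first basemen, so the positional deduction is zero.
score = score_from_differences(murray_vs_palmeiro)
print(round(score, 1))
```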

Interpreting contemporary similarity scores: With fewer categories and somewhat less variance within each category than counting stats, contemporary similarity scores are on a slightly different scale than traditional scores. Depending on the players involved, scores will come out 25–50 points higher than the trad scores on BBREF. More players will come out with high similarity ratings, so adjust your expectations.

Using contemporary similarity scores: Bill James emphasizes in The Politics of Glory that the weight of the evidence is of primary importance in constructing a credible case for a Hall. That includes similarity scores. If a player’s list of most similar players contains a lot of Hall members, it says something important, though not definitive about that player. If the player is superior to, inferior to, or smack in the middle of his sims that, too, says something important. While we know Eddie Murray is a Hall-level player, do contemporary similarity scores support the idea? Yes, of course. His top ten in rank order include Palmeiro, Todd Helton (941), Jake Beckley (925), John Olerud (911), Zack Wheat (907), Sammy Sosa (904), Fred McGriff (904), Dwight Evans (900), Al Simmons (896), and Fred Clarke (892). For me, Murray rates as just over the in/out line at first base, and these comps absolutely support that ranking. They include seven members of the Hall of Miller and Eric, most of whom are lower-quartile inductees at their position. The comps also include three players very close to the borderline (Beckley, Olerud, and McGriff). All told, the list suggests Murray’s downside is being among the first guys outside the Hall and his upside being about the fifteenth best at his position.

Again, we hope you’ll play with the spreadsheet linked at the top of this article and hunt down some cool comp lists of your own.

Test Driving Contemporary Similarity Scores

Let’s put the contemporary sim score through its paces by picking up where Bill James left off in The Politics of Glory. In his chapter on similarity scores, he examined a handful of borderline batters. We’ll use those guys as the basis for comparing and contrasting the results the traditional and contemporary systems return. In each list below, the left column shows a player’s traditional comps and the right column his contemporary comps. Asterisks indicate that the player is a member of the Hall of Miller and Eric.

Johnny Bench

1  Yogi Berra*    (907)   Gabby Hartnett* (955)
2  Gary Carter*   (882)   Gary Carter*    (925)
3  Carlton Fisk*  (866)   Carlton Fisk*   (923)
4  Mike Piazza*   (828)   Bill Dickey*    (922)
5  Miguel Tejada  (808)   Buster Posey    (912)
6  Aramis Ramirez (799)   Joe Mauer       (908)
7  Jorge Posada   (797)   Buck Ewing*     (900)
8  Dale Murphy    (797)   Ivan Rodriguez* (896)
9  Ron Santo*     (794)   Russell Martin  (891)
10 Matt Williams  (793)   Yogi Berra*     (887)

In common: Berra, Carter, Fisk
Traditional scores show their weaknesses here. Because Bench hit like a cornerman, he ends up compared to several players at other positions who are not remotely comparable. Two of the catchers he’s comped against had terrible stolen-base success rates against them, unlike the cannon-armed Bench. The contemporary sims stick with backstops, spit out seven of the top ten catchers of all time, all of whom had good gloves, and generally read like a useful list. What does Hartnett’s comparability to Bench look like? Acknowledging that we’re not yet in possession of all Gabby’s baserunning and DP value, here’s the direct comparison:

         Rbat  Rbaser  Rdp  Rfield  WAA   PA
Bench     269     -2   -15    72     47  8674 
Hartnett  232    -10   -16    78     36  7297

Pretty damned close.
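To see how those two rows turn into a score, here is a hedged Python sketch that runs the full comparison from the table. It relies on the rounded deduction rates and the rounded career totals shown in this article, so it comes out near 953 instead of the 955 in the comp list. I assume the difference is rounding. The `Batter` class and the function name are hypothetical conveniences, not part of the spreadsheet.

```python
from dataclasses import dataclass

POSITION_VALUES = {
    "C": 20, "SS": 14, "2B": 11, "3B": 7, "CF": 5,
    "RF": 4, "LF": 3, "1B": 1, "DH": 0,
}
# Rounded deduction rates quoted earlier in the article.
DEDUCTION_RATES = {
    "Rbat": 8.5, "Rbaser": 0.9, "Rdp": 0.5,
    "Rfield": 2.4, "WAA": 0.8, "PA": 91.5,
}


@dataclass
class Batter:
    name: str
    pos: str
    Rbat: float
    Rbaser: float
    Rdp: float
    Rfield: float
    WAA: float
    PA: float


def contemporary_sim_score(a, b):
    """Score player B against player A: position first, then six categories."""
    score = 1000.0
    score -= abs(POSITION_VALUES[a.pos] - POSITION_VALUES[b.pos]) * 12
    for category, rate in DEDUCTION_RATES.items():
        score -= abs(getattr(a, category) - getattr(b, category)) / rate
    return score


# Career lines copied from the table above.
bench = Batter("Johnny Bench", "C", 269, -2, -15, 72, 47, 8674)
hartnett = Batter("Gabby Hartnett", "C", 232, -10, -16, 78, 36, 7297)

score = contemporary_sim_score(bench, hartnett)
print(round(score))
```

Since every deduction is an absolute difference, the score is the same whichever player is treated as A.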

Willie McCovey

1  Fred McGriff     (887)   Jim Thome*       (942)
2  Willie Stargell  (884)   Willie Stargell  (940)
3  Harmon Killebrew (880)   Jason Giambi     (934)
4  Jason Giambi     (873)   Lance Berkman    (927)
5  Eddie Mathews*   (858)   Carlos Delgado   (923)
6  Paul Konerko     (846)   Jesse Burkett*   (917)
7  Carlos Delgado   (844)   Harmon Killebrew (917)
8  Mike Schmidt*    (841)   Sam Crawford*    (915)
9  Jose Canseco     (821)   Jim Bottomley    (908)
10 Jim Thome*       (813)   Fred McGriff     (907)

In common: Fred McGriff, Willie Stargell, Jason Giambi, Carlos Delgado, Jim Thome, Harmon Killebrew
While McCovey was different esthetically in many ways from his comps, these are nonetheless the players he belongs with: Big sluggers who couldn’t win a footrace against a lamp post. The fact that eight of these players are lefthanders like McCovey is purely coincidental but interesting. The differences in the systems once again show the relative strengths and weaknesses of the approaches. Mike Schmidt, Eddie Mathews, and Jose Canseco are very poor comps for McCovey. Schmidt was an athletic, Gold Glove third baseman who might have played shortstop in modern times. Mathews could field, and he didn’t clog up the bases. Canseco had speed early in his career whereas McCovey never did. On the other side of the ledger, Sam Crawford, as we mentioned earlier, packed Reggie Jackson’s offense and, unfortunately, Reggie’s glove too. The nature of Wahoo Sam’s contributions is very similar to McCovey’s. Jesse Burkett feels like a bit of a reach, but his WAA-component stats fit this particular mold. Bottomley, well, he’s not a great comp.

McCovey’s comps, according to my own rankings, are generally lesser players than he (Thome, Burkett, and Crawford being the exceptions). What does this say about McCovey then? First off, it suggests that he may be at the head of a family of players who are of the type I noted a moment ago: Slow, poor-fielding, slugging first basemen. This is, of course, a common stereotype for first basemen in the post-Sisler era, and perhaps especially after the second World War. McCovey and/or Thome are the sort of Platonic ideal of this type of player. Being better than your comps may have other subtle meanings. As we’ll see with Bobby Grich in a few moments, sometimes being superior to your most similar matches indicates a degree of uniqueness or of an ability to combine skills in a way that’s unusual. Of course, it’s generally a good thing to be the best player on any list at all, comp lists included. In McCovey’s case, this counts for a win.

Johnny Kling and Heinie Peitz
James noted in The Politics of Glory that Kling and Peitz were the most similar players he’d come across, so I followed up on them with contemporary similarity scores. Kling’s best contemporary comp is Manny Sanguillen (973) with Peitz (970) a close second. However, Peitz’s closest contemporary comp is Ivey Wingo (966). Peitz and Wingo are tied for the closest match I’ve found so far along with several pairs of short-career corner guys. Johnny Kling is just off Peitz’s list (970).

Reggie Smith and Fred Lynn
James also showed in The Politics of Glory that Reggie Smith and Fred Lynn were unusually similar players in his system. Let’s break them both out here, Smith first and then Lynn.

1  Fred Lynn     (959)   Dwight Evans*  (945)
2  Shawn Green   (945)   Joe Medwick    (939)
3  Brian Giles   (926)   Minnie Miñoso* (938)
4  Bobby Bonilla (924)   Chet Lemon     (936)
5  Ellis Burks   (913)   Sherry Magee*  (934)
6  Matt Holliday (908)   Elmer Flick*   (933)
7  Derrek Lee    (906)   Bobby Bonds*   (932)
8  Del Ennis     (905)   Al Simmons*    (931)
9  Paul O’Neill  (905)   Hugh Duffy     (928)
10 Jermaine Dye  (902)   Tony Oliva     (925)

In common: None
Smith’s athleticism doesn’t come through among the traditional comps. The contemporary sims ring truer with more speed and athleticism and a raft of good gloves to match Smith’s +78 leather work. Additionally, the overall quality of the players improves markedly in the WAA-components-based list. We’ve inducted Smith into the Hall of Miller and Eric, as have the Halls of Merit and Stats. He merits better comps than trad stats can give him. Incidentally, this list is based on Smith as a rightfielder. If we ran him as a centerfielder, Evans moves to fourth and Lemon up to first. Lynn does not appear.

1  Reggie Smith* (959)   Kirby Puckett     (970)
2  Bobby Bonilla (940)   Ellis Burks       (966)
3  Shawn Green   (937)   Andrew McCutchen  (961)
4  Brian Giles   (920)   Jim Wynn*         (958)
5  Derrek Lee    (916)   Edd Roush         (956)
6  Raul Ibanez   (916)   Darryl Strawberry (953)
7  Jermaine Dye  (912)   Jim O’Rourke*     (948)
8  Paul O’Neill  (911)   Cy Williams       (947)
9  Greg Luzinski (910)   Earle Combs       (947)
10 Del Ennis     (910)   Wally Berger      (944)

In common: None
Unsurprisingly, since the trad-stats version sees Smith and Lynn so similarly, they have mostly similar comps, and again the comps don’t do justice to Lynn. The contemporary version, on the other hand, has more dynamic players, though it may overrate the company Lynn should keep. A note: The presence of O’Rourke reminds me that Nineteenth Century batters’ playing time is not prorated to 154 or 162. In fact, no one’s playing time is changed. O’Rourke was a better player than Lynn; he simply played when schedules were shorter. If we swapped him out, George Van Haltren (944) would jump onto Lynn’s list.

Phil Rizzuto

1  Art Fletcher*  (919)   Scott Fletcher    (951)
2  Billy Rogell   (916)   Terry Turner      (945)
3  Elvis Andrus   (913)   Travis Jackson    (944)
4  Billy Jurges   (910)   Jose Valentin     (940)
5  Claude Ritchey (909)   Dave Bancroft*    (936)
6  Lonny Frey     (907)   Joe Tinker*       (934)
7  Marty Marion   (906)   Art Fletcher*     (933)
8  Doggie Miller  (906)   Marty Marion      (928)
9  Lyn Lary       (905)   Roger Peckinpaugh (925)
10 Jose Offerman  (904)   Freddy Parent     (922)

In common: Art Fletcher, Marion
When I first glanced at these, I didn’t feel it, and I suspected that Lonny Frey most closely resembled a Rizzuto-type player: slightly below average bat, good to great running, and outstanding fielding. Actually, Frey contributed 60 more batting runs and 20 more DP runs than Rizzuto, but his defense was half as valuable as Scooter’s. The overall value may be similar, but the shape of their contributions and their positions were just different enough to keep Lonny off the guest list. (If Frey were a shortstop, however, he’d be Rizzuto’s fifth most comparable player). The Rizzuto rooters out there will take heart that the caliber of player returned by the contemporary method is a notch or two higher than the trad-stat matches. Some of the guys on the contemporary list had an MVP-type season in them and (would have) popped out some All-Star-level seasons just as Rizzuto did. Other than Art Fletcher and a season or two of Frey’s, the trad-stats guys generally don’t play up to that level. They underrate Rizzuto, whereas the contemporary comps place him below the line but include enough lower-quartile Hall players to show a more appropriate range of peers.

Dick Allen

1  Ryan Braun    (935)   Jason Giambi    (915)
2  Lance Berkman (903)   Hank Greenberg* (908)
3  Reggie Smith* (894)   Roy Sievers     (903)
4  Ellis Burks   (891)   Matt Holliday   (901)
5  Brian Giles   (890)   Johnny Mize*    (894)
6  Nelson Cruz   (885)   Bob Watson      (892)
7  Jermaine Dye  (881)   Brian Giles     (891)
8  George Foster (880)   Frank Howard    (889)
9  Fred Lynn     (876)   Mark McGwire*   (888)
10 Tim Salmon    (876)   Ryan Braun      (882)

In common: Braun, Giles
Allen befuddles both systems, and I think it might stem from his hybrid first-base/third-base career and his run-starved hitting context. Contemporary sim scores bring players like Greenberg, Mize, and McGwire forth as comps. Because we rate Allen among the top fifteen to twenty first basemen in history, the appearance of those players makes good sense to us and leads us to believe the contemporary comps may provide more useful information in discussing Allen. If we used third base instead of first, his comps wouldn’t make a lot of sense (top comp: Bill Madlock). That’s because no one who played third base for a long time and generated as much offense as Allen did (435 batting runs) stuck at the hot corner with a glove as poor as Allen’s.

Richie Ashburn

1  Brett Butler     (911)   Jose Cruz*        (935)
2  Lloyd Waner      (894)   Enos Slaughter*   (935)
3  Doc Cramer       (878)   Curtis Granderson (916)
4  Harry Hooper*    (854)   J.D. Drew         (910)
5  George Burns1    (853)   Vada Pinson       (909)
6  Stan Hack        (851)   Andre Dawson*     (904)
7  Fred Tenney      (849)   Wally Moses       (899)
8  Jimmy Sheckard*  (843)   Larry Doby*       (898)
9  Willie Wilson    (842)   Willie Davis*     (897)
10 Charlie Jamieson (833)   Dixie Walker      (895)

In common: None
Butler and Ashburn always seemed like highly similar players to me. Both had speed, both walked a lot, neither could leave the park without a car. Those things remain true, but a direct comparison of the structure of their value tells a more nuanced story:

         Rbat  Rbaser  Rdp  Rfield  WAA   PA  SIM SCORE
Ashburn   205      8    31    77     29  9736 
Butler    188     37    33   -84     21  9545    877

Their offensive similarities carried undue weight in my mind, and that’s understandable. I watched Butler play while growing up, and I had no solid way to know whether he played good defense. I had even less help with his baserunning; all I could do was look up his stolen base totals. With better information, I can now see that while they do, indeed, rate as extremely similar at the plate (Rbat and Rdp), Ashburn didn’t run the bases nearly as effectively as Butler, and Butler couldn’t hold a candle to Ashburn defensively. Twins in my mind, more like mere siblings or first cousins in reality.
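A per-category breakdown makes the point concrete. The sketch below is illustrative only: it applies the rounded deduction rates quoted earlier to the rounded totals in the table and treats both men as centerfielders, so there is no positional deduction. With those inputs the total comes out around 883, a few points above the 877 in the table, which I assume is down to rounding. What matters here is the split, in which fielding and baserunning account for nearly all of the lost points.

```python
# Rounded deduction rates quoted earlier in the article.
DEDUCTION_RATES = {
    "Rbat": 8.5, "Rbaser": 0.9, "Rdp": 0.5,
    "Rfield": 2.4, "WAA": 0.8, "PA": 91.5,
}

# Career lines copied from the table above.
ashburn = {"Rbat": 205, "Rbaser": 8, "Rdp": 31, "Rfield": 77, "WAA": 29, "PA": 9736}
butler = {"Rbat": 188, "Rbaser": 37, "Rdp": 33, "Rfield": -84, "WAA": 21, "PA": 9545}


def deductions_by_category(a, b):
    """Points lost in each category when comparing player B to player A."""
    return {
        category: abs(a[category] - b[category]) / rate
        for category, rate in DEDUCTION_RATES.items()
    }


breakdown = deductions_by_category(ashburn, butler)
score = 1000.0 - sum(breakdown.values())

# Largest deductions first.
for category, points in sorted(breakdown.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:7s} {points:6.1f}")
print("score  ", round(score, 1))
```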

The trad-stat list turns up some very odd names. Lloyd Waner only walked about once a week and hit an empty .300, same for Doc Cramer to a slightly lesser degree. Hooper and Burns figure in because Ashburn hit like a deadball player. Very odd to see Fred Tenney, a first baseman, matched to Ashburn, and Charlie Jamieson seems to come from nowhere. I actually like Hack and Sheckard very much as comps. Had Ashburn played third base, Hack would be his fifth most comparable player.

The contemporary sim score players feel a little scattered to me. Some power-hitting centerfielders then Dixie Walker, Willie Davis, and Jose Cruz. But again, I find myself thinking about how players look, the style components of their game. Does that actually matter? If I argue for Brett Butler that he looked like Rich Ashburn, is that an effective argument in favor of electing Butler to the HoME? Not likely: It’s too laden with context and personal perceptions to be useful. If I argue for Ashburn that he used his small-ball game to create the same amount of value that five other members of the Hall of Miller and Eric did, is that a useful piece of evidence in his favor? Yes, I think it is.

Doc Cramer

1  Richie Ashburn*  (878)   Lloyd Waner   (904)
2  Nellie Fox       (867)   Ruben Sierra  (894)
3  Harry Hooper*    (866)   Gus Bell      (888)
4  Max Carey*       (851)   Juan Pierre   (885)
5  Lloyd Waner      (845)   Enos Cabell   (880)
6  Lave Cross       (845)   Sam West      (879)
7  Brett Butler     (842)   Dave Philley  (879)
8  Willie McGee     (839)   Brian McRae   (879)
9  Paul Hines*      (834)   Rick Manning  (877)
10 Red Schoendienst (832)   Willie McGee  (876)

In common: Waner, McGee
These two lists provide a fascinating contrast between style and substance. Cramer’s play, all those hits he racked up, added up to a whopping -19 wins above average. That’s really bad for a player this close to 10,000 plate appearances. Cramer couldn’t hit—he didn’t draw walks or hit for power—couldn’t field, ran the bases about as well as an average player, but did well at avoiding double plays. His reputation for footspeed suggests to me that, in the vein of Bernie Williams, he had difficulty turning his raw speed into baseball skills. Using traditional stats gets a comp list that matches Cramer’s lack of power and his speedy reputation. On the other hand, Hooper and Carey played in the deadball era and were far greater players than Cramer. Hines played in the 1870s and 1880s when homers were infrequent. Lloyd Waner is the closest thing to a similar hitter to Cramer: An empty .300 batter with negative value against average in his career. Neither side sees all that many truly similar players to Doc Cramer, but look at how different the players in the list on the right are. Ruben Sierra? Think of him as an empty .300 hitter who trades singles for homers. He isn’t nearly as awful a hitter as Cramer, but he had huge holes in his game because of his approach to the strike zone, just like Cramer did (they were both terrible fielders too). This example boils down to the esthetic comps of traditional similarity scores or the results-oriented comps of contemporary similarity scores. And those esthetics aren’t really that accurate in the first place.

Dwight Evans

1  Luis Gonzalez   (904)   Zack Wheat*    (948)
2  Torii Hunter    (890)   Reggie Smith*  (945)
3  Chili Davis     (885)   Fred Clarke*   (942)
4  Billy Williams* (870)   Sammy Sosa*    (941)
5  Bobby Abreu     (865)   Andre Dawson*  (941)
6  Tony Perez      (862)   Al Simmons*    (938)
7  Dave Parker     (854)   Goose Goslin*  (937)
8  Carlos Beltran  (851)   Willie Keeler* (935)
9  Darrell Evans*  (848)   Joe Medwick    (925)
10 Al Kaline*      (847)   Paul Waner*    (923)

In common: None
The trad-stat comps underrate Evans’ place in history just a little bit. He appears to straddle the in/out line with several Hall-level players and several Hall of the Very Good players surrounding him. His contemporary-score counterparts reveal a player who merits induction into whatever Hall you got. In context, Evans racked up more than 400 runs above an average player, a total that screams Elect me! Or would to voters who paid attention to anything beyond career counting-stat totals. Evans belongs in great company and gets his due from the WAA-component method.

Bobby Grich

1  Toby Harrah        (909)   Bobby Doerr*     (937)
2  Brandon Phillips   (898)   Joe Gordon*      (927)
3  Jay Bell           (895)   Dustin Pedroia   (925)
4  Bret Boone         (893)   Billy Herman*    (916)
5  Jhonny Peralta     (884)   Lou Boudreau*    (914)
6  Chase Utley        (883)   Ian Kinsler      (913)
7  Sal Bando*         (880)   Robinson Cano    (913)
8  Ian Kinsler        (876)   Cupid Childs*    (906)
9  Travis Fryman      (872)   Hardy Richardson (904)
10 Ray Durham         (869)   Johnny Evers     (902)

In common: Kinsler
Can no justice be done for Bobby Grich? The traditional comps dramatically undersell his candidacy. The neo-stats comps merely undersell it. Grich has an unusual profile. Keystone men who hit about as well as Grich don’t field nearly as well as he did. Second basemen who field as well as Grich don’t hit nearly as well as he did. It’s easier to see in a table:

            Rbat  Rbaser  Rdp  Rfield  WAA    PA  SIM SCORE
Grich        256       4  -11      82   44  8220        ---
Doerr        144      -2  -12      43   27  8028        937
Gordon       141       5   -9     150   37  6535        927
Pedroia      130       7   -3      97   29  6777        925
Herman       148       8    3      55   27  8638        916
Boudreau     194      -3   -8     118   42  7025        914
Kinsler       80      39  -11      99   30  8299        913
Cano         282      -6   -1       5   37  9264        913
Childs       246       0    0      28   22  6766        906
Richardson   214       0    0      54   21  6044        904
Evers         80       4    0     127   24  7220        902

Grich generated more value than all of his most similar players because he alone among them combined a strong bat with a strong glove. To find players who managed to do the same, you have to look upward in the all-time rankings at second base for names like Lajoie, Collins, and Gehringer. It says something noteworthy when a player is superior to all his nearest comps. In Grich’s case it says he occupies a uniquely important place among second basemen.
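To make that claim concrete, here is a minimal sketch that checks it against the numbers in the table above. The cutoffs for a "strong" bat and a "strong" glove are hypothetical values picked only for illustration; they are not part of the contemporary sim-score method.

```python
# (Rbat, Rbaser, Rdp, Rfield, WAA, PA) for Grich and his ten
# contemporary comps, copied from the table above.
players = {
    "Grich":      (256,  4, -11,  82, 44, 8220),
    "Doerr":      (144, -2, -12,  43, 27, 8028),
    "Gordon":     (141,  5,  -9, 150, 37, 6535),
    "Pedroia":    (130,  7,  -3,  97, 29, 6777),
    "Herman":     (148,  8,   3,  55, 27, 8638),
    "Boudreau":   (194, -3,  -8, 118, 42, 7025),
    "Kinsler":    ( 80, 39, -11,  99, 30, 8299),
    "Cano":       (282, -6,  -1,   5, 37, 9264),
    "Childs":     (246,  0,   0,  28, 22, 6766),
    "Richardson": (214,  0,   0,  54, 21, 6044),
    "Evers":      ( 80,  4,   0, 127, 24, 7220),
}

# Hypothetical thresholds (an assumption, for illustration only).
STRONG_BAT = 200    # career batting runs
STRONG_GLOVE = 75   # career fielding runs


def two_way(table, bat=STRONG_BAT, glove=STRONG_GLOVE):
    """Names of players who clear both the bat and the glove cutoff."""
    return [name for name, (rbat, _, _, rfield, _, _) in table.items()
            if rbat >= bat and rfield >= glove]


def waa_leader(table):
    """Name of the player with the most wins above average."""
    return max(table, key=lambda name: table[name][4])


print(two_way(players))     # who has both a strong bat and a strong glove
print(waa_leader(players))  # who leads the group in WAA
```

Under these cutoffs only Grich clears both bars, and he also leads the group in WAA. Move the cutoffs far enough and other names will qualify, so treat this as a way of reading the table rather than a ranking.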

Charlie Grimm

1  Ed Konetchy        (908)   Bill Buckner     (975)
2  Joe Judge          (897)   Jim Spencer      (946)
3  Stuffy McInnis     (892)   Stuffy McInnis   (942)
4  Hal Chase          (884)   Ed Kranepool     (941)
5  Joe Kuhel          (883)   Chris Chambliss  (938)
6  Wally Pipp         (882)   John Mabry       (931)
7  Willie McGee       (877)   Dots Miller      (929)
8  Chris Chambliss    (875)   Adam LaRoche     (927)
9  Phil Cavarretta    (873)   Tino Martinez    (925)
10 George Burns2      (864)   Lyle Overbay     (925)

In common: McInnis, Chambliss
Although the fellows in the trad-stat column look like good matches, they lean very heavily toward deadball first basemen, plus one guy who played much of his career in below-average to historically neutral run contexts (Chambliss) and another who hit in contexts similar to Chambliss’s but in a pitcher’s park on a team that played like deadballers (McGee). In the righthand column, the matches show a wider variety of eras. Grimm has a very unusual profile for a long-tenured first baseman: Below average batter, slightly above average fielder and nothing really distinguishing about him. His career lasted a couple-few thousand plate appearances longer than it should have. Hey, that sums up Buckner pretty well too.
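The "In common" line under each pair of lists appears to be nothing more than the names that show up in both top tens. A minimal sketch using Grimm's two lists as given above (the function name is made up for this example):

```python
# Grimm's top-ten comps, copied from the lists above.
trad = [
    "Ed Konetchy", "Joe Judge", "Stuffy McInnis", "Hal Chase", "Joe Kuhel",
    "Wally Pipp", "Willie McGee", "Chris Chambliss", "Phil Cavarretta",
    "George Burns2",
]
contemporary = [
    "Bill Buckner", "Jim Spencer", "Stuffy McInnis", "Ed Kranepool",
    "Chris Chambliss", "John Mabry", "Dots Miller", "Adam LaRoche",
    "Tino Martinez", "Lyle Overbay",
]


def in_common(left, right):
    """Names appearing in both lists, kept in the left list's order."""
    right_set = set(right)
    return [name for name in left if name in right_set]


print(in_common(trad, contemporary))
```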

Stan Hack

1  Harvey Kuenn    (897)   Fred Lynn         (936)
2  George Burns1   (885)   Darrell Evans*    (931)
3  Buddy Myer      (883)   Bob Elliott       (930)
4  Paul Hines*     (875)   Deacon White*     (930)
5  Wally Moses     (874)   Curtis Granderson (926)
6  Jim Gilliam     (867)   Sal Bando*        (921)
7  Jimmy Collins*  (859)   Kirby Puckett     (923)
8  Placido Polanco (857)   Edd Roush         (919)
9  Billy Herman*   (853)   Heinie Groh*      (919)
10 George Kell     (853)   Buddy Myer        (917)

In common: Myer
Hack’s not far from the in/out line, and the contemporary sim-score list gets at that by showing three of the lowest ranking HoMErs at third base.

Babe Herman

1  Bob Meusel   (905)   King Kelly*   (966)
2  Ken Williams (902)   Mike Tiernan  (963)
3  Chick Hafey  (873)   Gavy Cravath  (957)
4  Tony Oliva   (867)   Sam Thompson  (952)
5  Bing Miller  (866)   John Titus    (952)
6  Carl Furillo (860)   Ralph Kiner   (951)
7  Earl Averill (859)   Hack Wilson   (951)
8  Jeff Heath   (854)   Nelson Cruz   (949)
9  George Kelly (848)   Jack Fournier (947)
10 Hal Trosky   (846)   Cy Williams   (945)

In common: None
If you don’t like King Kelly, who caught a lot, topping Herman’s list, knock out King and add Tim Salmon to the tenth spot with his 947 contemporary sim score. Others nearby: Oyster Burns (946), Darryl Strawberry and Roy Thomas (944), and Ken Williams (943). The traditional matches make Herman seem more unique than he actually is. Their low similarities have much to do with when he played and the fact that fielding is excluded from the traditional similarity formulation. With fielding incorporated into the WAA-component score, Floyd Caves Herman looks remarkably similar to a host of lumbering corner outfielders who could hit but who didn’t hang on long enough to merit discussion among the all-timers. In other words, players just like him.

Gil Hodges

1  Norm Cash         (930)   Wally Joyner  (978)
2  George Foster     (926)   Don Mattingly (971)
3  Tino Martinez     (919)   Joe Judge     (964)
4  Jack Clark        (911)   Ed Konetchy   (963)
5  Mark Teixeira     (901)   Lu Blue       (961)
6  Edwin Encarnacion (900)   Kent Hrbek    (959)
7  Boog Powell       (899)   Dan McGann    (954)
8  Rocky Colavito    (898)   Harry Davis   (953)
9  Joe Adcock        (895)   George Burns2 (950)
10 Lee May           (893)   Jake Daubert  (949)

In common: None
Hodges’ value-based comps surprised me: Nine guys with line-drive power who topped out around thirty homers if they played in the right eras. With two forty-homer seasons and another four in the thirties, Hodges doesn’t seem like that kind of player at all. It’s the ballpark. In 2,657 at-bats at Ebbets Field before the team moved west, Hodges crushed 172 homers, one every fifteen at-bats. In every other park in the league during the same span, he batted 2,878 times and homered 126 times, one every twenty-three at-bats. That’s a pretty extreme split, more extreme, for example, than Jim Rice (once every twenty at-bats at home; once every twenty-four on the road) or Norm Cash (once every fifteen at-bats at home; once every twenty-one away). Both of them played in parks well suited to their power and handedness just as Hodges did. Away from Ebbets Field, Hodges tended to be a .280/.360/.470 kind of hitter. Don Mattingly’s career slash line was .307/.358/.471. Kent Hrbek’s was .282/.367/.481 (while playing home games at the Homerdome). Wally Joyner’s was .289/.362/.440, and he played in some tough parks. Contemporary similarity scores adjust for home park, which takes the air out of Hodges’ Ebbets-driven offensive numbers so that his real level of overall offensive production more closely resembles the hitter he was on the road. The trad-stats system gobbles up the home cookin’ so that Hodges looks like he’s not far below the in/out line at first base. In my opinion, he’s another tier down from that.
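The home/road arithmetic is easy to check. The sketch below recomputes Hodges’ at-bats per homer from the totals quoted above and sets his split beside the rounded Rice and Cash rates from the same paragraph. The helper name and the road-to-home ratio are just one convenient way to express "how extreme"; they are not a statistic the sim scores use.

```python
def at_bats_per_homer(at_bats, homers):
    """Average number of at-bats between home runs."""
    return at_bats / homers


# Hodges' totals as quoted in the paragraph above.
hodges_home = at_bats_per_homer(2657, 172)  # Ebbets Field
hodges_road = at_bats_per_homer(2878, 126)  # every other park

# Rice and Cash: only the rounded rates are given in the text,
# stored as (home, road) at-bats per homer.
splits = {
    "Hodges": (hodges_home, hodges_road),
    "Rice": (20, 24),
    "Cash": (15, 21),
}

for name, (home, road) in splits.items():
    # A ratio above 1 means homers came more often at home.
    print(f"{name}: {home:.1f} at home, {road:.1f} away, "
          f"ratio {road / home:.2f}")
```

By this measure Hodges’ split is wider than either Rice’s or Cash’s, which matches the comparison in the text.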

Bob Johnson

1  Brian Giles     (922)   Vladimir Guerrero*(961)
2  Matt Holliday   (913)   Joe Medwick       (952)
3  Magglio Ordóñez (908)   Joe Kelley        (946)
4  Moises Alou     (907)   Jack Clark        (945)
5  Ellis Burks     (905)   George Foster     (939)
6  Del Ennis       (900)   Moises Alou       (922)
7  Reggie Smith*   (896)   Al Simmons*       (931)
8  Will Clark      (895)   Matt Holliday     (930)
9  Bernie Williams (893)   Jose Canseco      (927)
10 Chuck Klein     (892)   Rocky Colavito    (926)

In common: Holliday, Alou
Both systems identify Johnson as a borderline Hall candidate. The trad stats treat him a bit more roughly than contemporary sims do. For what it’s worth, Johnson sits just above Medwick and Kelley in my left-field rankings, and Guerrero, his top comp, is on the line in right field.

Bill Mazeroski

1  Frank White     (913)   Frank White       (971)
2  Bill Russell    (883)   Placido Polanco   (942)
3  Leo Cardenas    (872)   Bobby Lowe        (940)
4  Chris Speier    (871)   Hughie Critz      (938)
5  Jim Fregosi     (869)   Lou Bierbauer     (925)
6  Royce Clayton   (868)   Craig Counsell    (923)
7  Tony Taylor     (864)   Fred Pfeffer      (919)
8  Phil Garner     (854)   Bid McPhee*       (918)
9  Terry Pendleton (852)   Mark Ellis        (917)
10 Garry Templeton (851)   Roger Peckinpaugh (916)

In common: White
Frank White just had to show up as Maz’s number one comp in both lists, right? It must be a law. If Craig Counsell were right-handed, he’d challenge for the top spot, but as a lefty he avoided double plays much more often than Mazeroski did. The traditional counting stats don’t quite know what to make of Maz. They pair him up with shortstops, probably because his hitting was bad enough and his fielding excellent enough that he could get away with hitting like one. The value components see him as similar to great fielding second basemen who couldn’t hit and to one shortstop. Both lists show why Mazeroski doesn’t belong in the Hall of Fame unless you make a gigantic allowance for the nebulous concept of Fame or overinflate the value of his abilities as a pivotman.

Vada Pinson

1  Steve Finley      (909)   Steve Finley      (976)
2  Johnny Damon      (906)   Brett Butler      (937)
3  Roberto Clemente* (869)   Willie McGee      (919)
4  Al Oliver         (864)   Enos Slaughter*   (915)
5  Willie Davis*     (858)   Mickey Rivers     (911)
6  Dave Parker       (847)   Richie Ashburn*   (909)
7  Steve Garvey      (838)   Brady Anderson    (908)
8  Zack Wheat*       (834)   Curtis Granderson (897)
9  Bill Buckner      (833)   Ken Griffey       (896)
10 Torii Hunter      (830)   Bobby Abreu       (895)

In common: Finley
Vada Pinson confuses both systems, and with good reason. Despite 256 homers, Pinson’s offense only adds up to 145 batting runs. That’s fewer than Brett Butler for Pete’s sake. On the bases, he could motor, as his 300 steals and 127 triples attest, and he ran up 28 runs on the bases. Like Butler, he avoided DPs very, very well. He was a slightly below average fielder. You’d never guess looking at all those career homers that it all added up to something that looked like a leadoff man. The trad stats massively overreach on Clemente, and we can chalk that up to their lack of defensive inputs. On the other side, Pinson’s value profile and trajectory are very unusual and result in something of a grab-bag of types of players who only seem to share their athleticism with him. Bill James noted in The Politics of Glory that Pinson is a very difficult player to comp.

Ron Santo

1  Aramis Ramirez (881)   Ron Cey        (926)
2  Scott Rolen*   (877)   Bob Elliott    (909)
3  Dale Murphy    (875)   Vlad Guerrero* (904)
4  Ken Boyer*     (874)   H.R. Baker*    (900)
5  Gary Gaetti    (874)   Bob Johnson1*  (895)
6  Ruben Sierra   (866)   Rocky Colavito (890)
7  Chili Davis    (865)   David Wright   (889)
8  Bobby Bonilla  (864)   Darrell Evans* (886)
9  Brian Downing  (862)   Robinson Cano  (885)
10 Graig Nettles* (860)   Troy Glaus     (884)

In common: None
Remembering that the contemporary approach comes in 25–50 points higher than the trad-stats approach, we see that neither system finds Santo all that comparable to anyone. For style points, the trad stats come up with Ken Boyer, Scott Rolen, and Graig Nettles, all players Santo clearly beats out in the rankings at third base. The contemporary scores list has Home Run Baker, who ranks just a smidgen behind Santo.

Ken Williams

1  Jeff Heath     (926)   Mike “Elmer” Smith (971)
2  Chick Hafey    (919)   Tip O’Neill        (967)
3  Wally Berger   (912)   Jeff Heath         (963)
4  Hack Wilson    (903)   Ross Youngs        (962)
5  Babe Herman    (902)   Bobby Veach*       (961)
6  Hal Trosky     (890)   Charley Jones      (959)
7  Mike Sweeney   (888)   Topsy Hartsel      (958)
8  Pedro Guerrero (888)   Kip Selbach        (957)
9  Bob Meusel     (882)   Wally Berger       (956)
10 Jackie Jensen  (877)   George Gore        (955)

In common: Heath, Berger
As we saw earlier with Babe Herman, the counting stats imply that Williams isn’t all that similar to anybody and mostly to players from his own time (the live-ball era). On the other hand, a WAA-components approach sees him as very, very similar to a lot of players, though they tend to be deadball and Nineteenth Century hitters. Players from the early days tend to have very low variation in their baserunning totals because BBREF uses a regression equation to predict baserunning from stolen base records. Naturally that will draw players closer to the mean and one another, giving them an outward resemblance to Williams’ basically neutral baserunning. Neither Williams’ career nor anyone’s before his has any double-play avoidance value because play-by-play data isn’t yet fully available. If you want some more modern-looking comps for Williams, his next ten matches include Roy Thomas (955); Riggs Stephenson and Joe Kelley (954); Sam Thompson and Mike Tiernan (952); King Kelly and Gavy Cravath (951); Roy Cullenbine, Tommy Holmes, and Chick Hafey (950).

A New Day for Sim Scores
We set out to rescue similarity scores, and we hope you agree that contemporary sim scores deliver. We hope you’ll find our contemporary similarity scores Excel tool helpful and that it gives you more information to discuss great players with. That’s the game here: Not to induct the guys we think deserve it but to induct the guys that the weight of the evidence points to. This is another tool in our kit to help weigh the evidence, and it’s sharper than its predecessor. In the comments, let us know now or in the coming days and weeks what you learn from these sleek, modern new sim scores. Next week, we’ll share the contemporary sim score worksheet for pitchers.

Finally, Bill James leaves off his chapter on similarity scores in The Politics of Glory with a list of popular candidates and their most comparable player. I’ll do the same here with hitters who recently have or soon will come under scrutiny by the BBWAA and the Veterans Committee. Enjoy.

  • Bobby Abreu: Billy Williams* (963)
  • Albert Belle: Frank Howard (951)
  • Carlos Beltran: Cesar Cedeño (906)
  • Adrian Beltre: Scott Rolen* (900)
  • Barry Bonds*: Willie Mays* (841)
  • Ken Boyer*: Sal Bando* (972)
  • Joe Carter: Ruben Sierra (968)
  • Will Clark: Bill Terry* (928)
  • Bill Dahlen*: George Davis* (963)
  • Steve Garvey: Ron Fairly (951)
  • Todd Helton*: John Olerud (945)
  • Matt Holliday: Jose Canseco (956)
  • Andruw Jones*: Scott Rolen* (912)
  • Jeff Kent*: Julio Franco (923)
  • Don Mattingly: Kent Hrbek (973)
  • Joe Mauer: Bill Dickey* (945)
  • Fred McGriff: David Ortiz (950)
  • Mark McGwire*: Edgar Martinez* (939)
  • Minnie Miñoso*: Sherry Magee* (968)
  • Thurman Munson*: Buster Posey (947)
  • Dale Murphy: Al Oliver (973)
  • Tony Oliva: Mike Griffin (965)
  • David Ortiz: Fred McGriff (950)
  • Dave Parker: Chili Davis (931)
  • Manny Ramirez*: Frank Thomas* (920)
  • Alex Rodriguez: Eddie Collins* (916)
  • Scott Rolen*: Robin Ventura (916)
  • Jimmy Rollins: Bert Campaneris (950)
  • Gary Sheffield*: Sam Crawford* (872)
  • Sammy Sosa*: Dwight Evans* (939)
  • Harry Stovey: Dolph Camilli (960)
  • Ichiro Suzuki: Willie Davis* (913)
  • Chase Utley: Lou Whitaker* (914)
  • Omar Vizquel: Rabbit Maranville (961)
  • Lou Whitaker*: Ryne Sandberg* (965)
  • Maury Wills: Jose Reyes (944)
  • David Wright: Bob Elliott (934)


5 thoughts on “Introducing Contemporary Similarity Scores for Batters”

  1. Picture a sold-out auditorium, the audience on their feet, giving you a standing ovation. Bravo, gents. Bravo!

    Posted by BigKlu | July 13, 2020, 7:06 pm
    • Thanks BK. I have to admit, however, that the positional values went all cockeyed on me. I’m restoring them now and they should be corrected by late tomorrow.

      Posted by eric | July 13, 2020, 8:50 pm
      • OK, the problem is fixed! The link at the top should now take you to the corrected version. If you spot any mistakes, follow the airport protocol: If you see something, say something.

        Posted by eric | July 15, 2020, 6:04 pm
  2. I’m looking to do some similarity score comparisons, so I googled: “similarity scores” “positional adjustment”. Your post here came up in the results. Wow! This is so awesome how you generated a more accurate system over Bill James (and BBREF)’s system.

    Posted by Matt Maldre | January 6, 2021, 10:39 am
