Thursday, June 14, 2012

Advanced Stat All-Stars

The All-Star break is about a month away in baseball, and the season for voting for All-Stars has begun. Because fans do not have perfect memories, and what memories they have are bound to be selective and inaccurate, statistics are probably the best way to determine which players should be All-Stars and which players shouldn't. But what stats should we use? Since the advent of the sabermetric revolution in baseball, we understand that stats like batting average, RBI, and wins are much, MUCH less relevant in evaluating a player's performance than on base percentage, slugging percentage, and ERA+. And real baseball nerds know the value of stats like BABIP, ISO, and xFIP. But which stats should be used in evaluating All-Stars? And more generally, what do these stats mean? Not every stat should be used to determine everything. Here is my attempt to put some stats in context, and to evaluate who, at this moment, should be the baseball All-Stars.

The most traditional three statistics used to evaluate a hitter's performance are the Triple Crown categories: batting average, home runs, and RBI. To the old-time baseball fan, this was all that mattered in evaluating a hitter: how often the guy got a hit, how much power he had, and how many runs he produced for the team. For many reasons, chronicled by Bill James and his followers, these stats are not the most important to consider, especially batting average and RBI. The important non-weighted stats include on base percentage and slugging percentage. When evaluating how much a player contributes to your team's chances of winning, if you want to stick to simple, unweighted stats, you should look at OBP and slugging. There has been an unbelievable amount of ink spilled by thoughtful people about why that's the case.

There are also more advanced stats: stats that adjust for ballpark effects, differences in fielders' abilities, and the effectiveness of the outcome of each plate appearance. There are stats that measure specific things like power, and stats that try to be holistic. There are so many baseball stats that measure so many different things, it's sort of hard to figure out what's important in evaluating what. The baseball blog has a glossary of all the stats that laymen like us would ever want to use, and it is a great resource for these stats and evaluating their meaning. Generally, stats all do one thing: they evaluate past performance. Many stats, though not all, also do something more: they predict future performance.

I will make an argument for a few dynamite stats that I use in evaluating a hitter's performance. The best stat there is, in my opinion, is Weighted On Base Average, or wOBA. You can read about it more, but it essentially measures how much each way a player can get on base (a walk, HBP, single, double, triple, and home run, as well as stolen bases and caught stealing) contributes to that player's team scoring runs. So for instance, for the 2011 season, the average walk created 0.69 runs, the average triple created 1.60 runs, etc. So wOBA measures both how often a player got on base and how effective his way of getting on base was. OPS (on base percentage plus slugging percentage) attempts to do this, but by simply adding the two it treats a point of OBP and a point of slugging as equally valuable, which they are not.
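To make the idea concrete, here is a rough sketch of a wOBA-style calculation. Only the walk (0.69) and triple (1.60) weights come from the numbers above; every other weight is a placeholder I made up to fall in a sensible order, the two hitters are invented, and I divide by plain plate appearances and skip stolen bases entirely, which is a simplification of the real formula.

```python
# Run values per event. Only "bb" and "3b" come from the article (2011);
# the other weights are illustrative placeholders, not official figures.
WEIGHTS = {
    "bb": 0.69,   # walk, from the article
    "hbp": 0.72,  # hit by pitch, assumed
    "1b": 0.89,   # single, assumed
    "2b": 1.27,   # double, assumed
    "3b": 1.60,   # triple, from the article
    "hr": 2.09,   # home run, assumed
}


def woba(events, plate_appearances):
    """Weighted value of all on-base events, per plate appearance.

    Simplification: the denominator is raw plate appearances, and
    stolen bases / caught stealing are ignored.
    """
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    total = sum(WEIGHTS[kind] * count for kind, count in events.items())
    return total / plate_appearances


def times_on_base(events):
    """How many times the hitter reached base, ignoring how he did it."""
    return sum(events.values())


# Two invented hitters who each reach base 180 times in 600 trips.
slap_hitter = {"bb": 30, "1b": 150}
slugger = {"bb": 30, "1b": 90, "2b": 30, "hr": 30}

print(times_on_base(slap_hitter), times_on_base(slugger))
print(round(woba(slap_hitter, 600), 3), round(woba(slugger, 600), 3))
```

Both hitters get on base equally often, so a pure on-base measure can't tell them apart, but the weighted version gives the slugger a clearly higher number because his hits are worth more.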

To me, that's the best stat. I do look at a couple of other stats especially favorably, however. One is WAR (Wins Above Replacement). Different outlets use different formulas for calculating WAR, but I find that FanGraphs does it in a way that makes the most sense to me. This stat attempts to take into account not only a player's performance offensively, but also defensively and on the bases. It also places value on certain positions. For instance, since there are not that many good hitting catchers in the league, but there are many good hitting first basemen, a catcher with the same stats as a first baseman will have a higher WAR. This is a very inexact science, but the spirit behind it is dead-on, and the calculations do make sense to an idiot like me (maybe not a good thing).

And another stat I look at is home runs. Though this may seem antiquated, home runs mean two things to me. First, the total is the number of times the player did the best thing that he can do in any given plate appearance. That's interesting to know. And second, a guy who is a home run threat will tend to make pitchers pitch around him more, and thus make him more dangerous in more game situations. That's sort of my attempt to account for things that stats don't necessarily measure (or at least not yet, or not precisely).

Another very interesting stat is BABIP, or Batting Average on Balls In Play. The important thing to know about BABIP is that it is the most prominent stat that evaluates the future as much as it does the past. Very roughly, the average BABIP for a hitter or for a pitcher is .300. If a player has a BABIP of .400 over a period, he probably got lucky. And if a pitcher has a BABIP of .400 over a period, he probably got unlucky. As much as sabermetricians don't want to admit it, this is a predictor of future performance.
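For anyone who wants to see what goes into the number, here is a small sketch using the standard BABIP formula: hits minus home runs, divided by at bats minus strikeouts minus home runs plus sacrifice flies. The function name and the stat line in the example are made up for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play.

    Home runs and strikeouts are removed because the ball never
    reaches a fielder; sacrifice flies are added back because it does.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Invented stat line: 55 hits (10 of them homers) in 200 at bats, 40 strikeouts.
example = babip(hits=55, home_runs=10, at_bats=200, strikeouts=40)
print(round(example, 3))
```

With that invented line, 45 of the 150 balls put in play fell for hits, which lands right on the rough .300 league average mentioned above.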

It's not to say that if someone's average BABIP is .300 for his career, and he has a .200 BABIP in the first half of this season, that he's likely to have a .400 BABIP for the second half of the season. It's mistaken to look at it that way. What a .200 BABIP over half a season does show, however, is that a player's stats in the future are very likely to outperform the statistics he got with that .200 BABIP. The player is worse now than he is likely to be later.
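The arithmetic behind that point can be sketched quickly. This assumes, purely for illustration, 150 balls in play in each half and a second half that simply matches the career rate; the function name is my own.

```python
def full_season_babip(first_half, second_half, bip_first, bip_second):
    """Combine two half-season BABIPs, weighted by balls in play."""
    hits_on_balls_in_play = first_half * bip_first + second_half * bip_second
    return hits_on_balls_in_play / (bip_first + bip_second)


# Unlucky .200 first half, then a second half at the .300 career rate.
realistic = full_season_babip(0.200, 0.300, 150, 150)

# What it would take to "even out" to .300 for the year: a .400 second half.
evened_out = full_season_babip(0.200, 0.400, 150, 150)

print(round(realistic, 3), round(evened_out, 3))
```

The reasonable expectation is the first case: the second half looks like the career norm, so the full season ends up around .250, better than the first half but still dragged down by it. Nothing forces a .400 second half to balance the books.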

So back to our discussion of All-Stars. Which of these stats should be used in determining who is an All-Star? First, let me give my criteria for evaluating players to see if they should be All-Stars.

1. The players that help their team win the most should be All-Stars. This may seem obvious, but consider a player like Ichiro Suzuki. He is amazingly talented at what he does at the plate (or at least he used to be), but he rarely had the value of a less exciting player like Manny Ramirez or Hideki Matsui. If not for his stellar defense, Ichiro would not be a super valuable player. So I'm going for value over awe-inspiring.

2. I don't care how "lucky" he got. All-Star selection measures performance, not what performance should have been. If a guy has a .400 BABIP and his performance is likely to take a nosedive as time goes on, I don't care. Being an All-Star is about honoring what you've done so far in the season, not how much a smart team should pay you in your next contract. Some may say that the stats of a player with a high BABIP are misleading. They're not misleading; they're just unsustainable. They're not misleading because they happened. What happened is much more compelling than what should have happened, and what is likely to happen later is irrelevant to this exercise.

So, armed with our important stats, and knowing what's important, here are the All-Star teams. By the way, I'm only doing hitters (pitchers is a whole other discussion), and I'm doing two at each position (for the outfielders, I'll do one at each outfield position and then the next 3 regardless of position). I understand that guys who don't play one position specifically (like guys who DH half the time or something) will be shafted by this metric. So I'll just throw two or three of those guys in at the end. Fine.

American League:

Catcher: AJ Pierzynski (even though I hate him) and Matt Wieters

First Base: Paul Konerko, Prince Fielder

Second Base: Robinson Cano, Jason Kipnis

Third Base: Miguel Cabrera, Mike Moustakas

Shortstop: Asdrubal Cabrera, Elvis Andrus (screw you Jeter)

Left Field: Josh Willingham

Center Field: Adam Jones

Right Field: Jose Bautista

Other Outfield: Josh Hamilton, Mark Trumbo, Matt Joyce

Designated Hitter (the game is in an AL ballpark, KC): David Ortiz, Adam Dunn

Others: Joe Mauer, Mike Napoli, Edwin Encarnacion

National League:

Catcher: Carlos Ruiz, A.J. Ellis

First Base: Joey Votto, Bryan LaHair (weird)

Second Base: Dan Uggla, Jose Altuve

Third Base: David Wright, Chase Headley. IT'S HEADLEYYYYYYY

Shortstop: Jed Lowrie, Troy Tulowitzki

Left Field: Carlos Gonzalez

Center Field: Andrew McCutchen

Right Field: Carlos Beltran

Other Outfield: Ryan Braun, Giancarlo Stanton, actually Melky Cabrera

Others: Martin Prado, Michael Bourn, Matt Kemp

The squads will be bigger than this, so I would certainly name more guys to the real All-Star team. But there you have it. And guys like Melky Cabrera and Bryan LaHair are perfect examples of guys who we can predict will nosedive in the second half of the season. Their BABIPs are .408 and .398 respectively, 3rd and 4th in the NL and just waiting to collapse. Consistently great hitters, however, are also in this top echelon of BABIP. Joey Votto's BABIP leads the majors at .428 (GODDAMN), while Paul Konerko and David Wright are in the top six.

But there are your All-Star teams! That wasn't so hard, was it?


  1. What about Mike Trout? He's third in WAR despite not even having enough at bats to qualify.

  2. Good point. Once he has enough at bats to qualify he'll definitely be in there. You gotta play enough though. I understand that the fact that his WAR is so high is crazy given those facts, but you gotta play the games.
    This, of course, raises the question of why I list Matt Kemp at this time. I put him in because of his importance to the league. I suppose this argument could be made for Bryce Harper as well, and that doesn't fit into my criteria for picking All-Stars. But I think it is a little bit important, especially because Kemp was a breakout star last year and was the best player in the NL before his injury. Certainly kind of arbitrary, but perhaps the criteria could use some tweaking.

    1. Something this article doesn't touch upon, which makes sense, but I think is worth mentioning is the fact that this game has actual value to each league. There is a strategy for taking a guy like Matt Kemp, if healthy, even if his counting stats aren't as high as a guy like McCutchen because he's a better player. Additionally, you could make an argument for a pinch runner type (Borne or even someone like Dee Gordon comes to mind) and even a lefty reliever who specializes against lefties. Are these guys the best players or who the fans want to see? Probably not. But if your team is playing at home in game 7 because Eric O'Flaherty struck out Robinson Cano in the 8th inning of the All-Star game are you gonna complain?

    2. As you said, there is a reason I didn't deal with that in the article. I wanted to keep it very, very short. Oh. But there has been a tension since the All-Star game determined home field advantage in the World Series between picking the best players and making a well-balanced team. I personally am on the side of picking the best players and showcasing them, because that's really what the spirit of the All-Star game is. If choosing Dee Gordon means leaving off 1 very deserving player, I would be against that. The point of the All-Star game is, ultimately, to show everyone who the stars are. While the game does mean something, I think relatively little is lost by having all the best players, rather than having a well-balanced team. Not to say that having a well-balanced team isn't valid, but in my personal opinion, that's not what I'm after.

      P.S. It's spelled Bourn. He's on your team.