Archive for September 2009

The Greatest Team Ever


Any discussion of greatest teams is necessarily subjective.  As many objective measures are brought into the debate as possible, but there is little consensus on the appropriate measure of absolute greatness.  Yesterday the Yankees won their 102nd game, tying them for 11th on the list of winningest teams in Yankee history.  If they win out, which is unlikely given the rest they will be giving their best players, they would finish with 106 wins, tied for 5th in Yankee history with the team that I think is the best ever to play the game, the 1939 Yankees.

The 1939 Yankees went 106-45 in 151 games (losing three to cancellation).  Their 106 wins tie them for 15th all time, and their .702 winning percentage puts them in 10th place.  However, nearly half of the teams above them did not win the World Series:  the 1906 Cubs, 1931 Athletics, 1954 Indians, and 2001 Mariners.  (The 1902 Pirates played a year before the first World Series, which the Pirates lost to the Red Sox in 1903.)  That narrows consideration to the 1909 Pirates, the 1927 Yankees, the 1907 Cubs, and the 1998 Yankees.  The 1907 Cubs led the league in wins, ERA, and shutouts, but they were no more than an average hitting team.  To claim status as the best team ever, you need to excel in both areas.  The 1909 Pirates have the opposite problem.  They led the league in almost all hitting categories but trailed well behind the Cubs in pitching.  In fact, they won the league by only 6.5 games over the two-time defending champion Cubs.  They were certainly an outstanding team but not the greatest.

The 1998 Yankees have the third most victories in a single season.  The pitching staff led the league in ERA, complete games, and shutouts, while the hitters led in runs, RBIs, and walks.  In both cases you see superiority, but not total dominance.  They fielded two regulars who will be Hall of Famers, Jeter and Rivera, and one other possibility in Tim Raines.  They also cruised through the postseason, winning their three series 3-0, 4-2, and 4-0.  While this is an outstanding team, it is hard to call them the greatest ever.  They did not dominate as completely as the two teams ahead of them.

The 1927 Yankees are probably called the greatest ever more often than any other team.  They won the league by 19 games, slightly above the 17 game lead of the 1939 team and slightly behind the 22 game division lead of the 1998 team.  They led the league in all major hitting categories except doubles and stolen bases, while also leading the league in shutouts and ERA.  They fielded six Hall of Famers:  Ruth, Gehrig, Lazzeri, Combs, Pennock, and Hoyt.  They swept the World Series.

The 1939 Yankees were the crowning achievement of one of baseball’s great dynasties, the 1936-1939 Yankees.  They fielded five Hall of Famers, DiMaggio, Dickey, Ruffing, Gomez, and Gehrig, though they lost Gehrig after 8 games.  They led the league in runs, RBIs, slugging percentage, home runs, saves, complete games, shutouts, and ERA.  Their hitting was not as good as the 1927 team’s.  Ruth and Gehrig will do that to you.  Their pitching, in contrast, was substantially better than either the 1927 or 1998 versions.  The 1939 Yankees had a run differential (runs scored – runs allowed) of +411, better than 1927’s +376 or 1998’s +309, primarily because they gave up fewer runs than either of their competitors (1927: 599, 1939: 556, 1998: 656) while scoring about as many (1927: 975, 1939: 967, 1998: 965).
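For anyone who wants to check the run differential arithmetic, here is a quick Python sketch.  The runs scored and allowed are the totals quoted above; the variable and function names are just made up for illustration.

```python
# Runs scored and allowed for each team, as quoted in the post.
seasons = {
    1927: {"scored": 975, "allowed": 599},
    1939: {"scored": 967, "allowed": 556},
    1998: {"scored": 965, "allowed": 656},
}


def run_differential(scored, allowed):
    """Run differential is simply runs scored minus runs allowed."""
    return scored - allowed


differentials = {
    year: run_differential(**totals) for year, totals in seasons.items()
}

for year, diff in differentials.items():
    print(f"{year} Yankees: {diff:+d}")
```

The three teams scored within ten runs of each other, so nearly all of the gap in differential comes from the runs allowed column.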

None of these teams would be a bad pick as the best team ever.  However, I think that superior pitching gives the 1939 Yankees an ever so slight edge.


Introducing Statistics: On-Base Plus Slugging


Hitting can be broken down into two basic components:  the ability to get on base and the ability to hit for power.  These have been measured in different ways for years.  The oldest measure of hitting is the batting average, which is the rate at which a batter gets a hit (BA=Hits/At Bats).  It ignores times at the plate that result in walks or hit batsmen.  Though either outcome involves a batter getting on base, it is not done through a hit and thus is ignored.  In 1876, the first year of the National League, walks were recorded as outs and thus depressed batting average.  Batting average, it appears, has three major flaws:  It measures the rate of an occurrence, meaning that a batting average of .333 is the same whether we are talking about 1 hit in 3 at bats or 33 in 99.  It ignores other opportunities to get on base.  It also has no measure of how good a hit is.
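The first flaw is easy to see with a few lines of Python.  This is only a sketch; the function name is invented for illustration, and the two stat lines are the ones from the paragraph above.

```python
def batting_average(hits, at_bats):
    """BA = Hits / At Bats."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return hits / at_bats


small_sample = batting_average(1, 3)    # 1 hit in 3 at bats
large_sample = batting_average(33, 99)  # 33 hits in 99 at bats

# Both print as .333 even though one hitter has 32 more hits.
print(f"{small_sample:.3f} vs {large_sample:.3f}")
```

The rate is identical in both cases, which is exactly why the raw counts still need to be looked at alongside it.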

On-Base Percentage is an attempt to correct for the second of these errors.  OBP=(Hits+Walks+Hit Batsmen)/(At Bats+Walks+Hit Batsmen+Sacrifice Flies).  This formula brings nearly all plate appearances into the equation (in the denominator; sacrifice bunts are the main exclusion) instead of an arbitrarily limited number of at bats.  It also treats all times that a batter gets on base as valuable.  (It does ignore times a batter reaches on an error or on a fielder’s choice, probably because those are not considered products of the hitter’s skill.)   However, it weights all times on base equally, which seems intuitively mistaken.  A double is better than a walk for two reasons:  It gets a batter one base further, and it has a better chance of advancing preceding baserunners.  Nevertheless, OBP does correct one flaw of batting average.
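Here is a minimal Python sketch of the on-base percentage calculation, using the standard definition in which sacrifice flies count in the denominator and sacrifice bunts do not.  The function name and the sample stat line are hypothetical, not taken from any real player.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    if opportunities <= 0:
        raise ValueError("no plate appearances to measure")
    return times_on_base / opportunities


# Hypothetical hitter: 150 hits in 500 at bats (a .300 average),
# plus 60 walks, 5 hit by pitches, and 5 sacrifice flies.
sample_obp = on_base_percentage(
    hits=150, walks=60, hit_by_pitch=5, at_bats=500, sacrifice_flies=5
)
print(f"OBP: {sample_obp:.3f}")
```

The 65 walks and hit by pitches that batting average throws away are what lift this made-up hitter from .300 to an on-base percentage in the high .370s.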

Slugging percentage attempts to correct batting average’s third flaw.  SLG=(Total Bases/At Bats).  The denominator is the same as batting average’s, but the numerator is total bases, i.e. how many bases a batter gets from all of his hits combined.  Slugging percentage ignores other ways to get on base, focusing solely on the power of hits.  It has the advantage of treating a home run as more valuable than a single, but it still ignores the importance of getting on base without a hit.  To offset the flaws of OBP and SLG, they are added together to create OPS.

OPS=OBP+SLG.  This stat is probably the most visible sabermetric batting statistic.  It creates a scale that gives weight to all times on base and to the value of hits beyond singles.  However, it has the weakness of all rate stats.  It measures how often a batter does the things it counts, not how many times he does them.  While it might be useful to know that a batter reaches base 40% of the time, it also helps to know how many times he reaches base.  An OPS of 1.000 is outstanding, but an OPS of .990 in twice as many plate appearances is more valuable to a team.  Once again, health is important in measuring skill.  It is not sufficient to make an OPS of .500 as valuable as an OPS of 1.000 if the lesser player is twice as healthy.  But it is incredibly important in deciding tough cases.
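To put slugging percentage and OPS together, here is a rough Python sketch.  The helper names and the sample stat line are hypothetical, and the on-base formula assumes the standard definition with sacrifice flies (not bunts) in the denominator.

```python
def total_bases(singles, doubles, triples, home_runs):
    """One base per single, two per double, three per triple, four per homer."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def slugging_percentage(bases, at_bats):
    """SLG = Total Bases / At Bats."""
    return bases / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + walks + hit_by_pitch) / (
        at_bats + walks + hit_by_pitch + sacrifice_flies
    )


# Hypothetical hitter: 150 hits in 500 at bats, broken down as
# 100 singles, 30 doubles, 5 triples, and 15 home runs.
bases = total_bases(singles=100, doubles=30, triples=5, home_runs=15)
slg_value = slugging_percentage(bases, at_bats=500)
obp_value = on_base_percentage(
    hits=150, walks=60, hit_by_pitch=5, at_bats=500, sacrifice_flies=5
)
ops_value = obp_value + slg_value  # OPS = OBP + SLG

print(f"SLG {slg_value:.3f} + OBP {obp_value:.3f} = OPS {ops_value:.3f}")
```

Note that nothing in the result says how many plate appearances went into it, which is the rate-stat weakness described above.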

Sabathia’s Big Push


How much should late season performance factor into award voting?  On Saturday, C.C. Sabathia threw a seven-inning one-hitter to defeat the Red Sox and push the Yankees’ magic number to clinch the division to 1.  Will this last bit of dominance push Sabathia to the front of the Cy Young voting?  The Cy Young Predictor, developed by Bill James and Rob Neyer, puts Zack Greinke slightly ahead of Sabathia after Greinke’s 1-run shutdown of the Twins on Sunday.  If Sabathia wins one more game, he will be the only pitcher to reach 20 in the American League.

One important comparison would be Chipper Jones’ MVP in 1999.  In the first three months of the season, Jones hit 14 homers with a .291 batting average and 44 RBIs.  Over the last three months, in slightly fewer at bats, he hit 31 homers with 65 RBIs and a .349 batting average.  Jones had an exceptional last three months of the season.  He also had an outstanding year against the Mets, the chief rival of the Braves that season, hitting .400 with 7 home runs and 16 RBIs in 12 games, including dominating the Mets in September.  That same season, Larry Walker led the league in batting average, slugging percentage, and on-base percentage.  Walker, of course, played for the Colorado Rockies, and his numbers were consequently discounted.  He hit .461 at home and only .286 on the road.  Clearly, factors beyond simply on-field performance are important, including when and where you do what you do.

If you look at the numbers, Jones, like Sabathia, had a very good year.  His performance down the stretch, though, was certainly essential as he won the MVP nearly unanimously over Jeff Bagwell, Matt Williams, Greg Vaughn, and Mark McGwire.  Could the same happen for Sabathia?  His September ERA is 1.29, by far his lowest of any month.  He is also 4-0 with 35 strikeouts.  Greinke, of course, has a 0.55 ERA with 35 strikeouts and a 3-0 record in September, but he has done it with less fanfare.  My prediction?  If Sabathia wins 20, he will win the Cy Young award with voters referencing his exceptional performance down the stretch and how sad it was that he did not win last year in Milwaukee.  I would still support Greinke, but I would not be surprised to see him lose.

The Importance of Forgetting


Why is Mariano Rivera such a great closer?  He has no memory.  Few people in baseball history can have a finger pointed at them and people say, “You lost the World Series.”  Rivera is one of those unfortunate few.  He was already a great closer prior to the 2001 World Series, having set the record for consecutive scoreless innings in the postseason with 34 1/3.  Yet he blew the save in the 9th inning of game 7, coming in with a 2-1 lead and leaving after taking a 3-2 loss.  He gave up 3 hits, committed an error, and hit one batter.  It was a truly exceptional meltdown.  How did he respond?  By compiling 310 saves and a 1.95 ERA in the next 8 seasons, from ages 32-39 (i.e. after his prime should have ended).

It would have been easy for the 2001 Series to end his career.  The obvious comparison is Donnie Moore, who never recovered from his one bad pitch in the 1986 ALCS.  Rivera has since 2001 had good games and bad.  In 38 1/3 postseason innings since, he has allowed 3 total runs.  2001 is a distant memory, and his career has rolled on with only an injury-induced hiccup in the 2002 regular season.

This lesson applies across sports.  After Earnest Byner’s famous fumble with the Cleveland Browns in 1987, he went on to win a Super Bowl with the Redskins and make the 1990 and 1991 Pro Bowls. Given the weakness of the Bills’ pass defense and the strength of the Saints’ passing game, that kind of recovery is what the Bills need from Leodis McKelvin. His fumble in Week 1 was critical to that loss. However, he must follow the examples of Mariano Rivera and Earnest Byner, putting that fumble behind him to be a productive member of a taxed Bills secondary.

NL Cy Young Race


I have been focusing primarily on the Cy Young race in the American League up to this point, basically because I am an American League guy at heart.  However, I think the NL race is much tighter and consequently more interesting and worthy of discussion.  The race, at this point, seems to have three viable candidates and a host of other good pitchers that will be unfortunately lost.  To start with those left behind, Javier Vazquez, Dan Haren, and Josh Johnson are all having excellent years that are being lost in the discussion.  Vazquez and Haren rank 2nd and 3rd in strikeouts and are both in the top 10 in ERA.  By advanced stats, Vazquez is second in FIP and third in WHIP, while Haren leads the league in WHIP.  If the Braves make the playoffs, Vazquez might end up part of the vote for the Cy Young.  At the moment, he would be third on my ballot.  But let us turn to the three contenders getting the most attention, Tim Lincecum, Chris Carpenter, and Adam Wainwright.

Lincecum is leading the league in strikeouts, K/9, FIP, WAR (Wins above replacement, not likelihood to survive a 1-year tour in Iraq), etc.  He has the misfortune of playing for the offensively putrid Giants, which has hurt his won-loss record.  Nevertheless, it is still a respectable 14-7, though that is the worst of the three major candidates.  He is also second in ERA.

Carpenter has the best storyline, coming back from major surgery that caused him to miss nearly all of the 2007 season and most of the 2008 season after winning the Cy Young in 2005.  He is leading the league in ERA, third in fewest walks/9, second in fewest HR/9, third in FIP, and has a sterling 16-4 won-loss record, allowing him to lead the league in winning percentage.

Wainwright leads the league in wins with 18, is third in ERA (behind only Carpenter and Lincecum), 5th in strikeouts, and leads in innings pitched.  That last stat is important, because it means the Cardinals have the advantage of ignoring their bullpen when Wainwright is pitching more often than any other team with any other pitcher in the National League.

So given these three cases, how should the vote turn out?  Carpenter’s story, though inspiring, should be ignored.  His ERA is a bigger mark in his favor, but he has pitched substantially fewer innings than any of the other candidates.  Though he has probably been the best pitcher in the National League since he returned from the disabled list, it is by a small enough margin that his time on the DL outweighs his later contributions.  Health matters.  Despite the fact that Lincecum pitches for a weaker hitting team, Wainwright ranks higher in tough losses (losses in games in which a pitcher went at least six innings and gave up 3 or fewer runs).  However, in the end I think that Lincecum’s lead in strikeouts overcomes his deficit in wins.  But I could easily be talked out of that choice.

My Ballot:

1.   Tim Lincecum

2.   Adam Wainwright

3.  Javier Vazquez

AL Cy Young Wrap-Up


The Cy Young ballot is unusual among baseball awards.  You only get to vote for three players.  The MVP ballot and the Hall of Fame ballot both run 10 players deep, but for some reason the Cy Young is limited to three.  I think five pitchers have a legitimate shot to be part of that 3 on the AL side:  Zack Greinke, CC Sabathia, Felix Hernandez, Justin Verlander, and Mariano Rivera.  Scott Feldman had a chance, but his last start killed his ERA and allowed Hernandez to pass him for the league-lead in winning percentage and tie for second in the league in wins.  He should no longer get a vote.

Sabathia leads the league in wins, but I think he is the weakest candidate of the five.  That does not mean that he won’t win the award in November, but he should not.  Sabathia’s ERA is slightly lower than that of Verlander, who has the highest of the five candidates listed above, but Sabathia has done it with 70 fewer strikeouts for a notably superior team.  Verlander leads the league in K/9 and is second in FIP, the stat introduced yesterday.  He is hurt by playing in front of a poor fielding team in Detroit.  Greinke leads the league in FIP, ERA, fewest hits, complete games, HR/9, etc. He is having the most dominant season in the majors. His only problem is a lack of wins. He has won 15 to Sabathia’s 18, and Verlander and Hernandez’s 17. That is what pitching for Kansas City will do to you.

That leaves Rivera. I have already argued that Rivera might win the Cy Young. It is difficult to predict when a reliever will pick up a Cy Young, but they do it with regularity. Rivera is the best closer in the American League this year. If relievers are added in, Rivera suddenly leads the league in ERA and pulls ahead of Greinke in K/9, BB/9, and K/BB. However, he has only pitched 62 innings, fewer than any reliever to have won a Cy Young. I don’t think he will or should win, but it would not be a travesty if he nudges past Greinke for the award.

My ballot:
1. Zack Greinke
2. Mariano Rivera
3. Justin Verlander

Introducing Statistics: Fielding Independent Pitching


I would like to introduce an advanced stat today.  Given all of the award talk and Hall of Fame talk in the last week on this site, it seems useful to introduce statistics by which players can be evaluated.  The stat for today, fielding independent pitching (FIP), is one such stat used to isolate the performance of pitchers.  It starts from a very simple problem and then attempts to resolve it as simply as possible.  How can pitching and fielding be separated?

The statistician Voros McCracken began work on this problem in the late 1990’s.  Start with this observation:  it was easier to pitch for the St. Louis Cardinals in the 1980’s with Ozzie Smith behind you than to pitch for the Cardinals in 2000 when supported by Edgar Renteria.  This has nothing to do with the inherent talent of the pitcher and is instead completely dependent on the talent of his shortstop.  The influence of parks on pitchers and hitters has been studied for a long time, and park factors were developed in an attempt to isolate them.  The impact of fielding, though always known intuitively, has gotten its best study only recently.  McCracken argued that pitchers have minimal control over the percentage of balls hit into play that are turned into outs.  Instead, pitchers seem to control five factors:  walks, strikeouts, home runs, hit batsmen, and intentional walks.  These factors correlate highly from year to year, not changing dramatically if a pitcher switches teams or if a team switches important defenders.  In contrast, hits vary wildly depending on factors outside a pitcher’s control, and as a corollary so does earned run average.  To account for this, McCracken created a stat called defense independent ERA.

If you click through the link, you see that dERA is brutally complicated. Because of this, two other sabermetricians, Clay Dreslough and Tom Tango, developed simpler formulas for what is now called FIP. (FIP is the name Tango gave his stat; Dreslough’s nearly identical stat was called DICE.) These two authors simplified the formula to this:

FIP=(13HR + 3BB – 2K)/IP

Since this original formula, a 3.2 has been added to the end to convert this number into something that more nearly resembles ERA.  The second term in the numerator has also been updated to 3(BB+HBP).
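As a rough illustration, the updated formula can be written out in a few lines of Python.  The function name, parameter names, and sample stat line are all hypothetical, and the sketch assumes innings pitched is given as a true decimal (220 1/3 innings would be about 220.33, not the box score shorthand 220.1).

```python
def fip(home_runs, walks, hit_batsmen, strikeouts, innings_pitched,
        constant=3.2):
    """FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + constant.

    The default constant of 3.2 is the value mentioned in the post.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    numerator = 13 * home_runs + 3 * (walks + hit_batsmen) - 2 * strikeouts
    return numerator / innings_pitched + constant


# Hypothetical pitcher: 15 HR, 50 BB, 5 HBP, and 200 K in 220 innings.
sample_fip = fip(home_runs=15, walks=50, hit_batsmen=5,
                 strikeouts=200, innings_pitched=220)
print(f"FIP: {sample_fip:.2f}")
```

Hits and runs allowed never appear in the calculation.  Only the outcomes the pitcher is assumed to control go in, which is the whole point of the stat.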

What does FIP tell you about this season?  Zack Greinke and Tim Lincecum are the best pitchers in baseball.  Incorporating relievers changes this, pushing Jonathan Broxton and Brian Wilson to the forefront.  As this should tell you, FIP relies heavily on strikeout rates and will always push high strikeout pitchers to the leaderboard.  Groundball pitchers, in turn, are much more dependent on having reliable defense behind them.