The Bill James Handbook 2020
Ebook, 1,092 pages, 3 hours

About this ebook

The first-to-market, most comprehensive, insightful, and groundbreaking annual baseball book on the market. A must-have book or gift for every true fan, with lifetime statistics and leader boards for every player in the major leagues and projections for how they might do in the future.
Language: English
Publisher: ACTA Sports
Release date: Nov 22, 2019
ISBN: 9780879460266
Author

Bill James

Bill James made his mark in the 1970s and 1980s with his Baseball Abstracts. He has been tearing down preconceived notions about America’s national pastime ever since. He is currently the Senior Advisor on Baseball Operations for the Boston Red Sox, as well as the author of The Man from the Train. James lives in Lawrence, Kansas, with his wife, Susan McCarthy, and three children.

    Book preview

    Acknowledgments

    Introduction

    Yet another amazing season of baseball will soon be behind us and written into the record books, a season celebrated as the 150th anniversary of Major League Baseball’s first organization, the Cincinnati Red Stockings. For the first time in league history, four teams won at least 100 games (the Houston Astros, Los Angeles Dodgers, Minnesota Twins and New York Yankees); unfortunately for the Reds, they were not one of them.

    On September 11, Jonathan Villar of the Baltimore Orioles crushed a 443-foot home run that was the 6,106th league-wide home run, breaking the previous record set way back in 2017. By the end of the regular season, the final record-setting total was 6,776. Although quite productive, that’s a lot of plate appearances without putting the ball in play.

    Additionally, for the 12th straight year, the league set a new total season strikeout record with over 42,000. Now, that is a huge number of plate appearances without getting the ball in play.

    Attendance at the ballparks continued its decline, dropping another 1.7% from the 69.6 million in 2018 to 68.5 million this season. However, the league played its first-ever games in Europe at London Stadium between the Yankees and the Boston Red Sox, along with a game between the Kansas City Royals and the Detroit Tigers in Omaha, the first MLB game ever played in the state of Nebraska.

    How much will a change of venue really impact a fairly steady decade-long decline? More importantly, what other alternative options may be available? Well, a little later in the Handbook, Bill James is prepared to share with you 50 ways to stop baseball from being swallowed up by home runs and strikeouts.

    Several readers of last year’s Handbook had questions regarding the changes made to the register portion of the Handbook. Specifically, what are the new guidelines used for minor league (MiLB) stat-line inclusion?

    First, all top-25 prospects have their MiLB stats, regardless of whether they played in MLB this past season. That means top prospects, such as Tampa Bay Rays shortstop Wander Franco and Los Angeles Angels outfielder Jo Adell, are in the Handbook register despite not playing in the majors.

    Second, all other top-100 prospects will have their MiLB stats shown provided they played in the majors. If a player graduated from the top-100 list this year, they will have their minor league stats displayed. That means players such as New York Mets first baseman Pete Alonso and Los Angeles Dodgers shortstop Gavin Lux have their minor league history included. If someone graduated from the prospect lists last year or prior, they will not have their MiLB stats shown unless they played in the minors this year. This includes a player such as Atlanta Braves outfielder Ronald Acuna Jr.

    Next, all MLB players will have their 2019 MiLB stats shown, provided they have batter stat lines from at least 10 games or pitcher stat lines from at least five games. This prevents short stints in the minors (mostly rehab assignments) from being shown.

    Fourth, a player will never have more than five years of MiLB stats shown in the register.

    Finally, if a player plays at multiple levels below AA in the same year, then those stat lines are combined and shown as the level Low.

    Bill James and the rest of our advisory board all agreed that these guidelines ultimately improve the Handbook, allowing us to add new stats and analysis, along with more written research and opinion pieces by some of the industry’s best minds.

    As an additional resource to you, our valued reader, we have added a free-to-the-public stat area for billjamesonline.com providing over 50 different batting, pitching, team and league profiles. The annual league-level hitter and pitcher analysis that had previously been printed in the book is now available in the same public stat area, along with new individual player search capability.

    We hope that you enjoy all that this year’s Handbook has to offer.

    Rob Dougherty

    Coplay, PA

    October 8, 2019

    Who Would the Public Want To Go Into the Hall of Fame

    Bill James

    This article is not about who deserves to go into the Hall of Fame; I have written about that dozens of times. This is not about who meets the standards of a Hall of Famer, nor is it about whether those standards should be higher or lower or Mr. In-Between. It is not about what the experts think, nor is it a poll of sportswriters. It is not about WHY the public likes one guy and doesn’t like another guy, beyond generalizing about that to learn something from the study. It is simply about who the public would most like to see go in.

    Where do you go, to find the public? I went to Twitter. There are other publics. You can go to a public park and ask the people who are hanging out there who they think should be in the Hall of Fame. You can go to Buffalo Wild Wings. You can go to a public beach and cross-examine beautiful women in skimpy bikinis about who they want to get into the Hall of Fame, but I’ve found that you can frequently get arrested doing that; shoot, I’ll probably get arrested just for making a joke about it. There’s Publix Grocery Stores in Florida. Maybe they have them other places, too, I don’t know; I just see them in Florida. I have found that if you ask people at Kauffman Stadium who should be in the Hall of Fame, they talk about Frank White and Dan the Man Quisenberry.

    My point is that there are problems with any public you want to poll, including Twitter. Twitter gets you access to a lot of people, and Twitter has a really neat polling function, which is what I used, granting that Twitter is not an absolutely perfect source to poll the public, but it’s what we’ve got.

    I started by drawing up a list of everyone that I thought would have a reasonable Hall of Fame case, a reasonable dossier to present, Harold Baines-like, to the powers that be. I drew up a list of about 130 names, and then I went to my on-line readers (Bill James Online), and asked them to submit anyone else that they thought should be included. Eventually I got to 156 names. I tried to be all-inclusive, except that I didn’t include anybody like Mike Trout or Ichiro or Pujols or Cabrera who is just TOO obvious; I didn’t include them because if they were in a four-person poll, somebody in the poll would probably get 0%, and I had to mathematically analyze the results, and math hates zeroes. When you are studying ratios, extreme ratios are a pain in the slide rule.

    I didn’t intend to include anyone who didn’t have a legitimate argument to be a Hall of Famer, and I didn’t include anyone for whom I didn’t think there was a legitimate argument that he WASN’T a Hall of Famer. In retrospect, I made some mistakes on both sides. I included some people who I wish that I had not included, because they have almost no public support as candidates, and I included a few people, like Adrian Beltre and Justin Verlander, who are just too obvious; the polling system would have worked better without them. But I didn’t know, until I did the work, how one guy would poll as opposed to another.

    I polled each candidate six times, each time comparing his Hall of Fame support to that of three other candidates. That makes 18 position points for each candidate. With 156 players being polled and four candidates in each poll, there were 39 polls in each round of voting. With six rounds of voting, that makes 234 polls.

    Beginning on June 24, 2019 and ending September 25, I ran 234 Twitter polls, asking people which of these four men they thought was MOST deserving of being in the Hall of Fame. I found that it was virtually impossible to get many people to focus on THAT question rather than the other question that people more often debate, which is WHETHER a particular player is or is not worthy of the Hall of Fame. The usual Hall of Fame debate is framed as “Is Fred Sluggerewski worthy of being in the Hall of Fame?” I was trying to get people NOT to ask that question and not to think about that question. I was trying to explain that this is not about what the standard is or where the line should be or where you think anyone stands with regard to those standards. This is just about who is better qualified than who. Whether you are talking about Willie Mays, Honus Wagner, Hank Aaron and Walter Johnson or Mark Whiten, Mark Loretta, Mark DeRosa and Mark Hendrickson, one of the four must still be better qualified than the others. What I am trying to get you to tell me is, “Which one?”

    I found that it was basically impossible to get people to address the question I was asking. If a poll had two or three strong candidates in there I would get about 1600 responses, generally, whereas if a poll did not have strong candidates I might struggle to get the 800 that I felt that I needed, because a lot of people wouldn’t vote if they didn’t feel that any of the four candidates was qualified, and also, I’d get 20 responses to every poll telling me that none of these people was qualified and I should have a None of the above option, which, you know…. I didn’t suggest that any of the four WAS qualified, and I really don’t have any interest in how many of you think none of the four was qualified; it doesn’t have ANY relevance to what I was trying to study, which is who should be behind who else in line.

    For the 234 polls I got a total of 289,416 votes, or 1,237 votes per poll. The average player in the average poll got 309 votes, and I was able to get the 800 votes that I felt were needed in all but 19 of the 234 polls, and was usually pretty close in the rest. I appreciate you all voting, if you did. Also, there were the fun people who would write back Tony Oliva when I was trying to poll four other outfielders. I asked people to cut that out, and, when they persisted, I started blocking anybody who did that, just because it was annoying, rude, and non-responsive.

    Each poll was then analyzed by a set of 12 formulas, which had to be hand-entered, so that’s a total of 2,808 formulas that had to be hand-entered. It was a long process and a lot of work, is my point, and I got pretty cranky in there because I don’t usually work that hard. I think it was, by far, the largest and most extensive study ever of who the public WANTS to get into the Hall of Fame. That was my goal, anyway.
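    The totals quoted in the last few paragraphs can be re-derived from the raw counts given in the text. The short check below does only that; the variable names are made up for illustration and are not taken from the study.

```python
# Raw counts stated in the text.
CANDIDATES = 156
PER_POLL = 4
ROUNDS = 6
TOTAL_VOTES = 289_416
FORMULAS_PER_POLL = 12

polls_per_round = CANDIDATES // PER_POLL           # 39 polls in each round
total_polls = polls_per_round * ROUNDS             # 234 polls overall
position_points = ROUNDS * (PER_POLL - 1)          # 18 comparisons per candidate
votes_per_poll = TOTAL_VOTES / total_polls         # about 1,237
votes_per_player = votes_per_poll / PER_POLL       # about 309
formulas_entered = FORMULAS_PER_POLL * total_polls # 2,808

print(polls_per_round, total_polls, position_points)
print(round(votes_per_poll), round(votes_per_player), formulas_entered)
```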

    We’ll use Jim Edmonds to illustrate the process. In the first round, which was organized by alphabetical order, Jim Edmonds was polled against Carlos Delgado, Negro League star pitcher John Donaldson, and 1947 National League MVP Bob Elliott. D’s and E’s. In the second poll Edmonds, born 6-27-1970, was polled against Sammy Sosa (born 11-12-1968), Gary Sheffield (11-18-1968), and Juan Gonzalez (10-20-1969). Players were polled in the second round in order of their birth, so that each man was polled against contemporary stars.

    In the third round, Edmonds was polled against three other center fielders—Fred Lynn, Chet Lemon, and Bernie Williams. In the fourth, the goal was to poll candidates against other players from the same franchise, although this didn’t quite work in Edmonds’ case, because the number of players from a franchise doesn’t always neatly divide by four, so you have to do the best you can. In the fourth round Edmonds was polled against Cardinal third baseman Ken Boyer, but also 1920s St. Louis Browns’ outfielder Ken Williams and San Francisco first baseman Will Clark. Clark was actually Edmonds’ teammate at the end of Clark’s career, although that wasn’t why he was in the poll; that just worked out that way.

    In the fifth round players were placed in polls based on their career WAR (Baseball Reference WAR); Edmonds, with 60.4 WAR, was polled against Bobby Abreu (60 WAR), Andy Pettitte (60.2 WAR), and Negro League star John Beckwith, who has no WAR number, so we penciled him in at 60; he had to go somewhere. In the sixth and final poll, players were placed in groups based on their support in the previous five polls. Edmonds, with a previous Support Score of 124, was placed in a competition with Luis Tiant (121), Kenny Lofton (123), and Dave Parker (125). Also, there was a rule that no two players could be polled against one another more than once.

    The idea was to measure the strength of Jim Edmonds’ support—and every other player’s support—by contrasting the support for each player with the support for 18 other players. Each of the 156 players was initially assigned a Support Score of 100, and then those 100 points were re-assigned, again and again and again, literally hundreds of times, until the data completely stopped moving, so that recalculating the results a thousand more times would not move anyone up or down by a point, nor by a hundredth of a point. For example, in the first poll, Jim Edmonds got 53% of the vote, whereas Carlos Delgado got 34%, John Donaldson 9%, and Bob Elliott got 4%.

    At the start, Jim Edmonds would have 100 points, and Carlos Delgado would have 100 points. 200 points total. Edmonds beat Delgado, 53-34. If you split those 200 points in the ratio of 53 to 34, it works out to 122 points for Edmonds, 78 for Delgado, 122-78 being more or less the same ratio as 53-34.

    Edmonds beat John Donaldson 53-9, so those 200 points are split 171-29, and Edmonds beat Elliott 53-4, so those 200 points are split 186-14. I am showing those numbers as integers, although of course I did not use integers in the calculations.
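    The first-pass splits described above are simple proportional divisions of a 200-point pool. A minimal sketch, with `split_points` as a hypothetical helper name, reproduces the three rounded results from the first poll:

```python
def split_points(share_a, share_b, pool=200.0):
    """Divide a shared pool of points in the ratio share_a : share_b."""
    points_a = pool * share_a / (share_a + share_b)
    return points_a, pool - points_a

# Edmonds drew 53% in the first poll; his three rivals drew 34%, 9% and 4%.
for rival, rival_share in [("Delgado", 34), ("Donaldson", 9), ("Elliott", 4)]:
    edmonds_pts, rival_pts = split_points(53, rival_share)
    print(f"Edmonds {edmonds_pts:.0f} - {rival} {rival_pts:.0f}")
```

    Only the vote shares of the two players being compared enter each split, so the other two names in the poll have no effect on that pair.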

    As Edmonds is later compared to 15 other players, however, so are Delgado, Donaldson and Elliott. It turns out that Donaldson and Elliott don’t have very much support when they are compared to anybody. Bob Elliott—who was a really fine player in the 1930s and 1940s—got 4% of the vote in the first round, 4% in the second round, 9% in the third round, 4% in the fourth round, 2% in the fifth round, and only 18% in the sixth and final round, when he was being polled against three other players who had performed as badly in previous votes as he had. Elliott had almost no support in the voting, with a final Support Score of 8, with the average being 100.

    By the end of the process, Edmonds’ first-round results are evaluated in this way:

    In other words, Edmonds is stronger than Delgado by a ratio of 53-34, stronger than Donaldson by a ratio of 53-9, and stronger than Elliott by a ratio of 53-4. As a result of that, Edmonds has three position points resulting from that poll, which are 109, 117 and 118. A position point is something that tells us where the player stands in relation to other players.

    In the second round of the voting, Edmonds was matched against much stronger competition—Sammy Sosa, Gary Sheffield and Juan Gonzalez. He finished second in that group, getting 28% to Sheffield’s 39%, with Sosa getting 27% and Juan Gone only 6%. That poll is evaluated in the following way:

    Edmonds is weaker than Sheffield by a ratio of 39-28, but stronger than Sosa by a ratio of 28-27, and stronger than Gonzalez by a ratio of 28-6. His three position points from the second poll are 107, 113, and 128.
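    The article does not print its formulas, so the following is only one plausible reading of the process, not the author's actual worksheet. It assumes that a position point is the two players' combined current scores split in the ratio of their vote shares, that a player's new score is the average of his position points, and that scores are rescaled after each pass so the overall average stays at 100. That reading is at least consistent with one figure in the text: final scores of 119 and 8 split 53 to 4 give roughly 118. The function name, tolerance and two-poll data set are illustrative assumptions; with only two of the 234 polls, the resulting scores will not match the published ones.

```python
from itertools import combinations

def support_scores(polls, tol=1e-10, max_rounds=10_000):
    """Iterate pairwise point splits until the scores stop moving (assumed method)."""
    players = sorted({name for poll in polls for name in poll})
    scores = {name: 100.0 for name in players}
    for _ in range(max_rounds):
        points = {name: [] for name in players}
        for poll in polls:
            for a, b in combinations(poll, 2):
                pool = scores[a] + scores[b]            # combined current scores
                share_a = poll[a] / (poll[a] + poll[b]) # head-to-head vote ratio
                points[a].append(pool * share_a)
                points[b].append(pool * (1 - share_a))
        new = {n: sum(p) / len(p) for n, p in points.items()}
        mean = sum(new.values()) / len(new)
        new = {n: s * 100 / mean for n, s in new.items()}  # keep the average at 100
        delta = max(abs(new[n] - scores[n]) for n in players)
        scores = new
        if delta < tol:
            break
    return scores

# Vote percentages from the two polls described in the text.
POLLS = [
    {"Edmonds": 53, "Delgado": 34, "Donaldson": 9, "Elliott": 4},
    {"Edmonds": 28, "Sheffield": 39, "Sosa": 27, "Gonzalez": 6},
]
scores = support_scores(POLLS)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {score:6.1f}")
```

    With so little data the scores settle at values exactly proportional to the vote shares within each poll. In the full study, each player's 18 comparisons pull in different directions, and the fixed point is a compromise among them.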

    By the end of the process, Edmonds has 18 position points, which are:

    From Poll 1, relative to Carlos Delgado: 109
    From Poll 1, relative to John Donaldson: 117
    From Poll 1, relative to Bob Elliott: 118
    From Poll 2, relative to Gary Sheffield: 107
    From Poll 2, relative to Sammy Sosa: 113
    From Poll 2, relative to Juan Gonzalez: 128
    From Poll 3, relative to Chet Lemon: 120
    From Poll 3, relative to Fred Lynn: 118
    From Poll 3, relative to Bernie Williams: 125
    From Poll 4, relative to Ken Boyer: 127
    From Poll 4, relative to Will Clark: 126
    From Poll 4, relative to Ken Williams: 123
    From Poll 5, relative to Bobby Abreu: 125
    From Poll 5, relative to John Beckwith: 106
    From Poll 5, relative to Andy Pettitte: 126
    From Poll 6, relative to Luis Tiant: 118
    From Poll 6, relative to Kenny Lofton: 110
    From Poll 6, relative to Dave Parker: 104

    The average of these 18 position points appears to be 118, but if you save all of the decimals and make a little adjustment at the end of each round of calculations so that the average of all 156 players remains at 100.000, then it works out to 119. Edmonds’ Support Score is 119. I identified these here as polls 1, 2, 3, 4, 5 and 6, which are not the numbers of the polls but the numbers of the voting rounds; these were actually polls 10, 69, 101, 152, 184 and 226.
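    As a quick check on the rounded figures, averaging the 18 integer position points listed for Edmonds gives a value just under 118. The gap to the published 119 comes from the unrounded decimals and the rescaling step, neither of which can be reproduced from the printed integers.

```python
from statistics import mean

# Edmonds' 18 position points as printed, in round order.
edmonds_points = [
    109, 117, 118,  # round 1
    107, 113, 128,  # round 2
    120, 118, 125,  # round 3
    127, 126, 123,  # round 4
    125, 106, 126,  # round 5
    118, 110, 104,  # round 6
]

average = mean(edmonds_points)
print(f"{len(edmonds_points)} points, average {average:.2f}")
```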

    Let me try again to explain this. People kept telling me that I needed to include a None of the above option. But that option would address the issue of WHETHER the respondent thought that one of these players should be in the Hall of Fame. I’m not studying whether people think that one of these players or any of these players or all of these players should be in the Hall of Fame. That’s the way that most people frame a similar question most of the time, but I was studying a different question. Understanding that, it is no more logical to include a none of the above option than it would be to include an all of the above option, or a two of the above option, or a three of the above option. Each one of THOSE inputs is exactly as relevant as the others—but the structure of a Twitter poll does not allow you to present ALL of those options, so it would be entirely improper for me to include any of them—quite apart from the fact that that’s not what I am studying. If I had asked that question, I would have been (a) confusing the issue as to what it was that I was studying, and (b) simultaneously introducing a bias into the data—a bias toward artificially low estimates, since the respondent would have had the opportunity to say that NONE of these players should be in the Hall of Fame, but no opportunity to say that two of them or three of them or all four of them should be in.

    The other theme of the study was that after every poll in which somebody’s favorite player from the 1960s or the 1970s didn’t do well, I would be hit with a barrage of complaints about recency bias. Of course there IS some recency bias in a Twitter poll, but (a) many of the respondents grossly exaggerated how large that bias was, and (b) my point here is that I was studying WHO the public most wanted to see go into the Hall of Fame—not WHY they wanted them to go in.

    Every form of information may be represented as a bias. If Cardinal fans vote for Jim Edmonds and Yankee fans vote for Bernie Williams, that may be represented as a competition of biases in favor of the Cardinals or the Yankees. When I poll a center fielder against a second baseman, you may have a bias for fielding or a bias for hitting or a bias for baserunning. If you rely on WAR numbers to decide who you would support, that may be a bias toward WAR. We are not testing pure knowledge; we are testing biases.

    The theory of the study was to change the alignments repeatedly, pitting players against random players (alphabetically), against their contemporaries, against players from the same team, against players who played the same position, etc., so that the biases are rotated and the support for each player is tested in competition with multiple biases. I’ll return to this subject in a moment.

    I compare the process of measuring the Support for each player to estimating the number of gallons of water in a lake. You can’t measure the size of a lake simply by measuring how wide it is and how long it is and how deep it is. If a lake is 20 miles wide at the widest spot, it could have an average width of 18 miles, or an average width of 3 miles. If a lake is 100 feet deep at the deepest spot, it could have an average depth of 95 feet, or of 10 feet. In order to estimate the size of the lake, you have to take many, many different measurements, measuring from one shore to the other at many different points.

    This problem is like that. A player may have very wide support, but very shallow support—meaning that a lot of people like this guy, but that his support disappears when he is compared to other strong candidates—or he could have narrow but deep support, meaning that only a small number of people will vote for him but they will vote for him no matter who else is in the group. To really see how much support the player has, you have to measure it again and again, comparing one player to as many other players as you realistically can.

    It doesn’t exactly matter whether a player wins the poll or not. You can win a poll but get a number that hurts your overall score, if you don’t meet your expectations for the poll, or you can finish last in a poll against strong competition but still get position points from that poll that help you overall. But there were 234 polls, of which:

    54 were won by the player who was listed first on the ballot,

    53 were won by the player who was listed second,

    60 were won by the player who was listed third, and

    67 were won by the player who was listed fourth.

    213 were won by the player who had the highest overall Support Score at the end of the process,

    126 were won by the player who had the highest WAR, and

    110 were won by the most recent player of the four.

    The 213 figure is misleading, because the outcome of the poll is an input into the Support Score. There were a substantial number of cases—I believe about 25—in which the player who won the poll also had the highest Support Score, but only because he won that poll; if we ignored the results of that poll, he would not have had the highest Support Score. So the number there, 213, should be more like 185 or 190. To figure out exactly how many times that happened would be a huge project.

    A little over half of the polls were won by the player who had the highest WAR, both because many people rely on WAR to decide who they like best, and because WAR correlates with other forms of evaluation. Even if no one had access to the WAR calculations except the man who does them, WAR would still tend to indicate who would win the poll.

    I identified the most recent player as the player who has played in the majors in the most recent season or, if two players retired in the same season, as the player who was born most recently. 110 of the 234 polls were won by the most recent player, whereas, at random, this number would be 58 or 59.
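    The comparison against chance rests on a small amount of arithmetic: with four candidates per poll, any single attribute assigned at random would pick the winner in a quarter of the 234 polls. The sketch below recomputes that baseline and the shares quoted above; the variable names are illustrative only.

```python
TOTAL_POLLS = 234

# Winners by ballot position (first through fourth), as listed in the text.
wins_by_ballot_slot = [54, 53, 60, 67]
slots_total = sum(wins_by_ballot_slot)     # should account for every poll

expected_at_random = TOTAL_POLLS / 4       # 58.5, i.e. "58 or 59"
share_highest_war = 126 / TOTAL_POLLS      # a little over half
share_most_recent = 110 / TOTAL_POLLS      # a little under half

print(slots_total, expected_at_random)
print(f"highest WAR won {share_highest_war:.1%}, most recent won {share_most_recent:.1%}")
```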

    Some of that discrepancy is explained by a recency bias on the part of the voters—but by no means all of it. The Most Recent Players won many of the polls because, in a disproportionate number of the polls, the most recent player actually was the best candidate in the group.

    There are two reasons for that. The larger reason is that, in looking for candidates from the 1930s, the 1950s, the 1960s or 1970s, what I had to choose from was the ones who have been left out—the Babe Hermans and Bucky Walterses and Bob Johnsons, from the 1930s, and the Ken Boyers and Rusty Staubs and Jim Kaats from the 1960s. Those guys were tremendous players, but they’re not the top players from their generation; they’re not the Bob Gibsons and Sandy Koufaxes and Joe DiMaggios and Jimmie Foxxes and Roberto Clementes. Those guys have already been taken off the table.

    The most recent generations of players are much less thoroughly picked over by repeated elections—thus, they tend to be stronger candidates. If Todd Helton wins a poll over Gil Hodges, old-timers will scream about recency bias—but Todd Helton, by any reasonable standard, was a greater player than Gil Hodges. If Scott Rolen wins a poll over Bill Madlock, someone will yell about recency bias—but Scott Rolen was twice the player that Bill Madlock was, honestly.

    The second reason that this happened was that, in the second round of voting, when players were paired with other players based on when they were born, the BEST player would almost always outlast the others who were born at the same time by at least a year or two—thus, the best player would also be the most recent player.

    The most recent player won the poll 110 times, but the most recent player also had the highest WAR of any of the four players 84 times. Again, at random, that would only have been 58 or 59 times. There were a remarkable number of times in the voting when the most recent player was also the player in the group with the highest WAR, and yet that player did not win the poll. In Poll #16, Todd Helton was both the most recent player in the group and the player with the highest WAR, but he did not win the poll. In Poll #50, Vada Pinson was both the most recent player in the group and the player with the highest WAR, but he finished a distant third in the voting, getting only 16% of the vote. In Poll #83, Keith Hernandez was both the most recent player and the player with the most WAR, but he finished third in the poll.

    In both polls 88 and 89, the four candidates finished in exactly reverse order of the recency bias, with the most recent player drawing the least support, and the least recent player drawing the most support. In Polls #119 and #187, Andruw Jones was both the most recent player and the player with the most WAR, but did not win the poll either time. In Polls #138 and #171, Torii Hunter both times was both the player with the most WAR and the most recent player, but did not win the poll either time. In Poll #140, a poll of Yankee players, Ron Guidry was both the most recent player and the player with the highest WAR, but did not win the poll. In Poll #146, Scott Rolen was both the most recent player and the player with the highest WAR, and also was the player with the highest Support Score overall, but somehow did not win that poll, losing to Mark McGwire. In Poll #148, Chase Utley was both
