Forecasting Major League performance from minor league statistics is relatively common today, but what about taking the next step? How predictive are college offensive statistics? What is the best way to measure? Since this past summer, I’ve been tinkering with a comprehensive system for ranking college performance and projecting pro success – do I know how to party, or what?
I took every college position player drafted in the top 50 overall selections from 2001 to 2005 – before that, college stats are hard to find, and after that there isn’t enough major league data – fed them into my system and then determined the correlation between various metrics and Major League career OPS. For players who have not reached the majors, I used their most recent minor league season (min 200 ABs) to determine an equivalent, park-neutral OPS. The result was 50 hitters with data from 117 player seasons and over 30,000 at-bats.
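For readers who want to see what "determined the correlation" means in practice, here is a minimal sketch of a Pearson correlation coefficient, which is my assumption about the kind of correlation involved. The function name `pearson_r` and the sample numbers are hypothetical and are not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only, NOT the study's data.
walk_rate = [0.08, 0.12, 0.15, 0.10, 0.18]   # college BB per PA
career_ops = [690, 745, 800, 720, 810]       # translated major league OPS
r = pearson_r(walk_rate, career_ops)
print(round(r, 3))
```

A value near +1 or -1 means the metric tracks OPS closely, while a value near 0 means it tells you little on its own.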
The basic idea behind my system is relatively simple: players with different skill sets develop in different ways. Focusing on year-to-year trends, I quantify each player’s skill set and then determine how likely that type of player is to succeed at the highest level. I’ve used past stats from drafted players to fine-tune my system to better indicate future successes or failures.
For example, as important as power, strikeout, and walk rates are individually, they are more important when judged together. Let’s look at former Billy Beane man crush Nick Swisher. Swisher had strikeout rates of 16%, 18% and 14% from 2000-2002 at The Ohio State University. Normally those relatively high percentages would result in a poor rating, but my system liked that Swisher showed really good power and an off-the-charts walk rate. Compare him to Josh Fields, who was generally thought of as a similar talent before the draft, despite not being named in a bestselling book. While my system gave Swisher (512) a high rating, it gave Fields (403) a poor rating – anything over 480 is good – even though Fields had basically the same strikeout rates (17%, 17%, 15%). The problem is that Fields didn’t walk a ton or hit for much power. He just wasn’t productive. Swisher had a .512 wOBA* as a junior while Fields only managed a .361 wOBA*.
My system is comprehensive, relying on empirical production metrics. It looks at a player’s total body of work, adjusting for age, position, park, conference, and schedule effects, and then uses a system of variable weights to determine a player’s likelihood of pro success based on what players of comparable skill have done in the past. I weigh a player’s overall production, power, zone judgment, contact ability, speed and base running ability, and I do so based on multi-season data.
2009 Draft Class
It’s way too early to give final grades on players, but my system indicates that performance as an underclassman is very important. Here are the top 31 draft-eligible players for the 2009 draft based on 2008 performance:
|AJ Pollock||CF||Notre Dame||524.13||21.0||.360||442.0||9.60%||4.00%||432.72|
|Jason Kipnis||CF||Arizona State||514.16||21.7||.408||705.1||16.67%||13.40%||322.50|
|Matt den Dekker||CF||U of Florida||489.16||n/a||.387||441.6||11.69%||10.33%||282.98|
|Marc Krauss||3B/OF||U of Ohio||484.07||21.9||.411||657.0||17.20%||11.10%||108.30|
|Jason Stidham||2B||Florida St.||480.74||20.8||.405||509.1||13.69%||9.55%||120.66|
|Josh Fellhauer||OF||CS Fullerton||446.01||20.7||.400||526.4||6.42%||12.16%||313.93|
|Jordan Henry||CF||Ole Miss||436.56||20.5||.338||250.0||10.50%||7.00%||243.35|
|Ryan Ortiz||C||Oregon State||432.56||n/a||.480||509.3||13.06%||13.06%||52.95|
|Willy Fox||OF||Wake Forest||425.40||22.2||.384||445.6||8.98%||12.11%||279.67|
|Ryan Jones||OF||Wichita State||402.17||20.6||.376||314.7||10.14%||13.85%||263.80|
From 2001 to 2005, only eight players scored over 500 as sophomores in my system. Six of those eight (Jed Lowrie 530, Ryan Braun 523, Alex Gordon 581, Conor Jackson 547, Aaron Hill 504, and Nick Swisher 542) have generally met or exceeded expectations, while just two of the eight (John Mayberry Jr 528 and Jake Gautreau 507) have not. Both Mayberry and Gautreau had poor junior seasons that knocked their overall ratings down. This bodes well for Josh Phegley, AJ Pollock, and Dustin Ackley. Both Buss and Kipnis were juniors in 2008.
Correlation is the statistical correlation between each metric and career major league translated OPS, which I adjusted for BABIP and park effects. I used BaseballProspectus.com’s Davenport Translations for OPS, which account for year-to-year differences in league output. My Score is the rating produced by the system I have developed; wOBA* is park-adjusted wOBA; and speed score is based on triples per hit, steal attempts per trip to first base and stolen base success rate, with 75% being the benchmark.
The negative correlation for draft position indicates that the earlier a player was drafted (pick 2 being the lowest number in the sample and pick 50 the highest), the more productive a major leaguer he became. Given the volatility of player development and the sample size involved, a -0.311 correlation is surprising.
The percentage of base on balls per plate appearance has the highest correlation among the semi-conventional statistics I used. Players who walk a lot in college tend to be ones other teams pitch around, have more power (there is a 0.4914 correlation between walk rate and my overall power metric) and are more productive. Players who show an ability to sustain a high walk rate as underclassmen are among the best performing professionals, as they usually maintain that skill in affiliated action.
The correlation for strikeout percentage is -0.132, but upon further review strikeout percentage appears to have a non-linear relationship with pro success. A strikeout percentage over 17% tends to spell doom. Kelly Shoppach is the only player in the survey who struck out that often – about 19.5% while at Baylor – and has OPSed over 700. Players who whiff in less than 14% of their plate appearances fare much better; the correlation between ML OPS and K<14% is -0.25318.
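One simple way to check for this kind of threshold effect is to group players into strikeout-rate bands and compare the average OPS in each band, rather than relying on a single straight-line correlation. The sketch below uses the 14% and 17% cutoffs mentioned above. The function names and the sample players are hypothetical, and this is not necessarily how my system handles strikeouts internally.

```python
from statistics import mean


def bucket_by_k_rate(players, low=0.14, high=0.17):
    """Group (k_rate, ops) pairs into three strikeout-rate bands."""
    buckets = {"under_low": [], "middle": [], "over_high": []}
    for k_rate, ops in players:
        if k_rate < low:
            buckets["under_low"].append(ops)
        elif k_rate > high:
            buckets["over_high"].append(ops)
        else:
            buckets["middle"].append(ops)
    return buckets


def mean_ops_by_bucket(players, low=0.14, high=0.17):
    """Average OPS per band; None for a band with no players."""
    grouped = bucket_by_k_rate(players, low, high)
    return {name: (mean(vals) if vals else None) for name, vals in grouped.items()}


# Hypothetical (college K rate, major league OPS) pairs, NOT the study's data.
sample = [(0.09, 780), (0.12, 760), (0.15, 720),
          (0.16, 735), (0.18, 650), (0.195, 705)]
summary = mean_ops_by_bucket(sample)
for band, avg in summary.items():
    print(band, avg)
```

If the averages drop sharply only past the upper cutoff, the relationship looks more like a cliff than a slope, which a single correlation coefficient will understate.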
The key piece of information from the above table is how well my system performs compared to both simple statistical metrics and the industry rating of players. The system I’ve developed has a 66% stronger correlation to major league offensive success than draft position.
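To put "66% stronger" in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the comparison is a ratio of absolute correlation values and uses the -.311 draft-position figure quoted earlier. The helper name `implied_correlation` is hypothetical, and the result is only an approximation of what that reading implies.

```python
def implied_correlation(baseline_r, pct_stronger):
    """Magnitude of a correlation that is pct_stronger percent larger than baseline_r.

    Assumes "stronger" compares absolute values, so the sign is dropped.
    """
    return abs(baseline_r) * (1 + pct_stronger / 100)


# -0.311 is the draft-position correlation quoted in the text.
system_r = implied_correlation(-0.311, 66)
print(round(system_r, 3))
```

Under that assumption, my system's correlation with major league OPS would be a little over 0.5 in magnitude.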