On Evaluating College Hitters Quantitatively

December 24, 2008

Forecasting Major League performance from minor league statistics is relatively common today, but what about taking the next step? How predictive are college offensive statistics? What is the best way to measure them? Since this past summer, I’ve been tinkering with a comprehensive system for ranking college performance and projecting pro success – do I know how to party, or what?

Methods

I took every college position player drafted in the top 50 overall selections from 2001 to 2005 – before that, college stats are hard to find, and after that there isn’t enough major league data – fed them into my system, and then determined the correlation between various metrics and Major League career OPS. For players who have not reached the majors, I used their most recent minor league season (minimum 200 at-bats) to determine an equivalent, park-neutral OPS. The result was 50 hitters with data from 117 player seasons and over 30,000 at-bats.
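For readers who want to reproduce this kind of comparison, here is a minimal sketch of a Pearson correlation between one college metric and career OPS. The function name and the sample numbers are my own illustrations and are not data from the study; the article does not say which correlation statistic was used, so Pearson is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative numbers, NOT data from the study.
college_bb_rate = [0.08, 0.12, 0.15, 0.10, 0.17]
career_ops = [0.690, 0.740, 0.810, 0.720, 0.790]

r = pearson(college_bb_rate, career_ops)
print(f"correlation: {r:.3f}")
```

With only 50 hitters in the sample, a coefficient like this should be read as a rough indicator rather than a precise estimate.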

My System

The basic idea behind my system is relatively simple: players with different skill sets develop in different ways. Focusing on year-to-year trends, I quantify each player’s skill set and then determine how likely that type of player is to succeed at the highest level. I’ve used past stats from drafted players to fine-tune my system to better indicate future successes or failures.

For example, as important as power, strikeout, and walk rates are individually, they are more important when judged together. Let’s look at former Billy Beane man crush Nick Swisher. Swisher had strikeout rates of 16%, 18% and 14% from 2000-2002 at The Ohio State University. Normally those relatively high percentages would result in a poor rating, but my system liked that Swisher showed really good power and an off-the-charts walk rate. Compare him to Josh Fields, who was generally thought of as a similar talent before the draft despite not being named in a bestselling book. While my system gave Swisher (512) a high rating, it gave Fields (403) a poor rating – anything over 480 is good – even though Fields had basically the same strikeout rates (17%, 17%, 15%). The problem is that Fields didn’t walk a ton or hit for much power. He just wasn’t productive. Swisher had a .512 wOBA* as a junior while Fields only managed a .361 wOBA*.

My system is comprehensive, relying on empirical production metrics. It looks at a player’s total body of work, adjusting for age, positional, park, conference, and schedule effects, and then uses a system of variable weights to determine a player’s likelihood of pro success based on what players of comparable skill have done in the past. I weigh a player’s overall production, power, zone judgment, contact ability, speed and base running ability, and I do so based on multi-season data.

2009 Draft Class

It’s way too early to give final grades on players, but my system indicates that performance as an underclassman is very important. Here are the top 31 draft-eligible players for the 2009 draft based on 2008 performance:

Player	Pos	School	Score	Age	wOBA*	Power	BB%	K%	Speed
Josh Phegley	C	Indiana	526.47	20.8	.522	685.5	12.41%	8.03%	41.47
AJ Pollock	CF	Notre Dame	524.13	21.0	.360	442.0	9.60%	4.00%	432.72
Dustin Ackley	CF	UNC	516.26	20.8	.478	455.9	15.68%	7.99%	220.50
Jason Kipnis	CF	Arizona State	514.16	21.7	.408	705.1	16.67%	13.40%	322.50
Nick Buss	CF	USC	509.50	22.0	.408	481.1	12.50%	8.06%	282.07
Matt den Dekker	CF	U of Florida	489.16	n/a	.387	441.6	11.69%	10.33%	282.98
Alex Hassan	CF	Duke	486.33	20.7	.422	405.7	12.02%	7.36%	219.97
Marc Krauss	3B/OF	U of Ohio	484.07	21.9	.411	657.0	17.20%	11.10%	108.30
Kyle Seager	2B	UNC	482.00	21.1	.439	723.1	11.08%	10.13%	128.93
Jason Stidham	2B	Florida St.	480.74	20.8	.405	509.1	13.69%	9.55%	120.66
Andrew Clark	1B	Louisville	468.06	21.3	.405	476.8	12.27%	8.55%	183.65
Rich Poythress	3B	Georgia	462.77	21.3	.509	625.7	14.42%	12.54%	46.94
Ryan Jackson	SS	Miami	453.87	n/a	.360	411.9	8.36%	9.09%	150.33
Grant Green	SS	USC	446.17	21.2	.479	616.0	6.40%	15.49%	252.93
Josh Fellhauer	OF	CS Fullerton	446.01	20.7	.400	526.4	6.42%	12.16%	313.93
Russ Moldenhauer	OF	Texas	438.94	21.2	.417	432.0	8.99%	9.52%	24.64
Jordan Henry	CF	Ole Miss	436.56	20.5	.338	250.0	10.50%	7.00%	243.35
Ryan Ortiz	C	Oregon State	432.56	n/a	.480	509.3	13.06%	13.06%	52.95
DJ LeMahieu	SS	LSU	430.96	20.4	.310	326.8	7.12%	11.03%	145.90
Willy Fox	OF	Wake Forest	425.40	22.2	.384	445.6	8.98%	12.11%	279.67
Diego Seastrunk	C	Rice	423.92	20.9	.362	436.7	5.96%	7.01%	17.33
Robert Stock	C	USC	423.88	19.0	.374	339.0	10.60%	6.30%	70.90
Blake Dean	OF	LSU	423.14	20.8	.381	743.5	11.36%	14.93%	87.67
Brandon Belt	1B	Texas	407.03	20.6	.400	545.4	10.00%	13.70%	54.65
Kentrail Davis	CF	Tenn	403.67	20.4	.477	591.0	9.76%	22.76%	153.70
Brett Jackson	CF	Cal	403.18	20.3	.412	375.9	10.00%	15.00%	252.40
Ryan Jones	OF	Wichita State	402.17	20.6	.376	314.7	10.14%	13.85%	263.80
Chris Dominguez	3B	Louisville	396.21	22.0	.484	717.6	8.18%	16.72%	192.63
Gabriel Saade	2B	Duke	384.38	21.3	.387	593.6	11.97%	23.50%	342.48
Mike Murphy	3B	Maryland	366.33	n/a	.364	634.8	8.82%	16.39%	110.14
Blake Smith	OF	Cal	362.25	21.0	.421	680.7	6.66%	23.55%	186.26
		mean	419.20	21.0	.404	534.6	10.74%	14.00%	154.56
		stdev	63.85	0.7	.052	132.9	2.92%	5.59%	120.94
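One way to read the mean and stdev rows is to convert a player's Score into standard deviations from the mean. The sketch below does that for three players, taking the table's figures at face value; the function and variable names are my own, and the z-score framing is an illustration rather than part of the rating system.

```python
# Score mean and standard deviation, copied from the table's last two rows.
MEAN_SCORE = 419.20
STDEV_SCORE = 63.85

# Scores copied from the table for three of the listed players.
scores = {
    "Josh Phegley": 526.47,
    "Dustin Ackley": 516.26,
    "Blake Smith": 362.25,
}


def z_score(value, mean, stdev):
    """Number of standard deviations that value sits above (+) or below (-) the mean."""
    return (value - mean) / stdev


z_scores = {name: z_score(s, MEAN_SCORE, STDEV_SCORE) for name, s in scores.items()}

for name, z in z_scores.items():
    print(f"{name}: {z:+.2f} standard deviations")
```

By this arithmetic Phegley sits roughly 1.7 standard deviations above the mean Score, while Smith is a little under one standard deviation below it.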

From 2001 to 2005, only eight players scored over 500 as sophomores in my system. Six of those eight (Jed Lowrie 530, Ryan Braun 523, Alex Gordon 581, Conor Jackson 547, Aaron Hill 504, and Nick Swisher 542) have generally met or exceeded expectations, while just two of the eight (John Mayberry Jr 528 and Jake Gautreau 507) have not. Both Mayberry and Gautreau had poor junior seasons that knocked their overall ratings down. This bodes well for Josh Phegley, AJ Pollock, and Dustin Ackley. Both Buss and Kipnis were juniors in 2008.

Statistics

Metric	Correlation
Total Score	0.470
Draft Position	-0.311
BB%	0.261
wOBA*	0.227
NCAA OPS	0.185
IsoP	0.162
wOBA	0.163
K%	-0.132
Speed Score	-0.232

Correlation is the statistical correlation between each metric and career major league translated OPS, which I adjusted for BABIP and park effects. I used BaseballProspectus.com’s Davenport Translations for OPS, which account for year-to-year differences in league output. Total Score is the rating from the system I have developed; wOBA* is park-adjusted wOBA; and speed score is based on triples per hit, steal attempts per trip to first base, and stolen base success rate, with 75% being the benchmark.
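To make the speed score inputs concrete, here is a rough sketch that computes the three ingredients named above from a stat line. It stops short of combining them, because the weights are not given here. The function name, the sample stat line, and the definition of a "trip to first base" as singles plus walks plus hit-by-pitches are all my assumptions.

```python
SB_SUCCESS_BENCHMARK = 0.75  # the 75% benchmark mentioned in the text


def speed_components(triples, hits, singles, walks, hbp, sb, cs):
    """Return the three speed-score ingredients as separate rates.

    Assumption: a "trip to first base" is a single, walk, or hit-by-pitch.
    """
    attempts = sb + cs
    trips_to_first = singles + walks + hbp
    return {
        "triples_per_hit": triples / hits if hits else 0.0,
        "attempts_per_trip_to_first": attempts / trips_to_first if trips_to_first else 0.0,
        # Positive means better than the benchmark, negative means worse.
        "sb_success_vs_benchmark": (sb / attempts - SB_SUCCESS_BENCHMARK) if attempts else 0.0,
    }


# Hypothetical stat line, not a real player.
components = speed_components(triples=4, hits=80, singles=50, walks=30, hbp=5, sb=15, cs=5)
for name, value in components.items():
    print(f"{name}: {value:.3f}")
```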

The negative correlation for draft position indicates that the earlier a player was drafted (that is, the lower his pick number, with 2 being the lowest in the sample and 50 the highest), the more productive a major leaguer he became. Given the volatility of player development and the sample size involved, a -.311 correlation is surprising.

The percentage of base on balls per plate appearance has the highest correlation among the semi-conventional statistics I used. Players who walk a lot in college tend to be ones other teams pitch around, have more power (there is a 0.4914 correlation between walk rate and my overall power metric) and are more productive. Players who show an ability to sustain a high walk rate as underclassmen are among the best performing professionals, as they usually maintain that skill in affiliated action.

The correlation for strikeout percentage is -0.132, but upon further review strikeout percentage appears to have a non-linear relationship with pro success. A strikeout percentage over 17% tends to spell doom. Kelly Shoppach is the only player in the survey who struck out that often – about 19.5% while at Baylor – and has posted an OPS over .700. Those players who whiff in less than 14% of their plate appearances fare much better; the correlation between ML OPS and K<14% is -0.25318.
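Since the relationship looks more like a set of thresholds than a straight line, one simple way to work with it is to sort hitters into strikeout bands using the 14% and 17% cutoffs from the paragraph above. This is only a sketch: the band labels and function name are mine, and the sample rates are 2008 K% figures from the draft class table.

```python
def k_band(k_pct):
    """Classify a strikeout percentage using the 14% and 17% cutoffs."""
    if k_pct < 14.0:
        return "under 14%"
    if k_pct <= 17.0:
        return "14% to 17%"
    return "over 17%"


# K% values copied from the 2009 draft class table.
k_rates = {
    "Josh Phegley": 8.03,
    "Jason Kipnis": 13.40,
    "Grant Green": 15.49,
    "Kentrail Davis": 22.76,
}

bands = {name: k_band(k) for name, k in k_rates.items()}
for name, band in bands.items():
    print(f"{name}: {band}")
```

Grouping players this way makes it possible to compare average pro results band by band instead of relying on a single linear coefficient.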

The key piece of information from the above table is how well my system performs compared to both simple statistical metrics and the industry rating of players. The system I’ve developed has a roughly 51% stronger correlation to major league offensive success than draft position (0.470 versus 0.311 in absolute value).
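As a quick arithmetic check on how the two correlations compare in magnitude, the sketch below uses the Total Score and Draft Position figures from the table. The variable names are mine.

```python
# Correlations with career translated OPS, copied from the Statistics table.
total_score_r = 0.470
draft_position_r = -0.311

# Compare magnitudes, since the sign only reflects the direction of the relationship.
ratio = abs(total_score_r) / abs(draft_position_r)
pct_stronger = (ratio - 1) * 100
inverse_ratio = abs(draft_position_r) / abs(total_score_r)

print(f"Total Score vs. draft position: {ratio:.2f}x, or {pct_stronger:.0f}% stronger")
print(f"Draft position as a share of Total Score: {inverse_ratio:.2f}")
```

The ratio comes out to about 1.51, while the reverse ratio (draft position as a share of Total Score) is about 0.66.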

Lincoln Hamilton can be reached at lhamilton@projectprospect.com.
