Tuesday, April 5, 2011

SWIP analysis, Part I

I took a quick, back-of-the-envelope look at how SWIP (strikeouts minus walks per inning pitched) stacks up against K:BB ratio, currently the gold standard for control.

(Due Credit: SWIP was devised by the fine people over at Baseballguys.com; link to their posts on it here)

For the first part of this analysis, I just ran some quick bivariate correlations (each one factors in only two variables: SWIP or K:BB, and the statistic of interest) for both SWIP and K:BB ratio for starting pitchers. I took every starting pitcher who threw at least 100 innings in a season from 2006-2010, looking at each pitcher's season separately. The results are interesting.
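For anyone who wants to try this at home, here is a minimal sketch of that kind of bivariate correlation in Python. It assumes a plain Pearson correlation and decimal innings pitched, and the five pitcher-seasons are made-up placeholders, not rows from the actual data set. The helper name pearson_r is just an illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical pitcher-seasons (invented for illustration): (K, BB, IP, ERA)
seasons = [
    (210, 45, 220.0, 2.90),
    (150, 60, 190.0, 4.10),
    (120, 35, 180.0, 3.80),
    (95, 70, 160.0, 5.00),
    (180, 55, 205.0, 3.40),
]

# One value per pitcher-season for each metric
swip = [(k - bb) / ip for k, bb, ip, _ in seasons]
kbb = [k / bb for k, bb, _, _ in seasons]
era = [e for _, _, _, e in seasons]

# Each correlation involves exactly two variables: the metric and the stat
r_swip_era = pearson_r(swip, era)
r_kbb_era = pearson_r(kbb, era)
print(f"SWIP vs ERA: {r_swip_era:.3f}")
print(f"K:BB vs ERA: {r_kbb_era:.3f}")
```

With real data, each tuple would be one qualifying pitcher-season (100+ innings), and the same call would be repeated for every statistic of interest.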

First off, SWIP and K:BB correlate fairly well, to the tune of .8192. A high SWIP tends to predict a high K:BB ratio, and vice versa. This isn't surprising; as Ks increase, both SWIP and K:BB increase; as BBs increase, both SWIP and K:BB decrease. They don't match up perfectly, but they match up well. As such, they are at least in part measuring something similar.

Here's how K:BB ratio varies with several statistics:
Wins: .382
Losses: -.130
Innings Pitched: .352
ERA: -.488
WHIP: -.678
K: .567
BB: -.406
HR/9: -.199
OPS against: -.499

These are some decent correlations. Finding correlations of this magnitude in any data set is really nice. Now here's how SWIP stacks up against those same statistics:
Wins: .398
Losses: -.152
Innings Pitched: .351
ERA: -.565
WHIP: -.690
K: .787
BB: -.116
HR/9: -.256
OPS against: -.616

SWIP correlates more strongly with all of these statistics except BB (which has more of a direct influence on K:BB ratio than on SWIP) and Innings Pitched, where the two are essentially tied (.351 vs. .352).
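The comparison is about the size of each correlation, not its sign, so the two lists can be checked side by side by absolute value. Here is a short sketch using the numbers listed above. One assumption: the SWIP figure for WHIP is taken to be -.690, since WHIP correlates negatively with K:BB as well.

```python
# Correlations as listed in the post
kbb_r = {
    "Wins": 0.382, "Losses": -0.130, "Innings Pitched": 0.352,
    "ERA": -0.488, "WHIP": -0.678, "K": 0.567, "BB": -0.406,
    "HR/9": -0.199, "OPS against": -0.499,
}
swip_r = {
    "Wins": 0.398, "Losses": -0.152, "Innings Pitched": 0.351,
    "ERA": -0.565, "WHIP": -0.690,  # assumed sign, see note above
    "K": 0.787, "BB": -0.116, "HR/9": -0.256, "OPS against": -0.616,
}

# A stat goes to whichever metric has the larger absolute correlation
swip_stronger = [s for s in kbb_r if abs(swip_r[s]) > abs(kbb_r[s])]
kbb_stronger = [s for s in kbb_r if abs(kbb_r[s]) > abs(swip_r[s])]

print("SWIP stronger:", swip_stronger)
print("K:BB stronger:", kbb_stronger)
```

K:BB comes out ahead only on BB and, by a hair (.352 vs. .351), on Innings Pitched; SWIP has the larger magnitude on the other seven.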

Without inundating you with numbers, SWIP does a better job predicting ground ball/fly ball/line drive rates as well. They all correlate negatively for both, but the absolute magnitude of the correlation is two to three times as large for SWIP as it is for K:BB ratio for each of those.

SWIP is also better than K:BB ratio at predicting fantasy points (.664 vs. .583) and record-independent fantasy points (.729 vs. .627). It predicts value under maximum (VUM) as well or better for all four major roto categories; this is especially pronounced for, of all stats, strikeouts (a whopping .787 correlation for SWIP, .567 for K:BB). It also better predicts overall starting pitcher VUM (.703 vs. .608).

To put it bluntly, the predictive value of SWIP far exceeds the predictive value of K:BB ratio. K:BB ratio is the Miss Cleo to SWIP's Nostradamus.

I'm not entirely sure about the reason(s) for this. I'm not convinced that, as has been posited, SWIP measures a pitcher's control of the strike zone better than K:BB ratio. In fact, I think K:BB ratio is explicitly a measure of controlling the strike zone. Ks increase when the pitcher fools the batter; BBs increase when he cannot do that (whatever the reasons may be). K:BB ratio is then an advantage ratio measuring how many times a pitcher uses the strike zone to fool the hitter over how many times he doesn't. It's easier for high-strikeout guys to get a good K:BB ratio, but a high strikeout rate isn't necessary. Among the top 50 of the past five years are 2008 Kevin Slowey, 2008 Mike Mussina, 2007 Curt Schilling, 2006 Roy Oswalt, 2006 Jon Lieber, and 2007 Greg Maddux. All of these pitchers had a K/9 rate under 7.00; Maddux's rate was under 4.75! These are all pitchers who are known for pinpoint control and the ability to set up hitters over multiple innings.

Now let's look at these stats as they relate to SWIP. The way I see it, walks are a measure of how many men the pitcher exclusively puts on (less home runs), while Ks are a measure of how many men the pitcher exclusively gets out. SWIP, then, is the net advantage the pitcher (and only the pitcher) is responsible for, scaled for innings pitched. A pitcher who doesn't get strikeouts and doesn't walk batters isn't going to have a great SWIP because he's not, in isolation, creating an advantage for the team. The lowest K/9 rate among the top 50 SWIPs in the data set is 7.7, while most of the list is above 8.5 K/9. For a high SWIP, you must strike out batters because, otherwise, you cannot create an advantage in the game.
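A quick worked example makes the difference concrete. The two stat lines below are hypothetical (not real pitchers), and innings are assumed to be in decimal form. Both pitchers have an identical K:BB ratio, but the low-strikeout pitcher ends up with half the SWIP.

```python
def swip(k, bb, ip):
    """Strikeouts minus walks per inning pitched."""
    return (k - bb) / ip

def k_bb(k, bb):
    """Strikeout-to-walk ratio."""
    return k / bb

def k_per_9(k, ip):
    """Strikeouts per nine innings."""
    return 9 * k / ip

# Hypothetical stat lines, invented for illustration: (K, BB, IP)
control_pitcher = (100, 25, 200.0)   # few strikeouts, very few walks
power_pitcher = (200, 50, 200.0)     # twice the strikeouts and walks

for name, (k, bb, ip) in [("control", control_pitcher),
                          ("power", power_pitcher)]:
    print(f"{name}: K:BB={k_bb(k, bb):.2f}  "
          f"SWIP={swip(k, bb, ip):.3f}  K/9={k_per_9(k, ip):.1f}")
```

K:BB is a ratio, so doubling both strikeouts and walks leaves it unchanged. SWIP is a difference per inning, so it doubles along with them.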

If K:BB ratio is measuring control and SWIP is measuring advantage, then it would make sense that SWIP has more predictive value, especially for fantasy. If a pitcher is taking control of the game, then the relevant fantasy stats (depending on the format: IP, K, W, WHIP, ERA, etc.) all must be better. If a pitcher is controlling the strike zone, then the same stats are only likely to be better.

For example, Tim Hudson is a control, pitch-to-contact, groundball-inducing pitcher. He has a solid K:BB ratio in the three years he's in the data set (2-2.5) but a low SWIP (.284-.352). His fantasy value is highly variable in this time frame; depending on the metric, it varies by as much as 40%. Moreover, his worst fantasy year isn't his worst year for either SWIP or K:BB, but the low SWIP is indicative of the possibility for that kind of variation. This is the sort of thing that can happen when, instead of creating an advantage yourself, you must rely on the defense to help you. If you watched Atlanta's 2010 playoff series, you saw how bad that can get. During a similar time frame, Matt Cain has a comparable K:BB ratio but a higher SWIP. Not only is his overall fantasy performance better, but it is also less variable (more like 30-35%).

I think K:BB ratio and SWIP measure different things. I think SWIP, since it seems to measure the advantage the pitcher is solely responsible for, has better predictive value for real baseball and MUCH better predictive value for fantasy. I'm sticking with SWIP.
