Sunday, March 27, 2011

Value Under Maximum

Ok, so I mentioned in my last post just a little bit ago the flaws of pVORP. After tinkering with the idea of a sort of "value under maximum" measure, here's what I've got:

The best way to boost your standing in a given roto category is to draft the player who will lead (or is projected to lead) the league in that category. If I need steals, then the best thing I can do is draft whoever will swipe the most bags in a season. The corollary is that any player who won't pace that category delivers only some fraction of the leader's production. That fraction is then a good indicator of his value in that category. That means, for any given player, his worth in a category is his projected production divided by the projected maximum in that category.

But, as always, rosters are built around certain constraints. No catcher will lead the league in steals, for example, and it's unreasonable to expect one to. Why should I devalue a catcher who won't steal any bases when no catcher will steal many, especially since I have to play a catcher? A catcher who steals five bases isn't necessarily worth less than an outfielder who steals ten. Five steals from a catcher is gravy; ten steals from an outfielder means he needs to be doing something else well.

I could, instead, divide a player's expected production by that of the projected leader at his position. However, that ignores the fact that positions aren't equal in production. Since a catcher isn't going to steal many bases, why should the best base-stealing catcher (who will get maybe 10 steals) have the same value as the best base-stealing outfielder (who will get 50ish steals)?

The simplest solution, if not necessarily the best, is to account for both of these facts with a straight average. A catcher may not steal many bases, nor should I overvalue the best base-stealing catcher, but a catcher who does steal bases is a bonus nonetheless. What you then get for a given statistic is:

x/[(max(p)+max(l))/2]

where x is a player's production in a category, max(p) is the best projection at that position, and max(l) is the best projection for the league (AL, NL, or MLB). Rewritten, you get:

2x/(max(p)+max(l))
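Here's a minimal sketch of that formula in Python, just to make the arithmetic concrete. The function name vum, the zero-denominator guard, and the sample numbers (borrowed from the catcher/outfielder steals example above) are placeholders of mine, not real projections.

```python
def vum(x, max_pos, max_league):
    """Value under maximum for one category: 2x / (max(p) + max(l))."""
    denom = max_pos + max_league
    if denom == 0:
        # Assumption: if nobody is projected to produce anything, score it zero.
        return 0.0
    return 2 * x / denom


# Illustrative numbers only: a catcher with 5 steals where the best catcher
# gets 10 and the league leader gets 50, versus an outfielder with 10 steals
# where the best outfielder is also the league leader at 50.
catcher = vum(5, max_pos=10, max_league=50)
outfielder = vum(10, max_pos=50, max_league=50)
print(f"catcher: {catcher:.3f}, outfielder: {outfielder:.3f}")
```

With these made-up numbers the catcher scores about 0.167 and the outfielder 0.200. The catcher lands between the 0.1 he'd get against the league maximum alone and the 0.5 he'd get against the catcher maximum alone, which is the compromise the straight average is meant to strike.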

You can do this for any statistic the league uses. It also has the following advantages:

1) It's additive. You can add together a player's value under maximum for all statistics and get a rough idea of the player's overall value. The score has no units and so you're not, for example, adding runs to steals. Since all statistics count equally, there's no reason to weight it or average it.

2) It works for positive and negative statistics. If you have a statistic that is counted negatively (such as batter Ks, where the fewest Ks means the most roto points) you calculate it the same way but simply subtract it when aggregating or look for lower values instead of higher ones within a position.

3) It's simple. You don't have to adjust for league size, roster composition, etc. The maximum is always the maximum. This is true no matter how many teams are in the league. The position- and league-best stats are constant no matter how many catchers, infielders, outfielders, utility, etc. you roster.

4) It's generative. You can use it for any statistic in the league with ease to come up with league-specific rankings.

5) It works for counting stats and rate stats. Because the number is essentially a percent value placed on production, the nature of the statistic doesn't matter.

6) It has a circumscribed range. The value under maximum can only be between 0 and 1. For the league leader in a statistic, his production is x, the maximum league value is x, and the maximum position value is also x. That gives 2x/(x+x) = 2x/2x = 1. So a value of 1 is the best possible production for the statistic. Similarly, if a player gets none of something (e.g. zero steals), then the numerator is zero and therefore the value under maximum is zero. By the same token, the range for the aggregate score is zero to the number of statistics counted (e.g. 0-5 in a 5x5 league).

7) It properly devalues pitchers. The conventional wisdom in roto leagues is that pitchers will not contribute in all categories, as closers usually don't get wins and starters don't get saves. VUM inherently accounts for that, since (for example) starters will have a zero value under maximum for saves. This means that when ranking all players (pitchers and batters), the pitchers will generally range from 0-4 while the hitters range from 0-5 (in a 5x5 league).

8) It accounts for the decreased impact of any one category as statistics expand. As the categories increase, the weight of a player being poor in one category decreases. For example, if I have a player with a low batting average, that's 20% of his value in a 5x5 league. In a 7x7 league, that's only 14.3% of his value. Since the range of the aggregate increases as the number of categories increases, the amount that the low batting average factors in decreases as well.

9) Best of all, VUM is more or less a ratio statistic. Within positions, the VUMs are divisible, and between positions a VUM ratio works OK (though not perfectly). That means if Player A has a VUM that is 85% that of Player B, then Player A is (in the aggregate) about 85% as good as Player B for fantasy purposes (especially if they play the same position).
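To show how the pieces above fit together (adding categories, subtracting the negative ones, and comparing two players as a ratio), here's a rough sketch in Python. Everything in it is hypothetical: the function names, the dictionary layout, the category labels, and all of the numbers are invented for illustration.

```python
def vum(x, max_pos, max_league):
    """Value under maximum for one category: 2x / (max(p) + max(l))."""
    denom = max_pos + max_league
    return 0.0 if denom == 0 else 2 * x / denom


def aggregate_vum(player, pos_max, league_max, negative=frozenset()):
    """Sum per-category VUMs, subtracting categories that count negatively."""
    total = 0.0
    for cat, x in player.items():
        score = vum(x, pos_max[cat], league_max[cat])
        total += -score if cat in negative else score
    return total


# Hypothetical projections for two outfielders in a three-category league.
league_max = {"HR": 40, "SB": 50, "K": 200}
of_max = {"HR": 40, "SB": 50, "K": 180}
player_a = {"HR": 30, "SB": 10, "K": 100}
player_b = {"HR": 40, "SB": 20, "K": 90}

score_a = aggregate_vum(player_a, of_max, league_max, negative={"K"})
score_b = aggregate_vum(player_b, of_max, league_max, negative={"K"})
print(f"A: {score_a:.3f}, B: {score_b:.3f}, A/B: {score_a / score_b:.2f}")
```

One thing the sketch makes visible: once a negative category is subtracted, the aggregate's ceiling drops and it can dip below zero, so the 0-to-number-of-categories range only holds when every category counts positively.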

So there you have it. Instead of using pVORP, I'm going to calculate my roto rankings using Value Under Maximum (VUM).
