Friday, March 25, 2011

Calculating value in a roto league

Calculating value in a points league is easy. Because each statistic counted has a specific value, there is an equivalency between statistics, and all that matters is points accrued over a given period of time (game/week/matchup/season). This is not the case in a rotisserie (or for that matter H2H categories) league. Statistics are not interchangeable, since the number you accumulate in each contributes to your team points in a different way. In a points league, for example, runs are one point each and RBI are one point each. If I get 1,800 combined runs and RBI, the proportions don't matter since all possible allocations (from 0 runs and 1,800 RBI to 1,800 runs and 0 RBI) come to the same point total. In a roto league, the distribution matters.
In 2009, the last year for which I could find the relevant information (here), it took 1,189.2 runs to win the category in a 10-team league (with 13 starting hitters) and 1,168.2 RBI to win that category. That means if all 1,800 are in one category or the other, you net 11 points (10 for first place in one and 1 for last in the other). An even split (900 and 900) will only net you 2 points, since 900 falls short of the average needed to earn even 2 points in either category, leaving you last in both. You can (approximately) calculate the optimal distribution, but that's beside the point. The idea is that the distribution of those statistics matters in roto but not in points.
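
To make the contrast concrete, here is a minimal Python sketch. The opponent totals are made-up placeholders, not the 2009 data; they are chosen only so that every opponent lands between 900 and 1,800 in each category. The helper names (points_league_total, roto_category_points, roto_total) are hypothetical, and ties are ignored for simplicity.

```python
# Sketch: the same 1,800 combined runs + RBI scored two ways.
# Assumption: OPPONENT_TOTALS is invented placeholder data for the nine
# other teams in a 10-team league, NOT the actual 2009 averages.
OPPONENT_TOTALS = [950, 980, 1010, 1040, 1070, 1100, 1130, 1160, 1190]


def points_league_total(runs, rbi):
    """Points league: runs and RBI are worth one point each."""
    return runs + rbi


def roto_category_points(my_total, opponent_totals):
    """Roto: 1 point for last place, plus 1 for every team you beat."""
    return 1 + sum(1 for total in opponent_totals if my_total > total)


def roto_total(runs, rbi):
    """Roto points over both categories (same placeholder field in each)."""
    return (roto_category_points(runs, OPPONENT_TOTALS)
            + roto_category_points(rbi, OPPONENT_TOTALS))


for runs, rbi in [(1800, 0), (900, 900), (0, 1800)]:
    print(runs, rbi, points_league_total(runs, rbi), roto_total(runs, rbi))
```

Under these assumptions every split is worth 1,800 in the points league, while the roto totals are 11 for either lopsided split and 2 for the even one.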

Since categories are not interchangeable, you have to calculate value in each category separately. You can do this using a simple VORP formula, which is expected production (x) minus replacement production (r) divided by replacement production (r again). This yields the typical (x-r)/r formula. But you can't simply calculate replacement value in each category for each position, either. Much as points are points no matter how they are accumulated in a points league, stats are stats in a roto league no matter what position they come from. Calculating value by using only typical position production leads to wildly asymmetrical results.

As an example, let's focus on home runs and steals. At position A, replacement level is 30 home runs and 2 steals. At position B, replacement level is 15 home runs and 20 steals. If I get 30 home runs and 2 steals from position A and 15 home runs and 20 steals from position B, the VORP is zero for each stat at both positions. However, if I get 15 home runs and 20 steals from position A and 30 home runs and 2 steals from position B, the VORP values are:

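Home Runs A: (15-30)/30 = -0.5
Steals A: (20-2)/2 = 9
Home Runs B: (30-15)/15 = 1
Steals B: (2-20)/20 = -0.9

Adding the VORP values together in this case gets you 8.6, which is obviously very different from zero, even though the team totals (45 home runs and 22 steals) are exactly the same as in the first case. Nor are the VORP values multiplicative: multiplying the two values for each stat gives -0.5 for home runs and -8.1 for steals, neither of which signals zero net value.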
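
The arithmetic for the swapped case can be checked with a short Python sketch. The function name vorp and the dictionary layout are just illustrative choices; the numbers are the ones from the example.

```python
def vorp(x, r):
    """Per-position value over replacement: (x - r) / r."""
    return (x - r) / r


# Replacement levels from the example, keyed by position.
replacement = {
    "A": {"home_runs": 30, "steals": 2},
    "B": {"home_runs": 15, "steals": 20},
}
# The swapped case: each position produces the other's replacement line.
production = {
    "A": {"home_runs": 15, "steals": 20},
    "B": {"home_runs": 30, "steals": 2},
}

values = {
    (pos, stat): vorp(production[pos][stat], replacement[pos][stat])
    for pos in production
    for stat in production[pos]
}

for (pos, stat), value in values.items():
    print(pos, stat, round(value, 2))

# The team totals match a full replacement roster, yet the VORP sum is not 0.
for stat in ("home_runs", "steals"):
    team = sum(production[pos][stat] for pos in production)
    baseline = sum(replacement[pos][stat] for pos in replacement)
    print(stat, "team:", team, "replacement roster:", baseline)

print("sum of VORP values:", round(sum(values.values()), 2))
```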

What I would do instead is calculate a partial value over replacement (pVORP). Essentially, you take a roster full of replacement players at each position and calculate what the total replacement value is in a category. In the above example, total replacement home runs is 45. You then divide straight production above replacement (x-r) by total replacement home runs. So at positions A and B, typical production still gives you a pVORP of zero (production above replacement is 0, and 0 divided by any nonzero total is 0). Atypical production gets you a pVORP of (15-30)/45 = -15/45 = -1/3 at position A. At position B the atypical production gives you (30-15)/45 = 15/45 = 1/3. Add them together, and you get zero again. Since you're getting no value over replacement, this is what you want.
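
Here is a rough Python sketch of that calculation for both stats in the example. The function name pvorp and the data layout are my own illustrative choices; the steals figures are not worked out in the text above but follow from the same replacement levels (2 + 20 = 22 total replacement steals).

```python
def pvorp(x, r, total_replacement):
    """Partial value over replacement: (x - r) / total replacement production."""
    return (x - r) / total_replacement


# Replacement levels and the swapped production from the example.
replacement = {
    "A": {"home_runs": 30, "steals": 2},
    "B": {"home_runs": 15, "steals": 20},
}
production = {
    "A": {"home_runs": 15, "steals": 20},
    "B": {"home_runs": 30, "steals": 2},
}

pvorp_by_stat = {}
for stat in ("home_runs", "steals"):
    # Denominator: what a full roster of replacement players produces.
    total_r = sum(replacement[pos][stat] for pos in replacement)
    pvorp_by_stat[stat] = {
        pos: pvorp(production[pos][stat], replacement[pos][stat], total_r)
        for pos in production
    }
    print(stat, "total replacement:", total_r)
    for pos, value in pvorp_by_stat[stat].items():
        print("  position", pos, round(value, 3))
    print("  sum:", round(sum(pvorp_by_stat[stat].values()), 3))
```

Because every position in a category shares the same denominator, the per-position values sum to the team's total production above replacement divided by that denominator, which is zero here for both home runs and steals.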

(As an aside, the reason it's important for pVORP to be additive is that the statistics themselves are additive. Even the rate stats are additive, insofar as they are calculated by adding up the individual values for the team in both the numerator and the denominator. For example, the batting average for two players is the sum of all hits divided by the sum of all at-bats.)
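
A quick Python sketch of the batting average point, using two invented stat lines (not real players): the combined average comes from summing hits and at-bats separately, which is not the same as averaging the two individual averages.

```python
# Assumption: these two stat lines are made up purely for illustration.
players = [
    {"hits": 150, "at_bats": 500},  # a .300 hitter
    {"hits": 50, "at_bats": 250},   # a .200 hitter
]


def combined_average(players):
    """Team-style batting average: total hits divided by total at-bats."""
    total_hits = sum(p["hits"] for p in players)
    total_at_bats = sum(p["at_bats"] for p in players)
    return total_hits / total_at_bats


def mean_of_averages(players):
    """Naive alternative: average the individual batting averages."""
    return sum(p["hits"] / p["at_bats"] for p in players) / len(players)


print(round(combined_average(players), 3))   # 200 / 750
print(round(mean_of_averages(players), 3))   # (.300 + .200) / 2
```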

You then essentially do this for every statistic, hitter or pitcher, that your league uses. In a later post, I'll give you the values for 5x5 statistics plus several other common ones (e.g. quality starts, holds) calculated for a 10-team league. It gets complicated, especially at pitcher where many leagues use undifferentiated spots, but I'll see if I can't make some simplifying assumptions.
