Post#62 » by **mysticbb** » Wed Feb 9, 2011 1:07 pm

Sleepy51, even if he accounted for defense better, it wouldn't improve his model enough. As I pointed out, the team adjustment brings his correlation coefficient to 0.94; without it, it is 0.71. He justifies his model with that high correlation to winning. The defensive adjustment is basically justified via a residual. Well, Berri is not directly using a residual, but overall it amounts to just that. It is also not accurate to say he got his whole model via regression, because he only ran two regressions. The first one determined A, B and C in the following formula:

Win% = A*ORTG+B*DRTG+C

A, B and C are presented on his page as the coefficients next to offensive efficiency, defensive efficiency and constant term.
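As a rough illustration of what that first regression does, here is a minimal sketch of an ordinary least-squares fit for Win% = A*ORTG + B*DRTG + C. The team ratings and the "true" coefficients are made up for the example (they are not Berri's data or his published values), and the helper names `solve` and `fit_win_pct` are hypothetical.

```python
# Hypothetical team seasons as (ORTG, DRTG). Made-up numbers, not Berri's data.
teams = [
    (110.2, 104.1), (106.5, 108.3), (103.8, 101.9),
    (108.9, 110.4), (101.7, 106.0), (112.3, 107.5),
]

# Placeholder coefficients used only to generate win% values for the demo,
# so we can check that the fit recovers them.
TRUE_A, TRUE_B, TRUE_C = 0.03, -0.03, 0.5
win_pct = [TRUE_A * o + TRUE_B * d + TRUE_C for o, d in teams]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_win_pct(teams, win_pct):
    """Least-squares fit of Win% = A*ORTG + B*DRTG + C via the normal equations."""
    rows = [(o, d, 1.0) for o, d in teams]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, win_pct)) for i in range(3)]
    return solve(xtx, xty)


A, B, C = fit_win_pct(teams, win_pct)
print(f"A={A:.4f}  B={B:.4f}  C={C:.4f}")
```

With real team data the fit would of course not be exact; the residual left over is what the correlation coefficient is measuring.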

What he does next is plug in his two formulas for PE and PA, so that ORTG = PTS/PE and DRTG = DPTS/PA. At this point he is not using a regression to determine the marginal values; he is using league average values. Differentiating Win% with respect to each of the four quantities gives us:

Marginal value PTS = A/PE

Marginal value PE = A*(-PTS/PE^2)

Marginal value DPTS = B/PA

Marginal value PA = B*(-DPTS/PA^2)

The values for PTS, PE, DPTS and PA are the league average values in his data set. If you use a different data set you will get slightly different values.
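The four marginal values above can be checked numerically. This is a minimal sketch, assuming Win% = A*PTS/PE + B*DPTS/PA + C; the coefficients and league averages below are placeholders chosen for illustration, not the values from Berri's data set, and the function names are hypothetical.

```python
def win_pct(pts, pe, dpts, pa, A, B, C):
    """Win% = A*ORTG + B*DRTG + C with ORTG = PTS/PE and DRTG = DPTS/PA."""
    return A * pts / pe + B * dpts / pa + C


def marginal_values(pts, pe, dpts, pa, A, B):
    """Partial derivatives of Win% evaluated at the given (league average) values."""
    return {
        "PTS": A / pe,
        "PE": -A * pts / pe**2,
        "DPTS": B / pa,
        "PA": -B * dpts / pa**2,
    }


# Placeholder inputs (assumptions for the demo, not Berri's numbers).
A, B, C = 3.0, -3.0, 0.5
league = {"pts": 100.0, "pe": 95.0, "dpts": 100.0, "pa": 95.0}

mv = marginal_values(A=A, B=B, **league)


def finite_difference(name, h=1e-4):
    """Central difference of win_pct with respect to one input, as a sanity check."""
    up = dict(league, **{name: league[name] + h})
    down = dict(league, **{name: league[name] - h})
    return (win_pct(A=A, B=B, C=C, **up) - win_pct(A=A, B=B, C=C, **down)) / (2 * h)


for key in ("pts", "pe", "dpts", "pa"):
    print(key.upper(), mv[key.upper()], finite_difference(key))
```

Swapping in a different set of league averages changes the marginal values slightly, which is exactly the data-set dependence mentioned above.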

In the next step he takes the marginal values and assigns them to the boxscore stats that are related to PTS, PE, PA and DPTS. You can easily see that in the numbers for free throws: (1-0.47)*0.032 = 0.017 and 0.47*-0.032 = -0.015. Well, by choosing 0.47 as the factor for free throws he set the break-even point for free throw shooting at 47 FT%. Nicely done. That just means a player who shoots 48 FT% on his free throws helps his team win more than a guy who shoots 49 FG% on his two-point field goals. In fact the latter hurts his team while the former helps.
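The free-throw arithmetic can be spelled out in a few lines. This is a minimal sketch using only the 0.032 and 0.47 quoted above. The two-point function rests on an assumption: that a field-goal attempt is charged a full possession at the same 0.032 that the free-throw numbers use. The function names are hypothetical.

```python
POINT_VALUE = 0.032          # marginal value of one point, as quoted above
FT_POSSESSION_SHARE = 0.47   # share of a possession charged per free throw attempt

# A made free throw earns one point but costs 0.47 of a possession;
# a missed one only costs the possession share.
made_ft = (1 - FT_POSSESSION_SHARE) * POINT_VALUE
missed_ft = -FT_POSSESSION_SHARE * POINT_VALUE


def ft_value_per_attempt(ft_pct):
    """Expected value of one free throw attempt at the given percentage."""
    return ft_pct * made_ft + (1 - ft_pct) * missed_ft


def two_pt_value_per_attempt(fg_pct, possession_cost=POINT_VALUE):
    """Expected value of one two-point attempt.

    Assumption: each attempt uses one possession valued at possession_cost.
    """
    return fg_pct * 2 * POINT_VALUE - possession_cost


print(round(made_ft, 3), round(missed_ft, 3))
print(ft_value_per_attempt(0.47))      # break-even for free throws
print(ft_value_per_attempt(0.48))      # slightly positive
print(two_pt_value_per_attempt(0.49))  # slightly negative under the assumption
```

Algebraically the free-throw value is 0.032*(FT% - 0.47), so the break-even point is whatever factor is chosen for FTA in the possession formula.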

In the end he comes to the conclusion that a player has to shoot over 50% from the field on his two-point shots to help his team win games. That is above the league average.

Where does the problem come from? The formula for possessions employed works for the overall team, but not for the individual player. Adding up FGA+0.47*FTA+TO-ORB will not give us a good estimate of the possessions a player used. To understand that: if we apply this formula to a player like Love, we are getting right now a 152.8 ORtg for him. He scored 1111 points with 777 FGA, 351 FTA, 120 TO and 247 ORB. According to Oliver he has 123 ORtg. Now, if we look at a guy like Dirk Nowitzki, who is scoring at a higher rate and with a higher efficiency while turning the ball over less, we get 114.4 ORtg, while according to Oliver he has 118. Or my new favorite upcoming franchise player Kris Humphries has 141.7 ORtg, if we apply Berri's formula for offensive efficiency.
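To make the mechanism concrete, here is a minimal sketch that applies FGA+0.47*FTA+TO-ORB to the Love totals quoted above. The function names are hypothetical, and the exact output depends on which totals and which formula variant go in, so it need not reproduce the figures quoted above. The point is only how strongly subtracting offensive rebounds shrinks the denominator.

```python
def berri_possessions(fga, fta, to, orb, ft_factor=0.47):
    """Possessions by the team-level formula, applied to one player."""
    return fga + ft_factor * fta + to - orb


def points_per_100(points, possessions):
    """Points per 100 possessions."""
    return 100.0 * points / possessions


# Totals for Love as quoted in the post.
love = {"fga": 777, "fta": 351, "to": 120, "orb": 247}
love_points = 1111

poss_with_orb = berri_possessions(**love)
poss_without_orb = berri_possessions(**dict(love, orb=0))

rating_with_orb = points_per_100(love_points, poss_with_orb)
rating_without_orb = points_per_100(love_points, poss_without_orb)

print(f"possessions with ORB credit:    {poss_with_orb:.1f}")
print(f"possessions without ORB credit: {poss_without_orb:.1f}")
print(f"rating with ORB credit:    {rating_with_orb:.1f}")
print(f"rating without ORB credit: {rating_without_orb:.1f}")
```

Every offensive rebound removes a full possession from the player's own denominator, whether or not he was the one who then used the extra possession.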

As you can see, the problem is not only defense; the problem is that the formulas Berri is using can't be applied to players. They will give you absurd numbers for guys who are getting offensive rebounds. Well, if you add those numbers up for each player on each team, it will for sure give you exactly the team's overall number, but that doesn't mean that the distribution among players is correct.
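The additivity point can be shown with a toy roster. This is a minimal sketch with three invented players (all totals are made up for the example): because the formula is linear, the player possessions add up exactly to the team figure, yet the split between players is heavily skewed toward the offensive rebounder.

```python
def possessions(fga, fta, to, orb, ft_factor=0.47):
    """FGA + 0.47*FTA + TO - ORB, the same formula for players and teams."""
    return fga + ft_factor * fta + to - orb


# Hypothetical season totals for a three-man "team" (invented for illustration).
roster = {
    "scorer":    {"pts": 1500, "fga": 1200, "fta": 400, "to": 200, "orb": 40},
    "rebounder": {"pts": 600,  "fga": 450,  "fta": 200, "to": 90,  "orb": 300},
    "guard":     {"pts": 900,  "fga": 800,  "fta": 200, "to": 220, "orb": 30},
}

stat_keys = ("fga", "fta", "to", "orb")

# Per-player possessions and points per 100 possessions.
player_poss = {
    name: possessions(**{k: line[k] for k in stat_keys})
    for name, line in roster.items()
}
ratings = {
    name: 100.0 * roster[name]["pts"] / player_poss[name] for name in roster
}

# Team totals are just the sums of the player lines.
team = {k: sum(line[k] for line in roster.values()) for k in ("pts",) + stat_keys}
team_poss = possessions(**{k: team[k] for k in stat_keys})
team_rating = 100.0 * team["pts"] / team_poss

for name in roster:
    print(f"{name:10s} poss={player_poss[name]:7.1f} rating={ratings[name]:6.1f}")
print(f"{'team':10s} poss={team_poss:7.1f} rating={team_rating:6.1f}")
```

The sum matching the team total says nothing about whether each player's share is right; it only reflects that the formula is a linear combination of boxscore totals.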

Btw: If we use the same methodology as they used in the last GSW blog entry, we get 23.0 wins via WS/48 and 22.7 wins via my PRA as the prediction. Both are way closer to reality than WP48. Not surprising at all. Both are also more consistent from year to year, and the same goes for PER. I could calculate the average PER for the teams over time and run a regression to estimate the linear formula Win% = A*PER+B, but well, I have the feeling it is also closer to reality anyway.