floppymoose wrote:I'm glad you covered that mystic. As I went to sleep last night I knew that the predictability of WP would be brought up as another prong of the defense of WP, and I figured that FM passed that test, but it's nice to see you have run the numbers.
Before we (or anyone else) use predictability as an argument, we have to understand where it comes from.
There is only a limited number of players anyway, and from year to year the players with major minutes usually don't change teams. They stay in a similar role with a similar number of minutes. Rosenbaum & Lewin (2007) show that: you can use minutes per game adjusted for position and defense, and it will have a higher correlation to the team's success than those other boxscore metrics. The high correlation is again driven by the defensive adjustment. The rest comes from the defined role of that player.
Every player has his strengths and weaknesses; the coaches and scouts are able to determine them and will usually put players into appropriate roles. Because of injuries and slight changes in roles due to added rookies or free agents, the values will change a bit, which is why the correlation will not be 100%. But the value of 80+% is mostly explained by the fixed roles of the players. That is no different for any boxscore metric; WP falls under the same rules here as FM48 does. A player whose primary role is scoring will most likely be a scorer again next season. The numbers show it. The same goes for rebounders with limited offensive roles. Thus WP48 will still be rather constant from year to year.
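To make clear what "year-to-year correlation" means here, a minimal sketch: it is just the Pearson correlation between the same players' ratings in two consecutive seasons. The ratings below are made-up numbers for five hypothetical players in stable roles, not real WP48 or FM48 values, and the function name is my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-48 ratings (illustration only, NOT real data).
# Each player keeps roughly the same role, so the rating barely moves.
season_1 = [0.050, 0.100, 0.150, 0.200, 0.250]
season_2 = [0.060, 0.090, 0.160, 0.190, 0.240]

r = pearson(season_1, season_2)
print(f"year-to-year correlation: {r:.3f}")
```

As long as the roles stay fixed, any metric that mostly reflects the role will produce a high value here, whatever weights it uses.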
Btw., the 2 or 3 year average values for APM are also pretty constant over the career of a player. The sample size really is a bigger issue here than with boxscore numbers, but even 1 yr values are useful, especially when they are combined with boxscore numbers the way Dan Rosenbaum does it. Completely dismissing those values with the comment that they are "inconsistent" is stupid, because that comment doesn't take into account why the boxscore numbers seem to be more consistent from year to year.
If we understand all that, we know that predictability isn't a characteristic of the model, but of the whole setting of the NBA. Every boxscore based model will show a similar trend in its year-to-year correlation. If there are more role changes for players within the league, the year-to-year correlation will be lower. If there are fewer, it will be higher. We could also add something like an aging curve for the players and could probably improve the correlation a bit. But overall the year-to-year correlation doesn't tell us anything about the model's ability to evaluate players, and the latter is what coaches and scouts could actually use. To evaluate the performance of a player we can use those other metrics and set them up according to our preferences; with the right team adjustments the metric will still have a high correlation to winning, and the predictability will be similar to that of all other metrics.
FM48 shows that: we set our preference on scoring, and we know that scoring per 100 possessions plus defense will give us a good estimate of the team's total wins. Using a linear model here is rather simple; in the APBR community the Pythagorean expectation is usually used, with an exponent between 14 and 16.5. I fitted the exponent on a dataset from 1980 to 2005 and came up with a value below 14 as the best approximation.
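For anyone who hasn't seen it, here is a rough sketch of the Pythagorean expectation in its usual form, win% = PF^x / (PF^x + PA^x). The function names, the default exponent of 14.0 and the example ratings are my own assumptions for illustration; the exponent is just a value from the commonly quoted range, not the fitted value mentioned above.

```python
def pythagorean_win_pct(points_for, points_against, exponent=14.0):
    """Expected win% from points scored and allowed (e.g. per 100 possessions).

    Algebraically the same as PF**x / (PF**x + PA**x), written via the
    ratio PA/PF to keep the intermediate numbers small.
    """
    return 1.0 / (1.0 + (points_against / points_for) ** exponent)


def expected_wins(points_for, points_against, games=82, exponent=14.0):
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(points_for, points_against, exponent)


# Hypothetical team: scores 108 and allows 104 points per 100 possessions.
for x in (14.0, 16.5):
    wins = expected_wins(108.0, 104.0, exponent=x)
    print(f"exponent {x}: about {wins:.1f} expected wins")
```

A team that scores exactly as much as it allows comes out at 50% regardless of the exponent; the exponent only controls how fast the win% moves away from 50% as the margin grows.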
WP isn't doing anything else: it uses scoring per possession and opponents' scoring per possession to determine the win%. Nothing wrong with that. But it uses two formulas to determine the possessions which are true for teams, but NOT for individual players. In fact the formula for PA isn't even defined in all cases. I'm not quite sure why NONE of the reviewers ever saw that. Berri is using a formula which isn't defined for the case in which somebody gets an offensive rebound off a teammate's miss and converts it right away via putback or tip-in. The whole underlying model is flawed. If Berri had tried to publish such a thing in a natural science journal, it would have been rejected.
And here is the proof:
The formula PA = FGA+TO+0.47*FTA-ORB describes the number of possessions. Points is the positive credit someone gets for putting the ball into the basket. A close shot like a putback or tip-in gives a credit of 2 points. If a player attempts a shot, it is counted as an FGA. If a player grabs a missed shot on offense, it is counted as an ORB.
To calculate the scoring efficiency (PPP), we have to do:
PPP = Points/PA
Assume a player A is in the game for just one offensive possession. One of his teammates takes a shot, but misses. Player A grabs the offensive rebound and converts it right away for two points. After that sequence he is taken out of the game. His stats are 2 pts, 1 FGA and 1 ORB; everything else is 0. That means:
PPP = Points/PA
PPP = Points/(FGA+0.47*FTA+TO-ORB)
PPP = 2/(1+0+0-1)
PPP = 2/0
We are using ordinary arithmetic, in which 2/0 is not defined. The model is invalid. And that statement holds, because we only need one example in which the model does not work.
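The same calculation as a few lines of Python, for anyone who wants to check it. This is only a sketch of the PA and PPP formulas exactly as written above, applied to a single player's line; the function names are mine and this is not Berri's actual code.

```python
def possessions_used(fga, fta, to, orb):
    """PA = FGA + TO + 0.47*FTA - ORB, applied to one player's boxscore line."""
    return fga + to + 0.47 * fta - orb


def points_per_possession(points, fga, fta, to, orb):
    """PPP = Points / PA. Raises ZeroDivisionError when PA is 0."""
    return points / possessions_used(fga, fta, to, orb)


# Player A from the example: 2 pts, 1 FGA, 1 ORB, everything else 0.
putback = dict(points=2, fga=1, fta=0, to=0, orb=1)

try:
    points_per_possession(**putback)
except ZeroDivisionError:
    print("PPP is undefined: player A is credited with 0 possessions")
```

Python refuses to divide by zero here, which is exactly the point: the formula charges player A with zero possessions for his two points.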
And that is the real problem. Every time a player gets the offensive rebound off a teammate's missed shot, he is charged with one possession fewer for his scoring. That is the reason we find so many offensive rebounders at the top of WP48. They are scoring points in Berri's model without having touched the ball. All those other things are minor, and the defense issue is something every boxscore based model will face.
Well, I should probably just write a paper or a letter about that and publish it.