DavidStern wrote:But how it works exactly? For example if rookie didn't improve enough then his value is lowered?
When you have a prior from last season and you apply the aging curve, you basically add or subtract an age-dependent value from that prior. So, a 35 yr old +5 player may be lowered to a +4 player instead, while a +2 22 yr old might get a raise to +3 (well, those numbers aren't correct, just an illustration to get the drift here). After that the prior is adjusted toward the mean anyway, and the boxscore stats are applied to it as well, before running the regression on the current season data.
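To make the aging step concrete, here is a minimal sketch. Everything in it is hypothetical: the step-shaped `age_adjustment` curve, the `shrink` factor and the `league_mean` are made up so that the output matches the illustrative numbers above, and the boxscore step is left out entirely. It is not the actual curve or weighting used in RPM.

```python
def age_adjustment(age):
    """Hypothetical step curve; a real aging curve is smooth and fitted to data."""
    if age <= 24:
        return 1.0   # young players are expected to improve
    if age >= 33:
        return -1.0  # older players are expected to decline
    return 0.0


def build_prior(last_season_rating, age, league_mean=0.0, shrink=0.0):
    """Age last season's rating, then pull it toward the league mean.

    shrink=0.0 leaves the aged value unchanged, shrink=1.0 returns the mean.
    Both parameters are assumptions for illustration only.
    """
    aged = last_season_rating + age_adjustment(age)
    return league_mean + (1.0 - shrink) * (aged - league_mean)


print(build_prior(5.0, 35))  # the 35 yr old +5 player -> 4.0
print(build_prior(2.0, 22))  # the 22 yr old +2 player -> 3.0
```

With a non-zero `shrink`, both priors move closer to `league_mean` before the current season data is used.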
DavidStern wrote:Does it also give better explanation of previous games played this season?
No, it can't. There is a bias introduced by the chosen lambda.
The bias for the coefficients would be:
bias(β_ridge) = E[β_ridge] - β = -λUβ
where U = (X^(T)X + λI)^(-1)
and β_ridge = (X^(T)X + λI)^(-1)X^(T)y
β_ridge is the estimated coefficient vector (the result)
β is the true (unknown) coefficient vector
X is the design matrix
X^T is the transpose design matrix
y is the response vector
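A small numerical check of that bias formula, as a sketch only. It assumes that the β in -λUβ is the true coefficient vector and that the ridge fit is its estimate. The design matrix, the true coefficients and λ are made up, and the helper functions are plain Python written just for this two-predictor example. The response is noise-free, so the ridge fit equals its expected value and the difference to the true coefficients is exactly the bias.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def ridge_bias_check(X, beta_true, lam):
    """Return (observed bias, -lam * U * beta) for a two-column design matrix."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    # U = (X^T X + lam * I)^(-1)
    A = [[XtX[i][j] + (lam if i == j else 0.0) for j in range(2)] for i in range(2)]
    U = inverse_2x2(A)
    beta_col = [[b] for b in beta_true]
    # Noise-free response y = X * beta, so the fit below equals E[beta_ridge].
    y = matmul(X, beta_col)
    beta_ridge = matmul(U, matmul(Xt, y))
    observed = [beta_ridge[i][0] - beta_true[i] for i in range(2)]
    formula = [-lam * row[0] for row in matmul(U, beta_col)]
    return observed, formula


# Made-up design matrix: rows are stints, columns are two players.
X = [[1.0, -1.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
observed, formula = ridge_bias_check(X, beta_true=[3.0, -1.0], lam=2.0)
print(observed)
print(formula)
```

Both printed vectors agree, and both shrink toward zero as λ goes to zero, which is why a larger lambda explains the in-sample games worse.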
DavidStern wrote:And what about offense/defense splits? I mean does it explain (and predict) offensive/defensive performance better than RAPM without box score?
Explain? No. Predict? Yes. Even so, I'm not convinced that the defensive part actually predicts better; at least my test suggests that it is not better than pure RAPM-informed RAPM values. I personally wouldn't use the boxscore data if I wanted to split offense and defense. I tried a split based only on the RAPM data in my merged ratings, but the results weren't as good as I hoped and I dropped the matter. From my perspective the overall number is the really meaningful one for predicting the scoring margin (or for getting quantified information about a player's impact).

In an out-of-sample test my SPM (pure statistical +/- based on boxscore data) actually predicted the results better than RAPM-informed RAPM values, while it couldn't match their predictive power for offense and defense separately (both were off). Engelmann ran the same test with his best RAPM version at that time and my older SPM version, and came to the same conclusion.

So I have no idea why he went with the boxscore prior for the defensive part, but I guess it showed improved stability from yr to yr, which is something we can expect. Given that the yr-to-yr consistency of playing time is pretty high for the majority of players (coaches like using players in a similar fashion), the defensive value just gets shifted from a better defender without the boxscore numbers to the player who has them (like Boozer getting a boost here while Noah/Deng went down). The negative effect from players changing teams is likely smaller than the positive effect from the increased stability.
So, from my experience the overall RPM value is more reliable than the splits, but those splits are nonetheless better than any other publicly available attempt.