eminence wrote:
ShaqAttac wrote: doesnt this mean colineary is overrating all of chicagos top players
Not what collinearity does. It makes us less sure of our results, but it can't just push the whole variable group up (or down) in the overall model.
In that particular case, I feel pretty confident saying it's likely Armstrong being brought along for the ride, not MJ/Pippen* - their numbers would be somewhat depressed by BJ's impressive result.
*nobody necessarily needs to be 'brought along' either, models can be reasonably accurate (in terms of telling us which variables are having what impact) in spite of collinearity
Collinearity doesn't hurt the model's overall accuracy, though it does make overfitting more likely.
The idea that collinearity is bringing Armstrong up significantly (while slightly depressing MJ/Pippen) gains more credence when we look at the non-Bulls BJ sample.
Collinearity becomes an issue when players' with/without samples correlate strongly. BJ missed literally *1* Bulls game from 1990 to 1995. Pippen missed 14 games from 1990–1995, and Jordan missed 6 games from 1990–1993 (though he did miss more in 94/95). So there aren't many missed games with which to isolate BJ's value.
Then we get to BJ's 1992–96 sample. Suddenly we go from having at most 1 missed game for BJ to full seasons of missed games for BJ (he missed the full 1996 season for the Bulls, the full 1995 season for the Warriors, etc.). And BJ drops from a constant 90th percentile or better, often flirting with the 99th percentile, to a perennial 50th percentile player. Is BJ suddenly getting worse by trading his 23-year-old season for his 28-year-old season? Unlikely. Even more unlikely when we consider his box stats are all better as a 28-year-old than as a 23-year-old.
Instead it seems like having a larger off-sample makes it clear that BJ wasn't the main cause of the Bulls' success -- this would support the idea that collinearity was boosting BJ's numbers in the early 90s and pushing the other Bulls' numbers below their 'true' value.
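For anyone who wants to see the mechanism, here is a rough toy simulation. It is not the actual WOWYR model, just a plain least-squares fit on made-up data. Everything in it is an assumption for illustration: a hypothetical star worth +8 and a role player worth +2, a game-to-game noise of 12 points, 400 games, and the function name `credit_split_spread`. The only thing that changes between the two runs is how many games the role player sits while the star plays.

```python
import numpy as np

def credit_split_spread(n_split, n_games=400, n_both_out=40,
                        trials=500, sigma=12.0, seed=0):
    """Refit OLS on many simulated seasons and report how much the
    estimates wobble. All numbers here are hypothetical."""
    rng = np.random.default_rng(seed)
    star = np.ones(n_games)
    role = np.ones(n_games)
    # Games where both players are out.
    star[:n_both_out] = 0
    role[:n_both_out] = 0
    # Games where only the role player sits: the only games that
    # let the model tell the two players apart.
    role[n_both_out:n_both_out + n_split] = 0
    X = np.column_stack([np.ones(n_games), star, role])
    true_beta = np.array([0.0, 8.0, 2.0])  # assumed "true" values
    coefs = np.empty((trials, 2))
    for t in range(trials):
        y = X @ true_beta + rng.normal(0.0, sigma, n_games)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        coefs[t] = beta[1:]
    return {
        "star_sd": coefs[:, 0].std(),
        "role_sd": coefs[:, 1].std(),
        "combined_sd": coefs.sum(axis=1).std(),
        "error_corr": np.corrcoef(coefs[:, 0], coefs[:, 1])[0, 1],
    }

if __name__ == "__main__":
    print("1 separating game:  ", credit_split_spread(n_split=1))
    print("80 separating games:", credit_split_spread(n_split=80))
```

In this toy setup the combined value of the pair stays well pinned down either way, which lines up with the point that collinearity can't push the whole group up. With a single separating game the individual estimates swing wildly, and the errors are strongly negatively correlated: when the role player comes out too high, the star comes out too low by about the same amount. Give the role player a real off-sample and his estimate tightens up.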
...
A similar argument could be made for Grant, though to a lesser extent. Grant looks like a strong positive over his full prime, and it makes sense that two of the Top 20 teams ever (91 and 92 Bulls) would require strong supporting players outside their GOAT-tier star. But exactly how positive was he in the late 80s and early 90s?
The issue with WOWY data is that we typically don’t have a large off-sample, which can lead to massive uncertainty bands. One thing you’ll notice in a lot of these ‘career curves’: many stars seem to look better in their first or second sample (when we have a sufficient off-sample for them) than they do in their fourth or fifth sample (when we have a much smaller off-sample). Shaq, Garnett, AD, Sam Jones, Wilt, Nate Thurmond, Kareem, Larry Bird, Jordan, Stockton, Rodman, and Drexler all have this shape. Are we to believe that all these players are getting worse after their first year? Doubtful. Instead, I think this is partially explained by a declining off-sample size making it harder to accurately pin down their value.
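To put rough numbers on how fast those bands blow up, here is a back-of-the-envelope sketch using the textbook standard error for a difference between two averages. It assumes game margins are independent with the same spread, which real data won't satisfy exactly. The 12-point game-to-game standard deviation, the 400-game on-sample, and the function name `wowy_standard_error` are all made up for illustration; the off-sample sizes are just small, medium, and full-season examples.

```python
import math

def wowy_standard_error(n_on, n_off, sigma=12.0):
    """Standard error of (average margin with player) minus
    (average margin without player), assuming independent games
    that share one standard deviation `sigma` (an assumed value)."""
    return sigma * math.sqrt(1.0 / n_on + 1.0 / n_off)

if __name__ == "__main__":
    # Hypothetical 400-game on-sample; off-samples of 1, 14, and 82 games.
    for n_off in (1, 14, 82):
        se = wowy_standard_error(400, n_off)
        print(f"off-sample of {n_off:>2} games: "
              f"+/- {1.96 * se:.1f} points (approx. 95% band)")
```

The on-sample barely matters once it is large. The 1/n_off term dominates, so a handful of missed games leaves a band far wider than any plausible gap between players.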
Other players seem to grade worse once we get a larger off-sample. Grant is one of those players. Grant’s first three samples hover around the ~80th percentile, when we have a large off-sample for him. He then jumps up when we trade 1986 (off sample) for 1991 (all-time-level team). Makes sense. Now some of this is likely Grant himself improving. But some of this may be boosted by collinearity with the other improving Bulls members, and a lack of an off-sample to effectively single out Grant’s contributions. Grant looks great from 1987–91 to 1990–94, when playing with a great cast and without much of an off-sample.
Then we get to 1991–1995, when Grant was traded and we gain full-season length off samples for Grant… and suddenly Jordan looks significantly better than Grant (Jordan goes from 17–20% better in 89-93/90–94 to 57–63% better in 91–95/92–96). Is Grant suddenly getting worse at the age of 29, in the middle of a career in which he played until age 38? I’d say it’s more likely that the larger off-sample helps the model limit collinearity and lets us see that Jordan (and to a lesser extent Pippen) are significantly better than Grant, although Grant is still absolutely a positive contributor.