First, I should point out that these responses are primarily to what was written. I didn't click on any of the data spreadsheets. That said...
*Really like the approach of offering a description of the different types of "advanced stats", how they're calculated, and some grounding in what they actually measure. That's so important to having a good discussion. Many tend to use the stats like a volume game, so if one stat disagrees with another they just cancel out in their minds (or worse, as you point out, many just pick the one that tells the story they like). So it's really cool that you started off by defining the tools and helping set the boundaries for the discussion.
*Like Doc MJ, I'm not sure about your decision to go kind of "all or nothing" with the different +/- approaches, as opposed to using elements of each to help address weaknesses where possible. I can see why you made the decision you did, especially if your approach was to bring this down to a strictly mathematical decision. When you do that, you have to make choices on which metrics to include and which to exclude. However, I tend to think a more complete story could be told by using some of the approaches you discard (such as multi-year APM and single-year prior-informed RAPM, for example) to help tease out your point.
For example (these are very general examples off the top of my head, so I hope this doesn't end up being straw-man-ish): single-year on/off +/- data is pure, but can be heavily teammate/situation driven. Multi-year RAPM may be distorted by trying to assign someone an average value over multiple years when their value may have changed several times over a given stretch. But prior-informed, yearly RAPM scores may help to give some added yearly granularity that is more player-focused than raw on/off +/-, helping characterize the multi-year RAPM results and making them easier to interpret. Or, in a different situation, maybe there is some concern that RAPM's tendency to minimize outlier impact might be falsely affecting a player's score. But if there is a multi-year APM study that covers that same period, we can get the perspective of a stat without that "outlier dampening", which might help us get a clearer picture.
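For concreteness on how these variants relate mechanically, here's a minimal toy sketch (all the stint data and prior values below are invented, not from any real dataset): plain RAPM and prior-informed RAPM are the same ridge regression over stint data; the only difference is whether the estimates shrink toward zero or toward a prior such as last year's ratings or a box-score estimate.

```python
import numpy as np

# Toy stint data: each row is a stint, each column a player
# (+1 = on court for the "home" side, -1 = on court for the "away" side).
# y is the point margin per 100 possessions for that stint. All invented.
X = np.array([
    [ 1,  1, -1, -1],
    [ 1, -1,  1, -1],
    [-1,  1,  1, -1],
    [ 1,  1, -1,  1],
], dtype=float)
y = np.array([8.0, 2.0, -3.0, 6.0])

def rapm(X, y, lam, prior=None):
    """Ridge-regressed APM. With prior=None this shrinks estimates toward 0
    (plain RAPM); with a prior vector it shrinks toward that prior instead
    ("prior-informed" RAPM)."""
    n = X.shape[1]
    if prior is None:
        prior = np.zeros(n)
    resid = y - X @ prior
    return prior + np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ resid)

plain = rapm(X, y, lam=10.0)                 # shrunk toward zero
informed = rapm(X, y, lam=10.0,
                prior=np.array([3.0, 1.0, 0.0, -2.0]))  # shrunk toward prior
```

The only design choice here is the shrinkage target; everything else (the stint matrix, the penalty strength `lam`) is shared, which is why the two approaches can disagree most for players whose priors differ a lot from zero.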
*I'm not sure about your handling of older players like Oscar or Russell. Again, I can see the difficulty of producing a monolithic, statistically-driven list for the older players, because much of the granular data we have for the current era just didn't exist back then. And the game is so different as well. It's very difficult to do. But again, when building my own mental framework, I don't have to be quite as rigorous and am freer to take whatever information I can find (whether it fits the statistical models I was otherwise using or not) and put it through my own mental filter. This of course opens us up even further to subjectivity and narrative, but in cases like the "old folks" it almost feels like there's no other way to do it.
In the case of Oscar, for example, you point out some of the stylistic elements of his game and give a reasonable thumbnail scouting report of the way he played and how it might translate. But the thing is, his approach was optimized for the time that he played, and there's no really great way to say quantitatively, with any certainty, how his game might have evolved if he came up in a different era. I'm fine with acknowledging that, and in cases like Mikan where there are huge physical/player-availability issues I can see giving bigger "penalties" for this. But in the case of someone like Oscar, who (using your examples) appeared to have floor generalship abilities similar to Paul or Nash but with a physical body type closer to Pierce's...I don't see anything to disqualify him from evolving his game to fit the more modern era. We just don't know. But (to me, more germanely) you also don't seem to attempt any kind of "impact" approach for Oscar. We don't have +/- data from the 60s, obviously, but the team-based performances of Oscar's offenses measure out as some of the GOATs, and Oscar's individual WOWY scores also measure out at GOAT-like levels. While neither of those metrics might have a slot among the metrics that you used, to me they are the best impact-data approaches that we have, and to me they indicate that Oscar's in-era impact, particularly on offense, very well may have outstripped even Nash's. So I wouldn't feel comfortable at all downgrading him out of contention for a top-10-ish ranking based on what I read of your arguments.
*I appreciated your argument for Kobe over Paul in 2008. It's an example of where single-year RAPM was a benefit. In 2008 I tended to feel like Kobe was a better player than Paul, but I couldn't find the evidence in the data to fully support it like I'd like. When we did the RPoY project, the RAPM data wasn't available yet and the arguments for both sides were so good that I ended up being torn on who was better. But the RAPM data, when it came out, was one of (the?) first dataset(s) I saw that clearly showed separation in Kobe's favor, and it was an impact-based measure which ran orthogonal to all of the boxscore-based approaches that favored Paul.
*I'll admit that I only barely skimmed the Kobe/Durant section, but I do agree with your bottom line. I've got my own marathon post floating around from Durant's peak season where I argue that even Durant at his best was never as good as the best we saw from Kobe.
*I'm iffy on your handling of playoff data, for similar reasons to how you handled the "old folks" data. Due to the lack of regressed +/- data, and the small samples of individual-season raw +/- data, I again understand why your quantitative approach might not include much "impact" data. However, as you actually allude to later on in an argument for Kobe's defense, we have playoff +/- data for almost the past 20 years, which over the years adds up to pretty significant levels of on/off data for superstar players. Thus, your playoff performance tab, which (I believe I read you say) focuses almost entirely on the change in boxscore production from regular season to playoffs...I'm just not sure how effective that approach is. As you point out in the rest of your post, the boxscores just aren't enough. While lacking any playoff +/- data at all from before 1997 is problematic in handling older players, since Kobe is the focus of your post I think it'd be fair to include some of the playoff +/- data for him and his contemporaries as a foil to the boxscore comparison, to tell a fuller story.
*I don't think your modifications of KG's score were justified. Just in broad strokes, you base them heavily on your assumptions that a) Troy Hudson couldn't possibly be as bad a defender as the multi-year RAPM study suggests, b) the multi-year RAPM over-emphasizes KG's Boston defense in the overall score, and c) KG's score was overly inflated because of Thibodeau's defense. Addressing those individually:
a) Troy Hudson's defense. Paraphrased, your contention is that the 02 - 11 DRAPM dataset characterizes Hudson as a -2.6, the worst defender of the era, and that this is a specific type of error based on Garnett getting too much defensive credit carrying over from Boston. You support that by saying Hudson couldn't have been that bad a defender, and scaling his defensive score back from -2.6 to either -1.6 (where Deron and Terry measure out) or even to -1 (where Telfair, Smush, and Calderon measure out). And if your assumption is true in that specific way, i.e. that Hudson only measures out that badly in the 10-year RAPM dataset because he was downgraded due to Garnett getting a boost from his Boston time, then that justifies adjusting Garnett's DRAPM score from that 10-year dataset by 0.3 points or so. Is that a reasonable synopsis?
If so, some points for you to consider. First...yes, it's very possible that Hudson really was that bad a defender (more on this below). And second, and more germane to your statistical argument, the available studies of the time (and therefore from BEFORE Garnett went to Boston) ALSO concluded that Hudson was very likely the worst defender in the NBA during the 2002 - 2005 era.
So, Hudson's defensive scouting report thumbnail. Hudson was physically very limited, short and extremely slight (listed at 6-1, 170 pounds). He had a narrow frame that didn't appear to hold much muscle, and that translated to a player that just couldn't/didn't fight through screens ever. His lateral movement was ok when healthy, but his length was crap and his defensive instincts sucked so he was very often out of position and unable to contest shots. He could be posted up, he could be driven upon and finished over/through...there just wasn't very much that he was good at defensively, even when healthy. Then, in 2003 he suffered a terrible ankle injury that drastically limited his mobility even out through 2005...which made his only possible redeeming quality (that he was decently quick) into another weakness.
Now, translate that to the available studies of the time, which would in NO WAY be affected by Garnett's time in Boston. Using Doc MJ's scaled RAPM spreadsheet, from 2002 - 2005 these were Hudson's scaled DRAPM splits:
2002: -4.93, 8th worst in NBA
2003: -2.38, 47th worst in NBA
2004: -4.43, 9th worst in NBA
2005: -5.20, T3rd worst in NBA
A few notes on the preceding. First, if you look at 02, 04, and 05, no player in the NBA was worse than Hudson in defensive RAPM across all three of those seasons. Only one (Michael Redd) was worse than him in even two of them, and if you go back to your list of worst defenders according to the 02 - 11 DRAPM list, you'll see that Redd ties for 2nd-worst defender, just ahead of Hudson. Another important note: the terrible 2002 score came before Hudson even got to Minnesota, so there's no way it's a KG-dependent effect. Actually, the only year on that list that Hudson wasn't among the league's very worst defenders was the only year that he was both healthy and playing with Garnett. Considering that Garnett specialized in PnR defense, and Hudson was terrible at getting through picks, that would actually be logical. But there's another option as well...that, as you pointed out before, RAPM actually pulls outliers back towards the mean. So it's possible that Hudson was actually an even worse defender than RAPM shows, but that he was regressed back towards decent.
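That "outlier dampening" point is easy to see numerically. Here's a one-player toy sketch (all numbers made up): in a simple on/off design, the ridge estimate is the least-squares estimate divided by a factor that grows with the penalty, so a true extreme value gets pulled toward the league mean.

```python
import numpy as np

# One-player toy model: y_i = beta * x_i + noise, where x_i is +1/-1 for
# on/off stints and beta is the player's true impact in points per 100.
# With a single regressor, ridge gives beta_hat = sum(x*y) / (sum(x*x) + lam),
# so larger lam pulls an extreme beta toward 0 (the league-average baseline).
rng = np.random.default_rng(0)
true_beta = -6.0  # a hypothetical, genuinely extreme negative defender
x = rng.choice([-1.0, 1.0], size=200)
y = true_beta * x + rng.normal(0.0, 5.0, size=200)

def ridge_1d(x, y, lam):
    """Closed-form ridge estimate for a single-coefficient model."""
    return (x @ y) / (x @ x + lam)

est_light = ridge_1d(x, y, lam=1.0)    # close to the true extreme value
est_heavy = ridge_1d(x, y, lam=500.0)  # pulled well back toward zero
```

With heavy regularization the estimate understates the true outlier, which is exactly the mechanism by which a genuinely league-worst defender could measure out as merely bad in RAPM.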
To that end, I point out Rosenbaum's APM study from 2005:
http://www.82games.com/rosenbaum3.htm . In that study, Rosenbaum goes position-by-position and looks at the top-10 and bottom-10 defensive APM scores from 2005. It also lists the 2003 and 2004 defensive APM scores for each player. The first note actually deals with the KG/Duncan/Wallace triad you mention later. According to Rosenbaum's pure APM, Garnett finished with a better defensive APM than Duncan in both 03 and 04, and his 7.5 mark in 03 was better than any single-year mark that either Duncan or Wallace put up in that 3-year period (this is counter to what single-year RAPM says for those years, so just food for thought). But more germane to Hudson, his defensive APM scores in 03, 04 and 05 were INSANELY bad. Here is Rosenbaum's summary on Hudson as a defender in this period:
"Troy Hudson probably gets the award for being the worst defender in the league. He is dead last among point guards in both the statistical and adjusted plus/minus ratings and his adjusted plus/minus ratings are consistently horrible. He is playing a game on the defensive end that is not remotely like anyone else’s in the league."

So, bringing it back to the point: yes, Hudson scouted out as very possibly the worst defender in the league. And looking at repeated single-year measurements of either APM or RAPM from the time (i.e. before KG went to Boston) argues strenuously that yes, Hudson really was THAT BAD on defense. Most importantly, it invalidates your premise that it was some sort of Boston-based Garnett effect that caused Hudson to show up as horrible in the 10-year dataset. Hudson shows up THAT BAD defensively in every +/- based defensive study ever conceived, completely independent of Garnett's time in Boston.
b) That the 10-year RAPM study over-emphasizes Garnett's defense due to his time in Boston. You cite that the raw and regressed defensive +/- scores from 02 - 07 favor Duncan and Wallace a bit over Garnett, but that from 08 - 11 Garnett's scores are favored. I'm not going to spend too much time on the numbers here, so if you have a good rebuttal it could spark a discussion that would make me go back through them. But I would argue that if you're right that Garnett's defensive 10-year RAPM may be over-estimated because of his time in Boston, then his offensive RAPM is likely under-estimated for the exact same reason. In 2003, Garnett finished 2nd in the NBA in offensive RAPM, and in 2004 he finished 1st. His single-year offensive RAPM in that time period was actually larger than his defensive RAPM. But in the 10-year study a lot of his value shows up as defensive. So I would say that it's not justified for you to lower Garnett's overall RAPM score in your metrics by correcting only for defense; you'd have to put as much effort into correcting his offense upwards.
c) You contend that Thibs' coaching schemes should count against Garnett's defensive RAPM scores. A few rebuttals. First, as you pointed out, while Thibs' team defensive rankings in Chicago were similar to Boston's, there was no defensive RAPM footprint on those Bulls even remotely similar to Garnett's in Boston, despite their having a (by your own account justified) DPoY in Noah. And if you apply the same "Thibodeau correction" to those Bulls teams as you do to Garnett, then team-leader Deng's defensive RAPM scores become pedestrian and DPoY Noah's defensive RAPM scores fall completely into the noise. Without being overly rigorous, your approach seems problematic.
But going further, there is counter-evidence to your contention (that Thibs' defensive schemes inflated Garnett's DRAPM in a way that wasn't reflected in reality) that you don't consider and/or weigh much. First, the fact that Thibs was no longer coaching in Boston in 2011 or 2012 deserves more than a cursory blow-by sentence. Yes, the Celtics still used many of his schemes. But coaching is about much more than schemes, and players require active adjustments. If all that was needed was scheme, then the rest of the NBA should catch up pretty quickly, right? I mean, who wouldn't use a defensive scheme that was almost cheatingly ahead of everyone else? So then why, when Thibs left Chicago and they brought in a new coach, didn't the new coach utilize Thibs' defensive schemes and keep the team near the top of the league defensively? Noah is still there, the players remember the schemes, and by now (8 years after the 08 Celtics run) surely the rest of the league would have absorbed enough of Thibs' landscape-altering schemes that the Bulls should have been able to remain a top defense after he left. But they didn't. So for the Celtics to be a top defense for 2 full years after Thibs was gone argues that something else was going on there.
Another line of reasoning to consider is that even while Thibs was coaching in Boston, the defense flat didn't work without Garnett. From an old post of mine I found from 2011:
drza wrote:The thing is, because of Garnett's injuries in the past 4 years we can test exactly how the Celtics have played with and without him with a huge sample size each way. We also have a huge sample size with the starting unit without Perkins. I spent some time looking through 82games.com's 5-man units and this is what it told me about how the Rondo/Allen/Pierce units have played with every combination of big man the Celtics have had:
Garnett and Perkins: 112.4 points/100 possessions, 97.3 points allowed/100 poss
Garnett w/o Perkins: 111.9 points/100 possessions, 99.3 points allowed/100 poss
Perkins w/o Garnett: 109.5 points/100 possessions, 112.1 points allowed/100 poss
Now, let me be clear. Since Garnett arrived in 2007, the Celtics' main starting group (Rondo, Ray Allen, Pierce, and Perkins) in a Tom Thibodeau defense have given up 112.1 points/100 possessions when any other player besides Garnett was the 5th player on the floor with them. Just for clarity, the worst defense in the NBA this year gave up 112.7 points/100 possessions. And again, we're talking huge sample sizes here, from well over 200 games that Garnett has played in and 60 that he hasn't over the past 4 years. Conversely, with Garnett in the lineup (with or without Perkins) the starting unit has given up 13 - 15 fewer points per 100 possessions."
Yes, Thibs is a defensive genius and deserves props for his teams' defensive performance. But that Celtics defense was tied directly to the presence or absence of Garnett. When he was on the court, the defense was elite (at times historically so). When Garnett was off the court, even when Thibs was still around, the defense fell off a cliff. Thus, what RAPM tries to measure...the correlation between a player's presence and the team's scoring margin...was still being measured as accurately as it ever is. Garnett's presence DID correlate with huge changes in the defensive scoring margin, independent of Thibs' presence.
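The 13 - 15 point claim checks out arithmetically; here's a trivial sketch using only the per-100 numbers from the quote above (the unit labels are just my shorthand):

```python
# Per-100-possession figures quoted above (82games.com 5-man unit data),
# for the Rondo/Allen/Pierce core with each big-man combination.
units = {
    "KG + Perkins":   {"scored": 112.4, "allowed": 97.3},
    "KG, no Perkins": {"scored": 111.9, "allowed": 99.3},
    "Perkins, no KG": {"scored": 109.5, "allowed": 112.1},
}

def net_rating(unit):
    """Points scored minus points allowed, per 100 possessions."""
    return round(unit["scored"] - unit["allowed"], 1)

# Defensive swing: points allowed without Garnett minus points allowed with him.
no_kg_allowed = units["Perkins, no KG"]["allowed"]
swing_full = round(no_kg_allowed - units["KG + Perkins"]["allowed"], 1)   # 14.8
swing_solo = round(no_kg_allowed - units["KG, no Perkins"]["allowed"], 1) # 12.8
```

So the two with-Garnett configurations allowed 12.8 and 14.8 fewer points per 100 than the without-Garnett version of the same core, which is where the "13 - 15" range comes from.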
Bottom line: I thought your OP had a lot of great stuff in it. Even if I don't agree with all of your approaches, it was a gargantuan effort. I do note some of the weaknesses that others have pointed out with respect to your tone and the sense that you had some of the same subjective/arbitrary weaknesses that you decried in others. Someone pointed out that you claimed PER to be designed for LeBron when he wasn't in the league yet...I've never been able to re-find the article, but I remember years and years ago reading that Hollinger actually designed PER for Jordan. That he wanted to come out with the combination of parameters that made Jordan's seasons measure out as the GOAT. Take that for what it's worth, but I'm almost positive it's true. In any case, the negatives of your OP don't outweigh the positives for me. I'm not convinced that Kobe is definitely a top-10 player, but in spite of any warts I think the post has good value for the informational content and the effort it took to create it.