An Unbiased Fan wrote: I would agree that evaluating defense is harder. The question is why RAPM is the answer. Vlade Divac is rated higher than Ben Wallace, Bogut, and Dwight by it.
97-14 DRAPM:
Divac - 2.60
Big Ben - 2.39
Bogut - 2.35
Marc Gasol - 2.09
Dwight - 1.96
To me, it's clear that Divac was just part of more defensive rotations due to the Kings' lack of size in their roster. Which illustrates why RAPM is fine for coaches to analyze their lineups, but not for individual impact. No two players have the same role, roster, team system, rotations. There is no mechanism to extract the individual from the group.
This helps me understand your issue with "individual impact" a little better. To me you're building the word up into something more than it is.
If I have a player who I can insert into a lineup and the result will be an improvement of a certain amount, "impact" is a perfectly reasonable choice of words for the effect his insertion will give me. I also tend to use the word "lift" a lot, which I don't recall if it bothers you, but really the point is that it's utterly reasonable to use some word here, and it hardly makes sense to coin some random word for it. The process of creating new word senses for more common words as we begin to talk in more granular detail in a discipline is what it is, and really the only time to object is if there's a fundamental problem with the words being used because other words would be better.
I see a pattern here where you - and many others - don't allow things to fail gracefully. If I link RAPM and impact, being fully aware that correlation is not causation, I don't expect the connection to be free from noise. I push forward with it nonetheless as a thing I expect to be imperfect, whereas you're inclined to show the counterexample to prove they are not the exact same thing, and then not use it to anywhere near the extent I do.
Which is the right way to do things? Well, my way naturally.

Obviously I don't expect you to simply accept that, and we probably won't come together on this, but I'll say a couple things:
1) Think on the concept of the "prior" being used in RAPM. Basically the R in RAPM amounts to infusing the sample set with data that will probably smooth out the fluky data. In doing so we lose some validity, but gain reliability, and in practice in many contexts that still makes for a more powerful tool.
Now, in our lingo here "non prior" actually means a prior where we assume pure neutrality. No biases at all toward any player! Add in zeroes everywhere! Why is this not the most effective way of doing things? Because true "no bias" would mean putting in the right value, and a pure neutral infusion of data deviates from what's actually right. If we can use some other method that will probably lead to something that deviates less from what's actually right, then most of the time we'll get a better result.
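To make the "zero prior vs. informed prior" idea concrete, here's a minimal sketch of ridge regression that shrinks toward a prior. Treat it as an illustration under stated assumptions, not how any published RAPM is actually computed: the function name ridge_with_prior, the penalty lam=50, the toy stint matrix, and all the numbers are made up for the example, and a real design matrix would have ten players on the floor per stint rather than random +1/0/-1 entries.

```python
import numpy as np

def ridge_with_prior(X, y, lam, prior=None):
    """Minimise ||y - X b||^2 + lam * ||b - prior||^2 over b.

    prior=None means a prior of all zeros, i.e. the "non prior" case
    where every player is pulled toward neutral.
    """
    n_players = X.shape[1]
    if prior is None:
        prior = np.zeros(n_players)
    # Closed-form solution: (X'X + lam I) b = X'y + lam * prior
    A = X.T @ X + lam * np.eye(n_players)
    rhs = X.T @ y + lam * prior
    return np.linalg.solve(A, rhs)

# Synthetic data (all values hypothetical): +1 on the floor for one side,
# -1 for the other side, 0 on the bench.
rng = np.random.default_rng(0)
n_stints, n_players = 200, 6
X = rng.choice([-1.0, 0.0, 1.0], size=(n_stints, n_players))
true_impact = np.array([3.0, 1.0, 0.0, -1.0, -2.0, 0.5])
y = X @ true_impact + rng.normal(0.0, 5.0, size=n_stints)  # noisy margins

# Same penalty, two different priors.
neutral = ridge_with_prior(X, y, lam=50.0)
informed_prior = true_impact + rng.normal(0.0, 0.5, size=n_players)
informed = ridge_with_prior(X, y, lam=50.0, prior=informed_prior)

print("error, zero prior:    ", np.linalg.norm(neutral - true_impact))
print("error, informed prior:", np.linalg.norm(informed - true_impact))
```

With lam=0 this collapses to plain least squares (no R in RAPM at all), and as lam grows the estimate gets pulled harder toward whatever prior was supplied, which is why the choice of prior matters.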
And this is what I'm doing when I'm using RAPM as a starting place for impact analysis. I'm starting with some level of confidence that the data is meaningful, and going from there. I look at plenty of other things too: orthogonal data, observations, reputations, aspects of the lineup allocation that could influence it, etc. The point is though, there's never a point where I throw the thing out simply because it disagrees with something else. To do that is to rationalize one's existing opinion, and if one's willing to do that in one place without qualm, then one may end up doing it everywhere.
So you bring up Divac with regards to Kobe Bryant, but Divac has nothing to do with the fact that year after year after year we can't find any major correlation between Kobe and his team's defensive success. You bring up Divac only as a way to essentially say "So it might just be coincidence", and that's just not good enough. Maybe there's something in there that essentially makes Kobe unlucky again and again and again by this metric, but dayum, that's astonishing if true. So astonishing in fact that it's pretty clearly not the most likely "prior" if you're following my analogy.
Now really focusing on Kobe a bit here with regards to this stat: There is indeed a specific issue with separating offense and defense with +/-, and if you really want to focus on that as the explanation, I'd be interested to see where you went with it. The reason why it's tough to take that argument so seriously though is that it very clearly leads to a "same difference" kind of thing. If the lineup focus is deflating Kobe's apparent defensive impact, it's also inflating his offensive impact accordingly, and people who scoff at the notion that Kobe's defense is overrated are no less likely to scoff at the notion that his offense is overrated. In the end, this stat makes Kobe look fantastic, just not as fantastic as some are inclined to think.
2) Everything I'm talking about here in terms of accepting and using an imperfect tool, it all is basically taken as a starting point by people who are real data analysts. The whole notion of regularization, the use of priors, the fact that regression analysis provides nothing but correlation, and the fact that we are forever attempting to bridge correlation and causation...that's what data analysis is. That's what science is with any kind of complicated data source.
So we're in this awkward situation when analytics come to basketball and we've got people having it thrust upon them not in some structured educational setting but just as it comes. People bring up objections which are GREAT philosophy of science questions, which is cool, but then it gets weird, because others in the room have the answers at the ready, and those posing the questions often don't really listen to the answer.
And with all of this, someone like me comes off as arrogant. How dare I say I have the answer here? I'm just some pseudonymous avatar, what right do I have to have "the" answer? But of course from my perspective, I'm not an avatar. I'm me. The same me who is used to being the guy in the actual brick & mortar room and giving people answers every day. I'm not considered arrogant when I do it there, but y'know, I could be. There's a huge presumption to it.
In the end it's ethos/pathos/logos stuff. In person, in the places where people normally see me, I typically don't have to work hard to convince them I know stuff. Out in cyberspace, clearly it can be much harder.