What Are Your Personal Opinions On RAPM?

sp6r=underrated · Post #81 » by **sp6r=underrated** » Sun May 31, 2015 10:58 pm

Dr Spaceman wrote:It's the best stat currently in existence.

Some people will scoff at this, but just think on this: what are we actually trying to evaluate when we look at a basketball player? Answer: How well does he help his team win? (Can be restated as: what effect does he have on the scoring margin?) The reason I hold RAPM in such high regard is because it is literally the only stat that actually attempts to answer this question. Any box score stat you can think of doesn't even try. Quite literally, RAPM is the only stat that has any validity for what people are actually looking for in a stat, even if they don't quite realize it.

There's nothing inherently wrong with using the box score, as long as you realize that it is, at best, a proxy for what you actually want to know. There can be great players who score 20+ ppg, and terrible ones who do the same. But there will never ever, by definition, be a terrible player who makes a hugely positive impact on his team.

Once I came to realize this, I became a big RAPM convert, and I live with the flaws because it's the only thing that can actually tell me what I want to know. It might miss the mark by more than the box score will, but at least the mark in this case is clear, and it's exactly what I want it to be.

So really this is the whole reliability vs. validity issue. People dislike RAPM generally because it challenges what they thought they knew. The fallacy is in thinking that you ever knew anything by looking at the box score anyway. So yeah, in small samples we have some results that are absolutely nuts. Granted. But with big samples, and enough noise correction, we zero in on exactly what we really want to know. That's beautiful, and it's something no other stat in existence can accomplish in the slightest.

Now I'll add the caveat that RAPM never thinks for me. I would never use it to rank players, or as the crux of an argument, or anything of the sort. But as you guys know, I watch a ton of film, and generally find that what I see lines up with what plus/minus data shows. Obviously Manu Ginobili isn't the best player in the league, but when he comes in the Spurs play really well, and that's what RAPM tells us. Now Kevin Durant has a lower RAPM, but a much bigger role and more minutes, so quite obviously he's the better player. I think sometimes people who use RAPM get pigeonholed into using it as the be-all-end-all, and that's not what I do at all.

Final thing: RAPM is at a crossroads right now, and it's either going to head into RPM (wrong direction) or PTPM (right direction). This has everything to do with how I feel about the box score.

It must be comforting to believe that critics of the strong RAPM hypothesis oppose it because it “challenges what they thought they knew” or because people are unable to “realize it” contains the truth that they are looking for. It must be bothersome to believe that NBA teams are run by idiots who are leaving titles on the table with their box score fetish. That is shown by the fact box score stats still impact player salaries.

I am going to quote and then summarize the critical portions of this post:

what are we actually trying to evaluate when we look at a basketball player? Answer: How well does he help his team win? . . . with big samples, and enough noise correction, we zero in on exactly what we really want to know. . . RAPM never thinks for me. I would never use it to rank players . . . as you guys know, I watch a ton of film . . . RAPM is at a crossroads right now, and it's either going to head into RPM (wrong direction) or PTPM (right direction). This has everything to do with how I feel about the box score.

The argument being made here with regards to player evaluation is as follows:

i) On/off stats with a sufficient sample size can “zero in” on exactly how much a player helps his team win.
ii) Incorporating a box score stat into on/off stats is the addition of arsenic into a cake.
iii) One should still watch games to determine player value while using on/off stats.

The distinction between the box score and usage of game footage collapses upon careful examination.

The box score is best understood as the first attempt to record events that occur on the court. The original box score was limited to such matters as points, rebounds, assists, etc. Until the seventies it did not contain many pieces of information that we take for granted today, such as:

Offensive Rebounds
Defensive Rebounds
Turnovers
Steals
Blocks

Box score stats are counting stats.

The video tracking stats are an attempt to record additional information that was not included in the current box score. The video tracking stats record such information as player shooting percentages on various spots on the court or their effectiveness at contesting shots. Video tracking stats are counting stats.

It is not currently recognized that they belong to the same category of stats as the traditional box score, but eventually it will be.

The only important difference, then, between video tracking stats and traditional box score stats is that the new video tracking stats are generally automated while the traditional box score stats are recorded by human beings.

To incorporate the video tracking stats in your analysis but to discard the traditional box score means either (i) the original box score categories have no value but the video tracking stats categories do or (ii) the human error is so substantial that it cannot be trusted. The first argument strikes me as bizarre. If you actually hold that view I would welcome an explanation. The second argument is more tenable but still weak.

The NBA during its early days was a minor league. The professionalism of the traditional box score keepers could be questioned. That really isn’t the case anymore due to greater scrutiny from media and millions of fans around the globe. While occasionally there will be mistakes in the box score, there are no regular, extreme errors in recent years that justify not trusting the information in the box score.

When you utilize game footage in the evaluation of players it is likely that you are watching the game to make records of actions taken by that player on the court to illuminate what it is they do and whether it has value. If you are more ambitious and have the time you may begin keeping detailed records of what occurs on the court. ElGee did this as part of his Opportunities Created stat. I have attached a webarchive link to an old Opportunities Created post.

http://web.archive.org/web/201111262117 ... ted-value/

If you notice, while there is a difference in what is being recorded, it is still the counting of events that occur on the court. As an image it looks like the traditional box score, which makes sense because it belongs to the same category.

The scouting of players, whether in person or through footage, is no different at its core than the traditional box score. When you utilize this information you are not doing something different than the box score but rather you are attempting to count different pieces of information. Thus your hostility to RPM cannot be squared with your view that you would never let RAPM do your thinking for you.

In conclusion, you should not use game footage when determining player values with a sufficient on/off sample size if you believe on/off stats can zero in on player value. If you don’t believe on/off stats can zero in on player value you should have no problem with the incorporation of box score stats or other counting stats into on/off stats provided that they can be shown to have value. Hybrid stats actually outperformed pure on/off stats in predicting team performance. Until that is no longer the case, which I suspect will be never, the box score should be utilized by intelligent followers of the game. The best method of determining player value is using the best performing hybrid stats, along with whatever information you can discern from watching games.

Warspite · Post #82 » by **Warspite** » Sun May 31, 2015 11:14 pm

Just like someone told me that Bill Russell was a bad defender because he didn't block any shots or get any steals I have contempt for any stat that doesn't translate into the 20th century.

Post #83 » by **tsherkin** » Sun May 31, 2015 11:25 pm

Not a huge fan. It requires a fair amount of constraints on it to help eliminate outliers, it's not super-reliable for single-season stuff so much as batches of seasons, and in general, I find that single metrics aren't that awesome and very vulnerable to various factors.

That said, I'm willing to look at it as a part of a greater whole, a cross-section of statistical relevance. Plus/Minus and other stats of that sort have their value, I just haven't really settled on how much I value what they say, particularly when they contrast with conventional metrics and the eye test. Definitely something not to ignore, though; if you're looking at stuff and all of a sudden RAPM pops in and is like "Hey, everything you're seeing is totally wrong," then it behooves you to get into things a little deeper.

Not just scoring efficiency, not just per-game averages (or per-possession, whatever). How I approach it right now is that if it's REALLY weird in contrast with everything else, then it prompts me to dig deeper. That's about it. I've never felt that single metrics were the way to go with analysis, but that broadening out further made the most sense. Basketball is too complicated to reduce to a single number, and with the proliferation of big data, I find that the opposite direction is so much more insightful and rewarding as far as analysis goes.

I think Spaceman's post is a good one: it's not a ranking stat, but it can do much to tell us about what's happening within the constraints of context. On this team, this guy comes in and good/bad/mediocre things happen. Well, that's a start, and frequently, RAPM does a decent job of showing us who we already think are the better players once you sort for games/minutes/usage/era/whatever you happen to be examining.

I'm a big believer in video analysis, but in drawing things out of it, not just watching it. I like how Synergy was trying to categorize play types and efficiency therein; I love what NBA.com/stats is doing for us, breaking down efficacy by range, defender distance, quarter, time on shot clock and so on and so forth. These are all valuable categories that mix counting stats and eye test stuff (and that's something sp6r was getting into above).

Warspite wrote:Just like someone told me that Bill Russell was a bad defender because he didn't block any shots or get any steals I have contempt for any stat that doesn't translate into the 20th century.

This isn't really a viable position at this stage of the sport. What you really should be saying is that stats based on this data should only be used to compare players for whom the same data is available. There is nothing inherently negative about a statistic which is built on superior data, simply because the prehistory of the sport didn't include that level of attention to detail. There was a lot about the earlier eras which differed, and the sport has grown and improved in many ways since then. There will always be inherent difficulty comparing backwards with players such as Russell, who exercised greatness in a different time and era, because the nature of the game wasn't the same, the strategies of the day differed and yes, the data was much, much poorer. For a while, they didn't even give us proper FG/FGA or FT/FTA and everything, it was PTS and FG and TRB, you know? That doesn't mean punishing a stat that includes offensive rebounds and turnovers makes any sense at all, though.

Post #84 » by **bondom34** » Mon Jun 1, 2015 12:23 am

sp6r=underrated wrote:
In conclusion, you should not use game footage when determining player values with a sufficient on/off sample size if you believe on/off stats can zero in on player value. If you don’t believe on/off stats can zero in on player value you should have no problem with the incorporation of box score stats or other counting stats into on/off stats provided that they can be shown to have value. Hybrid stats actually outperformed pure on/off stats in predicting team performance. Until that is no longer the case, which I suspect will be never, the box score should be utilized by intelligent followers of the game. The best method of determining player value is using the best performing hybrid stats, along with whatever information you can discern from watching games.

Really well thought out post, just wanted to say I think the last line here is the winner. If you're using RAPM as a "ranking" stat, I don't think it is really being used correctly. It can be used when incorporated w/ a boxscore and watching video and live games. Despite any attempts, I don't see any way a single metric can be used like some people are seemingly trying to use RAPM today, and as sp6r said here, the simple fact that boxscore stats do to an extent indicate player value says something in that every GM is doing it, and every GM isn't dumb for it.

Post #85 » by **tsherkin** » Mon Jun 1, 2015 12:57 am

Stats of any sort are data points, no more and no less.

So let's say you see 20 ppg or 25 ppg in a guy's box score or player page or whatever. It tells you something. It doesn't tell you enough on its own, but it does demonstrate something about what he's doing on the court.

Box score stats aren't now nor have they ever been useless or the problem, but methods of employing them are a different story. Minutes, shooting volume/usage, teammates, team offensive performance while he's on, turnover rate and attached O like O-boards and playmaking are all relevant considerations as well.

This is what I was getting after with broadening out versus narrowing down. Single metrics try but IMO will always fail to fully account for context and aren't super informative past being rudimentary filters or good highlights to investigate further.

As several others have said, you have to mix in a little of everything for best results.

RayBan-Sematra · Post #86 » by **RayBan-Sematra** » Mon Jun 1, 2015 4:29 am

tsherkin wrote:Stats of any sort are data points, no more and no less.

So let's say you see 20 ppg or 25 ppg in a guy's box score or player page or whatever. It tells you something. It doesn't tell you enough on its own, but it does demonstrate something about what he's doing on the court.

Box score stats aren't now nor have they ever been useless or the problem, but methods of employing them are a different story. Minutes, shooting volume/usage, teammates, team offensive performance while he's on, turnover rate and attached O like O-boards and playmaking are all relevant considerations as well.

This is what I was getting after with broadening out versus narrowing down. Single metrics try but IMO will always fail to fully account for context and aren't super informative past being rudimentary filters or good highlights to investigate further.

As several others have said, you have to mix in a little of everything for best results.

Very well put.
I agree.

Dr Spaceman · Post #87 » by **Dr Spaceman** » Mon Jun 1, 2015 2:06 pm

sp6r=underrated wrote:It must be comforting to believe that critics of the strong RAPM hypothesis oppose it because it “challenges what they thought they knew” or because people are unable to “realize it” contains the truth that they are looking for. It must be bothersome to believe that NBA teams are run by idiots who are leaving titles on the table with their box score fetish. That is shown by the fact box score stats still impact player salaries.

So a lot of people have tripped at the language used in that post, which I realize can come off as very condescending and elitist. That wasn't my intention, but the post is what it is. Had I known it was going to come off the way it did, I wouldn't have used some of those specific terms. So, sorry if you're offended by that.

Your point about GMs, well, their purpose is very different from mine. Their evals are projective in nature, "how will this guy help my team next year and how much $ should I spend on him", while mine are more explanatory, "who was the best player from 2004-2007". There's some overlap, sure, but generally GMs interested in future performance don't have the luxury of years' worth of data to work with.

RAPM is extremely bad at out-of-sample prediction, no matter how you look at it. If you're a GM, it's the wrong tool for the job, no doubt about it. Box score stats certainly predict future box score stats better than RAPM predicts future RAPM. But RAPM is noted for its exceptional explanatory power, and given that that's what most here are interested in, it seems like a good tool to use in this case. Huge samples and years of data give very good results in RAPM, but without these it has definite reliability issues. That is a reason not to use small samples, not a reason to disparage the stat itself.
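To make the sample-size point concrete, here is a minimal sketch of the idea behind RAPM: a ridge regression of stint scoring margin on +1/-1 indicators for the ten players on the floor. Everything in it is an assumption for illustration only: the league of 20 players, the synthetic "true" impacts, the noise level, the penalty `lam=10.0`, and the helper names (`solve`, `ridge`, `simulate_stints`, `rmse`). It is not how any published RAPM is actually computed, just a toy showing how the estimates tighten as the number of stints grows.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge(X, y, lam):
    """Ridge estimate: solve (X'X + lam*I) b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    for i in range(p):
        XtX[i][i] += lam  # the penalty shrinks every player toward 0
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def simulate_stints(true_impact, n_stints, rng, noise_sd=12.0):
    """Synthetic stints: 5 players coded +1, 5 coded -1, noisy margin."""
    p = len(true_impact)
    X, y = [], []
    for _ in range(n_stints):
        on = rng.sample(range(p), 10)
        row = [0.0] * p
        for k in on[:5]:
            row[k] = 1.0
        for k in on[5:]:
            row[k] = -1.0
        margin = sum(r * t for r, t in zip(row, true_impact)) + rng.gauss(0, noise_sd)
        X.append(row)
        y.append(margin)
    return X, y

def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

rng = random.Random(0)
raw = [rng.gauss(0, 3) for _ in range(20)]
mean = sum(raw) / len(raw)
# Only relative impacts are identifiable (every row sums to 0), so center them.
true_impact = [r - mean for r in raw]

errors = {}
for n_stints in (50, 500, 4000):
    X, y = simulate_stints(true_impact, n_stints, rng)
    est = ridge(X, y, lam=10.0)
    errors[n_stints] = rmse(est, true_impact)
    print(n_stints, "stints -> RMSE vs. true impact:", round(errors[n_stints], 2))
```

In this toy setup the error against the known impacts is large at 50 stints and much smaller at 4000, which is the "reason not to use small samples" in miniature. Real lineup data is far messier (teammates who always share the floor, changing roles, garbage time), so the numbers themselves mean nothing.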

Okay, before I dig in to some specifics, I want to make a general statement about your argument: You are assuming all measurements taken with the same process are equally valuable, as if the approach or intention is the important part. Specifically, this sentence:

The only important difference, then, between video tracking stats and traditional box score stats is that the new video tracking stats are generally automated while the traditional box score stats are recorded by human beings.

No. The most important difference between the video tracking stats and the box score stats is that they measure different things. To use an extreme example, if I decided to start tracking the number of beads of sweat on a player to gauge performance, would you accept that because the process is the same as the traditional box score stat? Of course not, because the number of beads of sweat is not a valid measure of player performance.

Looking at the traditional box score can give you some good information, yes. But I'd compare it to a sledgehammer where the video tracking stats are a toffee hammer. One is a blunt instrument for covering as much ground as possible, the other is an efficient tool for a specific purpose. There are merits to both approaches, but they have different goals and achieve different results. A player's True Shooting % cannot be equated to his, say, Catch & Shoot 3 point percentage or 3 point percentage with a defender within 2 feet. One of those things gives you a general overview of performance across a wide variety of situations, and the other is a specific condition that allows you to make precise inferences.

I get that your general objection is to me rejecting RPM while singling out PTPM as something great. Let me be clear: I don't have an issue with RAPM being "corrupted" or anything like that. I have no illusions about it being perfect. Where I take issue with RPM specifically (and xRAPM and SPM and whatever else) is that it introduces a specific bias, and that bias very obviously presents itself in the results.

Building on the previous paragraph, a big reason to use RAPM is that it's orthogonal to the box score, and thus can capture value that the box score can't "see". A guy I've spent a lot of time discussing on this forum is Kyle Korver, who is the poster boy currently for non-box score impact. He's important to discuss here, because "pure" RAPM makes him look way, way better than RPM does. The discrepancy exists entirely because he's a 12 ppg scorer, but my problem is that if you've accepted Korver as an elite impact player, you've done so in spite of the box score saying he's barely an above average role player. So what purpose does adding that box score information serve? The results become far more reliable on average through the whole population, but in specific cases it can erase something that should be there.
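The trade-off in that last sentence can be sketched with the textbook normal-normal shrinkage formula, which is a one-player simplification of shrinking a plus/minus estimate toward a prior. All numbers here are hypothetical and chosen only to show the mechanism: the variance split between "box score" and "hidden" impact, the noise level, the sample size, and the Korver-like player with a modest box-score value and a large non-box-score contribution. The function names are made up for this sketch, and nothing here reproduces how RPM or xRAPM is actually built.

```python
def shrink_weight(prior_var, noise_var, n):
    """Weight placed on the observed data in a normal-normal posterior mean."""
    return prior_var / (prior_var + noise_var / n)

def expected_estimate(true_impact, prior_mean, prior_var, noise_var, n):
    """Expected posterior mean when the observation is unbiased for true_impact."""
    w = shrink_weight(prior_var, noise_var, n)
    return prior_mean + w * (true_impact - prior_mean)

def average_mse(prior_var, noise_var, n):
    """Posterior variance: average squared error over a population that
    really is spread around the prior with variance prior_var."""
    noise = noise_var / n
    return prior_var * noise / (prior_var + noise)

# Hypothetical population: impact = box-score part + hidden part.
var_box, var_hidden = 6.0, 3.0
noise_var, n = 144.0, 100

# Hypothetical Korver-like player: modest box score, large hidden impact.
box_value, hidden_value = 1.0, 4.0
true_impact = box_value + hidden_value

# "Pure" version: shrink toward 0, prior spread = all of the variation.
est_zero = expected_estimate(true_impact, 0.0, var_box + var_hidden, noise_var, n)
mse_zero = average_mse(var_box + var_hidden, noise_var, n)

# Box-score-prior version: shrink toward the box value, with a tighter
# prior because only the hidden part is left unexplained.
est_box = expected_estimate(true_impact, box_value, var_hidden, noise_var, n)
mse_box = average_mse(var_hidden, noise_var, n)

print("true impact:", true_impact)
print("shrink to zero:      estimate", round(est_zero, 2), "| population MSE", round(mse_zero, 2))
print("shrink to box prior: estimate", round(est_box, 2), "| population MSE", round(mse_box, 2))
```

Under these made-up numbers the box-score prior lowers the average error across the population while pulling this particular player further below his true value than plain shrinkage toward zero does. With different assumed variances the gap for the individual player could shrink or reverse, so this only shows that the effect is possible, not how large it is for any real player.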

More concisely: RAPM was intended to define "impact" as something separate (but not incompatible with) the box score, so what purpose would adding box score information serve?

The reason PTPM is so attractive to me is that it is aware of these non box score contributions and can account for them. In the Korver example, it can incorporate "gravity score" to indicate how defenses bend to accommodate his shooting. It can include at-rim FG%, which is very obviously a better measure of rim protecting prowess than blocks. It doesn't introduce bias that it was already trying to correct for.

I'm not interested in RAPM for its out-of-sample prediction prowess, I'm interested in it for what it tells me. That PTPM is way, way, way better at prediction than xRAPM is great, but not the main determinant for my usage.

Also I think we just have different definitions of the term "zero in". I'm not saying we ever get close to knowing anything for certain, I'm saying we can draw tighter and tighter circles around the target.

The scouting of players, whether in person or through footage, is no different at its core than the traditional box score. When you utilize this information you are not doing something different than the box score but rather you are attempting to count different pieces of information. Thus your hostility to RPM cannot be squared with your view that you would never let RAPM do your thinking for you.

100%, this follows logically. But it's not a black-and-white issue; you're making it as if the only two choices here are "RAPM thinks for you" vs. "RAPM does not think for you", and I'm kind of damned either way, you know? I don't think this tight deductive logic really holds when I can just say "RAPM influences my thinking, but is not the whole of my thinking". That puts me in a much better light, while also getting the point you wanted to make across.
