Dr Spaceman wrote:Bad Gatorade wrote:.
Great to see you back, man, and great to see you haven’t lost a step.
Thanks! I always endeavour to post on here more frequently, but never get around to it. Due to the time constraints of life, most of my forum time is spent lurking on my phone, where I'm not too keen on contributing (phones aren't conducive to extensive responses, or to the furious opening of spreadsheet after spreadsheet from which many of my posts draw inspiration).
I read your work on the Rockets, and the heavy reliance of the Rockets (and Chris Paul) on shooting 3s does worry me somewhat. Even though a 3 pointer is worth more than a 2 pointer, the 2 pointer has lower variance, and a player/team whose scoring is more reliant on the 3 is likely to experience higher highs, but lower lows. And honestly, the Rockets are stacked enough offensively that I do wish CP3 would lean on his mid range a bit more frequently, because it would give me a bit more playoff confidence. After all, in a best of 7 series, two games with ORTGs of 110 and 125 might average out to a lower result than 100 and 140, but probably result in a higher chance of winning the series.
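To put rough numbers on the variance point, here is a minimal sketch. The shooting percentages (50% on 2s, 36% on 3s) and the 90 shots per game are made up purely for illustration, and it treats every shot as independent, which real games aren't.

```python
import math

def shot_stats(value, pct):
    """Mean and variance of points from one shot worth `value` made at rate `pct`."""
    mean = value * pct
    var = value ** 2 * pct * (1 - pct)
    return mean, var

# Hypothetical shooting percentages, not real team data.
two_mean, two_var = shot_stats(2, 0.50)
three_mean, three_var = shot_stats(3, 0.36)

SHOTS = 90  # hypothetical shot attempts in one game
for label, mean, var in [("all 2s", two_mean, two_var),
                         ("all 3s", three_mean, three_var)]:
    game_mean = SHOTS * mean
    game_sd = math.sqrt(SHOTS * var)  # independent shots: variances add
    print(f"{label}: expected {game_mean:.1f} points, SD {game_sd:.1f}")
```

Under these assumptions the 3 has the higher expected value per shot, but roughly double the per-shot variance, which is the "higher highs, lower lows" trade-off.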
That being said, I think the Rockets do have a great team this year. Hopefully, the variable tendency of the 3 point shot works in our favour in the postseason (i.e. it peaks against the Warriors and in the finals).
Outside wrote:It's not my intent to derail the thread by going down the rabbit hole of a detailed RAPM discussion. There's a Statistical Analysis forum for that, and apparently just mentioning Engelmann sparked some strong feelings (btw, I didn't intend to imply that he invented RAPM or concepts behind it, just that his name was included on sites I visited as the guy currently most associated with it).
My intent is to figure out RAPM's value for the POY discussion, particularly for a non-data analyst, and respond from that viewpoint to how it's being used in the discussion.
I'm a long way from understanding RAPM completely, but I feel more comfortable with what it's attempting to do in a general way and how to incorporate it into the discussion.
My usual caveat: I don't know advanced metrics well, I'm just doing the best I can, so I may be wrong about certain things.
After using links provided by others (thank you) and looking around on my own, I've come to a few conclusions regarding RAPM as it pertains to the recent POY discussion.
1. RAPM attempts to identify a player's impact on a per-possession basis independent of box score and basic plus-minus stats. It provides separate RAPM "scores" for offense and defense, plus a total RAPM score that combines the RAPM scores for offense and defense. This points out limitations of RAPM for POY assessment:
-- Since it is a per-possession metric, the total impact of the player requires taking playing time into account. Some RAPM systems calculate this as "wins" or something similar, but other systems don't calculate this and only provide the per-possession RAPM.
-- Quantifying a player's impact is valuable information, but since it is independent of box score and plus-minus data, it only paints a partial picture. Rather than use RAPM instead of box score and plus-minus data, they should be used together.
2. RAPM requires a lot of data to be usable, so one variation is multi-year RAPM, which uses data from multiple seasons to determine a player's RAPM. This points out additional limitations of RAPM for POY assessment:
-- By definition, multi-year RAPM uses data from multiple seasons and wouldn't be useful for POY assessment, which is about performance for the current season only.
-- RAPM for the current season only is not statistically useful until the end of the season or close to it, so it is of limited value or at least should be considered preliminary until then. Some people consider one season of data inadequate; I've seen mentioned multiple times that three years of data should be the baseline, particularly for elite players. If that's true, then RAPM should be used with caution for POY assessment.
3. Another variation is xRAPM, which combines RAPM with box score and plus-minus data. ESPN's RPM is apparently an xRAPM system. This is an attempt to be an all-in-one stat, but some people prefer to keep RAPM, box score, and plus-minus separate.
RAPM can certainly be part of the data used to assess players, but it should be used in conjunction with other data, not as the sole or even primary means of assessing the performance of POY candidates in the current season. If someone is going to use RAPM to support an argument, it would be extremely helpful to know what type it is (RAPM vs xRAPM, single season vs multi-season).
For POY purposes, it seems like the only useful type is single season, and single season RAPM apparently should have an asterisk by it since it requires multiple seasons of data to properly assess the RAPM for a player.
To illustrate that point, I noticed these oddities on the single-season RAPM that E-Balla provided a link to (
https://docs.google.com/spreadsheets/d/e/2PACX-1vSzp3G5rwP9xgCgluVGmR3Qj4-BMoGSYiuTKM6o_pzES6s95oQE1nQvB2CXed-4fRc_MMGgpULtDaJ_/pubhtml?gid=1825430955&single=true):
-- OG Anunoby, a rookie who plays 21 MPG, scores 6 PPG, and has a 12.0 USG% is 9th in ORAPM, 25th in DRAPM, and 4th in overall RAPM. He must set a heck of a screen. Similarly, others in the top 10 are Yogi Ferrell (5th), Robert Covington (8th), and Tyus Jones (9th).
-- RAPM is supposed to be better at assessing defensive impact than box score and plus-minus stats, but Trevor Ariza is ranked 357th in DRAPM, Clint Capela is 153rd, Patrick Beverley is 217th, Paul George is 277th, Andre Drummond is 376th, and Kevin Durant is 457th. I know Durant is overhyped defensively, but I'd say he's better than Isaiah Thomas (427th).
-- POY candidates are scattered throughout the list -- Curry (1st), Butler (7th), Giannis (11th), Westbrook (20th), Chris Paul (32nd), Durant (38th), Anthony Davis (47th), Harden (48th), Kyrie (58th), DeRozan (65th), and LeBron (124th).
Some of this could be corrected by adjusting for minutes or number of possessions played, but that won't suddenly change the list so that the POY candidates all move to the top.
Honestly, derail away. Understanding RAPM isn't entirely orthogonal to the POY discussion. Everybody has different perspectives, and helping enhance these perspectives amongst the POY enthusiasts here may just lead to a stronger understanding of the POY race, what is making each of these players tick, and just how far they're moving the needle.
Firstly, I think your understanding of RAPM and its strengths/weaknesses is pretty good. Based on this post, and your previous one, you've done your research and your knowledge seems greatly enhanced.
Just wanted to point out a couple of things here -
The offensive/defensive split - it's highly important to note that although RAPM is divided between offence and defence, the actual nature of basketball isn't as segregated. To an extent, offence fuels defence, and defence fuels offence. When observing offensive and defensive splits, we tend to make assessments based on offence and defence separately, and this can lead to a hasty discounting of one (or both) of these values, or result in us, say, valuing player A over player B because of a higher offensive RAPM, without necessarily analysing the cause.
I'll use my favourite player (Chris Paul) as an example. Over the years, Chris Paul has conjured up some incredibly high DRAPM scores - in 2016, he ranked 13th in the entire league in DRAPM, and last year, 4th in single year DRAPM. It's quite easy to discount this as inflation/random variation, since a 6' point guard shouldn't have this impact on defence, right? These values COULD very well be inflated (after all, he's not the 4th best defensive player in the league - even I can admit this). BUT, it's worth considering that CP3's style of offence is highly calculated. He detests turnovers, and would prefer to make the "mathematically safe" pass rather than the "risky" one.
Where this ties into defence is that I fully believe some of his defensive impact scores are tied to how he plays offence. His incredible, GOAT like ability to avoid turnovers gives his opponents fewer fast break opportunities, and given the sheer PPP value of fast break opportunities, his offence is literally a form of defence. Furthermore, a jump shot heavy offence might not create equally efficient scoring opportunities, but arguably places players in a better position to start defending if the opponent gets the rebound.
Dirk is another example of this - his defensive RAPM scores exceed his defensive reputation, and it's the combination of his ability to avoid turnovers and his proclivity to take jump shots that helps lift his scores beyond our general perception.
I'm not at all saying that a certain style of offence is better than another, but rather, a certain style of offence may distort the ORAPM and DRAPM split, and it's worth considering whether or not these factors are at play when choosing to scrutinise the RAPM results that are churned out.
Using multi-year data - I noticed that you have said that for POY purposes, your perception is that single year RAPM is the only useful RAPM. And what you've said is partially true - multi-year data contains data from other years, which seems to run directly counter to assessing a single year, right?
Multiple years of RAPM data (whether it's multi-year, or lots of single year data), in my opinion, actually have a place in single year evaluation, as odd as it may sound. I link this to the notion that single year RAPM does not have a large enough sample size to effectively parse out the random variation that underlies the results. Data from other years can provide additional information on whether or not the single year results effectively represent player quality.
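A crude way to see the sample size issue, as a sketch only: treat each possession as an independent draw and look at how the standard error of a per-100 rating shrinks with more possessions. The 1.2 points-per-possession standard deviation and the 5,000 possessions per season are assumed values for illustration, and this ignores the regression structure that actual RAPM uses.

```python
import math

def net_rating_se(possessions, sd_per_possession=1.2):
    """Approximate standard error of a per-100-possession rating.

    Assumes independent possessions; sd_per_possession is a hypothetical value.
    """
    return 100 * sd_per_possession / math.sqrt(possessions)

ONE_SEASON = 5000  # hypothetical possessions played in one season
se_one = net_rating_se(ONE_SEASON)
se_three = net_rating_se(3 * ONE_SEASON)

print(f"one season:    +/- {se_one:.2f} per 100 possessions")
print(f"three seasons: +/- {se_three:.2f} per 100 possessions")
```

Whatever values you plug in, tripling the sample cuts the noise by a factor of sqrt(3), which is why extra seasons help separate signal from luck (with the trade-off that the player himself may have changed over those seasons).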
For example, consider the 2014-15 Houston Rockets. On defence, they were ranked 8th with a -2.2 relative DRTG score. However, it's a stark departure from both 2013-14 (-0.4, with a healthy Howard) and our very shameful 2015-16 (+1.7). Even with an entire year's worth of data, the DRTG score wasn't able to parse out the fact that Houston were insanely lucky in 3 point defence that year. They were over 1% better than any other team in terms of defensive FG%, even though they didn't actually DEFEND the 3 that well. In terms of the proportion of opponent 3s that were Wide Open or Open, they were below average in both, and ranked 26th here (aka 5th worst in the league). However, they were 1st in the league by a full percentage point (33.3%, with the Bulls second at 34.3%, and a league average of 36.3%) in the percentage that their opponents shot on open 3s. Of their -2.2 DRTG, around -1.75 is literally due to how poorly their opponents happened to shoot from OPEN 3s (so we can't even pin this on length).
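For anyone who wants to sanity check that sort of estimate, here is a back-of-the-envelope sketch using only the figures quoted above. It back-solves how many open 3 attempts per 100 possessions would be needed for the shooting gap to be worth 1.75 points. That attempts figure is implied by the arithmetic, not taken from any data source, and the calculation ignores offensive rebounds on the misses.

```python
# Figures quoted in the post.
league_open_3p = 0.363        # league average on open 3s
rockets_opp_open_3p = 0.333   # what Houston's opponents shot on open 3s
drtg_from_open_3s = 1.75      # points per 100 possessions attributed to this

# Each open 3 attempt cost opponents this many points relative to league average.
points_saved_per_attempt = 3 * (league_open_3p - rockets_opp_open_3p)

# Implied open 3 attempts per 100 possessions (back-solved, not sourced).
implied_open_3pa_per_100 = drtg_from_open_3s / points_saved_per_attempt

print(f"points saved per open 3 attempt: {points_saved_per_attempt:.2f}")
print(f"implied open 3PA per 100 possessions: {implied_open_3pa_per_100:.1f}")
```

So a 3 percentage point gap is worth about 0.09 points per attempt, and it would take roughly 19 to 20 open 3 attempts per 100 possessions to add up to 1.75.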
Now, what I'm getting at here, is if an entire TEAM sample is still capable of such variation, imagine just how much variation can occur with lineups, where even the most highly played players are often playing only 70-75% of a season? Imagine how much variation when collinearity is present? This is one area where multi-year data can help - it helps reduce this sort of random variation. Or heck, even look at the variation in defensive free throw percentages!
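Here is a toy sketch of the collinearity problem, under heavy assumptions: only two players, made-up "true" impacts of +4 and +1, noisy stint margins, and a simple ridge penalty (the "R" in RAPM is a regularised regression, but real implementations involve every player on the floor, opponents, and possession weighting, none of which is modelled here). The two players share the floor in 190 of 200 stints, so the data barely distinguishes them.

```python
import random

def fit_two_players(stints, lam):
    """Solve (X'X + lam*I) beta = X'y for two players via the 2x2 closed form."""
    s_aa = s_ab = s_bb = s_ay = s_by = 0.0
    for on_a, on_b, margin in stints:
        s_aa += on_a * on_a
        s_ab += on_a * on_b
        s_bb += on_b * on_b
        s_ay += on_a * margin
        s_by += on_b * margin
    aa, bb = s_aa + lam, s_bb + lam          # ridge penalty on the diagonal
    det = aa * bb - s_ab * s_ab
    beta_a = (bb * s_ay - s_ab * s_by) / det
    beta_b = (aa * s_by - s_ab * s_ay) / det
    return beta_a, beta_b

rng = random.Random(0)
TRUE_A, TRUE_B = 4.0, 1.0   # hypothetical true impacts
NOISE_SD = 12.0             # hypothetical stint-to-stint noise

stints = []
# (A on court, B on court, number of stints): mostly together, rarely apart.
for on_a, on_b, count in [(1, 1, 190), (1, 0, 5), (0, 1, 5)]:
    for _ in range(count):
        margin = TRUE_A * on_a + TRUE_B * on_b + rng.gauss(0, NOISE_SD)
        stints.append((on_a, on_b, margin))

ols_a, ols_b = fit_two_players(stints, lam=0.0)      # no penalty
ridge_a, ridge_b = fit_two_players(stints, lam=20.0)  # hypothetical penalty

print(f"unpenalised: A={ols_a:+.2f}, B={ols_b:+.2f}")
print(f"ridge:       A={ridge_a:+.2f}, B={ridge_b:+.2f}")
```

With only five stints apart each, the unpenalised split between A and B is driven almost entirely by noise in those few stints. The penalty shrinks the gap between them, which tames the wild splits but also means credit can get smeared between teammates who always play together. More seasons (and more minutes apart) are what actually give the model the information to separate them.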
Now, I firmly believe that RAPM is best used alongside tracking data. If a player is a +1 on defence in a "fluky" year, and a -1 on defence in a "bad luck" year, the tracking data can help actually showcase whether or not a player is still doing the right things on defence. If a player literally has the SAME rate of deflections, contested shots etc but their RAPM plummets, I don't think it's right to impugn them on a single year basis. However, tracking alone can't really make the assessments that RAPM can, because, well, tracking is still in its infancy, and can't capture everything we want. So, tracking can miss things that RAPM can capture, but RAPM can also capture some incorrect things that tracking can help weed out.
Furthermore, I'd consider multiple years of RAPM data useful when parsing out coaching effects - a player is going to look a lot worse defensively if he's being coached by Jason Kidd rather than by Gregg Popovich, right? A player switching teams might be playing the same level of defence, but without the proper coaching, he'll look a lot worse than he should, even if it's not his fault.
Or heck, multiple years of data can help shed light on collinearity issues. I touched on these in my other post, but basically, a single year's worth of RAPM might come up with some bizarre results. After all, in single-year RAPM, Draymond Green was essentially equal 1st in ORAPM with LeBron back in 2015-16 (and ahead of Curry). I think Green has a lot of offensive worth, but these results are a clear departure from what he normally produces, and what one would anticipate from his skillset. And no matter how highly one thinks of Green, they'd most certainly struggle to rank him above Curry that season offensively, given how awe inspiring Curry was that year. The impact data from other seasons helps us comprehend what Green was doing in 2015-16, because it teaches us about what his skillset is typically producing, and helps us assess whether or not it can rationally produce the impact that single-year RAPM is showcasing.
Or heck, even Harden this past year - Harden is 5th in single year ORAPM, but Eric Gordon is 4th, and Ariza ranks rather higher than one would expect too. I don't think it's unreasonable that Harden (who has, on multiple occasions, led the league in single year RAPM) might have some of his ORAPM results usurped by Gordon/Ariza (who aren't bad offensive players), especially when Gordon/Ariza had noticeably lower ORAPM results last year (and in general, don't really jibe with the 4th best/21st best offensive players in the league status).
I'm blabbering yet again, but I hope that some of my thoughts, even if you don't agree with them, help you form your own understanding of RAPM and its worth, and how it can be used in assessments.