RealGM 2023 Top 100 Project - #6 (Hakeem Olajuwon)

Moderators: Clyde Frazier, Doctor MJ, trex_8063, penbeast0, PaulieWal

lessthanjake
Analyst
Posts: 3,082
And1: 2,826
Joined: Apr 13, 2013

Re: RealGM 2023 Top 100 Project - #6 (Deadline 11:59 PM EST on 7/18/23) 

Post#281 » by lessthanjake » Thu Jul 20, 2023 5:07 pm

OhayoKD wrote:.


I can’t respond to all of that, but the bottom line is that I think you’re not really appreciating that single-year measures are inherently pretty noisy, so the bar for being the clear best in the league in impact over a given time period isn’t finishing #1 every single year. For reference, NBAShotCharts has single-year RAPM for LeBron from 2009-2010 onwards, and LeBron ranked (in chronological order): 1st, 6th, 14th, 1st, 25th, 6th, 3rd, 3rd, 62nd, 32nd, 5th, 7th, 182nd, 16th. Based on the logic you’re using, those single-year outputs would suggest LeBron was very much not an impact king, even in the pre-Steph years. But of course we know that when you take longer time horizons, he’s at or near the top. The individual years are just pretty noisy (as well as subject to real ebbs and flows in other players’ form year to year). Meanwhile, in the last decade (excluding 2019-2020), Steph has been: 7th, 1st, 2nd, 1st, 3rd, 3rd, 17th, 2nd, 37th. Being at the top or close to it as often as Steph has been is what being an “impact king” looks like when you drill down to single years.

And the fact that Jokic has been incredible the last few years in terms of impact doesn’t really demonstrate much in this discussion IMO. Other players can have their time in the sun within a longer period in which someone else is better overall. If we were sitting in 2016 and asking who the “impact king” of the last decade was, we’d have said LeBron, even though Steph would’ve been ahead the prior few years in most measures. The same is true of Steph in the last decade, despite Jokic being ahead the last few years. Steph’s impact metrics the last few years haven’t been quite as high as before and Jokic’s impact metrics have been amazing in those years, but the overall picture in the last decade has Steph at the top. And, in any event, the last few years aren’t really my main concern here, since the argument is mostly about what Steph did in the years while LeBron was still in his prime. And in those years (i.e. like 2014-2019), Steph was the clear #1 player overall in impact metrics.
OhayoKD wrote:Lebron contributes more to all the phases of play than Messi does. And he is of course a defensive anchor unlike messi.
Squared2020
Sophomore
Posts: 108
And1: 307
Joined: Feb 18, 2018
 

Re: RealGM 2023 Top 100 Project - #6 (Deadline 11:59 PM EST on 7/18/23) 

Post#282 » by Squared2020 » Fri Jul 21, 2023 4:49 pm

.
Professional History:
2012 - 2017: Consultant for several NBA front offices.
2017 - 2018: Orlando Magic
2018 - 2021: Houston Rockets
2021 - Present: NBA League Office
DraymondGold
Senior
Posts: 625
And1: 809
Joined: May 19, 2022

Re: RealGM 2023 Top 100 Project - #6 (Deadline 11:59 PM EST on 7/18/23) 

Post#283 » by DraymondGold » Fri Jul 21, 2023 5:00 pm

Squared2020 wrote:
iggymcfrack wrote:
Don’t small sample RAPM numbers tend to peak much lower in general? Like if the leaders for a regular season RAPM will be at +7, the leaders for postseason RAPM will be more like +3? I don’t think you can compare fragment data from a 14-25 game sample and expect them to be on remotely the same scale as full season data. I feel like the fact that his post-prime full season data tends to be higher than his peak numbers from small season fragments would tend to support this hypothesis.


This is a great point to make, but it's for a reason that's a little different than you expect.

Smaller samples will yield much more variable results, so you should expect much larger RAPM values. Explicitly, you should expect to see larger values for playoffs. This is because many stints only have 1-2 offensive possessions, which means ratings will be 0, 50, 100, 150, 200, etc. So players will stick closely to some linear combination of the 50's. RAPM values for very small samples with the same parameterization as regular-season RAPM will reach upwards of 15-20 instead of 7.

However, traditional RAPM methods use a hardcoded value to stamp down variance. In Joe Sill's original methodology, I believe that value is 2500. So instead of the 50's as above, small samples will get flattened by a factor of SQRT(2500) to give you values of .5,1,1.5, etc. which means you have to see a lot more possessions before the RAPM value creeps back up to 7.
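To make the shrinkage point concrete, here is a minimal Python sketch. It is not Squared2020's actual model, just plain closed-form ridge regression on made-up stints. The function names, the 20-player pool, and the scoring probabilities are all assumptions for illustration, and the penalty of 2500 is simply the value recalled in the post above. Every simulated player has a true impact of zero, so any nonzero estimate is pure small-sample noise.

```python
import numpy as np

def simulate_stints(n_stints, n_players=20, seed=0):
    """Hypothetical tiny-stint data: 1-2 possessions per side per stint."""
    rng = np.random.default_rng(seed)
    X = np.zeros((n_stints, n_players))
    y = np.zeros(n_stints)
    for i in range(n_stints):
        on_court = rng.choice(n_players, size=10, replace=False)
        X[i, on_court[:5]] = 1.0    # five "home" players
        X[i, on_court[5:]] = -1.0   # five "away" players
        poss = rng.integers(1, 3)   # 1 or 2 possessions each way
        # Assumed scoring odds per possession: 0, 2, or 3 points.
        home = rng.choice([0, 2, 3], size=poss, p=[0.5, 0.35, 0.15]).sum()
        away = rng.choice([0, 2, 3], size=poss, p=[0.5, 0.35, 0.15]).sum()
        # Margin per 100 possessions: always a multiple of 50.
        y[i] = 100.0 * (home - away) / poss
    return X, y

def ridge_rapm(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^-1 X'y."""
    n_players = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y)

if __name__ == "__main__":
    X, y = simulate_stints(200)
    for lam in (1.0, 2500.0):
        est = ridge_rapm(X, y, lam)
        print(f"lambda={lam:>6}: largest |estimate| = {np.abs(est).max():.2f}")
```

With a tiny penalty the estimates swing widely even though nobody in the simulation has any real impact; with the large penalty they are pulled hard toward zero. This sketch ignores possession weighting and everything else a real RAPM pipeline does.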

What I do in my version of RAPM is to give the model more flexibility. The most important flexibility is that I know I have a partial season. So I scale the partial season statistically relative to full seasons after 2000. This process is called "transfer learning." This is how I get values of 5-7 for 20-25% of regular season games.
Thanks for the info! Super interesting stuff :D

The one problem with this, and it's very important for this discussion, is that I am limited to the sampled games. This means the transfer process effectively assumes the games recorded are representative of the entire season, and they are not. Two players who habitually get pinged by this are Karl Malone and Hakeem Olajuwon. For some reason, the sampled games out there are poorer games than what their respective teams did over the full season. To combat this, I list the sampled record, along with the estimated record from the sample and the true record for the season. Utah is almost always a .500 team or losing team in the samples. Houston is in a similar predicament, but not as bad.

So if you look at Hakeem's numbers in my sample, please look at the associated record. It will help you understand if this is more likely an undervalued RAPM or overvalued RAPM.
If we're trying to get a ball-park estimate of a true full-season value (correcting for team record), do you have any recommendations?

I've been suggesting scaling a player's RAPM up or down based on how much they underperformed in that sample. So for example, if a player's team underperformed by 10% in the sample you have compared to their full-season value, I've been suggesting we curve up the RAPM of players on that team by 10%.

As I understand it, I think this basically assumes everyone underperformed on said team equally (which seems like a more fair assumption than saying X or Y player underperformed specifically, at least without looking a lot closer at all the data). Does this seem like an acceptable curve, if we're trying to get in the right ballpark of their full season value?
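Spelled out, the curve I'm describing is just this arithmetic. This is a rough sketch of my own suggestion, not anything from Squared2020's method. Measuring "underperformance" by win percentage is an assumption here, and the function name and the example numbers are hypothetical.

```python
def curve_rapm(rapm, sample_win_pct, season_win_pct):
    """Scale a sampled RAPM by how far the team's sampled record
    fell short of (or exceeded) its full-season record.

    Assumes every player on the team shares the shortfall equally.
    """
    if season_win_pct <= 0:
        raise ValueError("season_win_pct must be positive")
    # Relative shortfall: 0.10 means the sample is 10% worse than the season.
    shortfall = (season_win_pct - sample_win_pct) / season_win_pct
    return rapm * (1.0 + shortfall)

if __name__ == "__main__":
    # Hypothetical: a +5.0 sampled RAPM on a team that played .450 ball
    # in the sample but .500 over the full season gets curved up by 10%.
    print(curve_rapm(5.0, 0.45, 0.50))
```

One caveat with a multiplicative curve like this: it pushes negative RAPM values further from zero too, so it only makes sense as a ballpark adjustment for players with a positive RAPM.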

Obviously, since it's a small sample, the uncertainty is higher and we don't know who specifically underperformed in said sample (at least without a much closer look).

Thanks as always for all your fascinating work!

Furthermore, as I tend to note frequently: don't take RAPM at face value. It's a great illustrative number that has a lot of flaws. I provide the numbers so they can be used with other context such as records, individual stats, team stats, box plus-minus, etc.

Sorry for being long-winded, but I hope this helps in the discussion.
No need to apologize, it's all interesting stuff!
