Top 25 peaks of the 2001-25: #13-#14 Spots

Moderators: Doctor MJ, trex_8063, penbeast0, PaulieWal, Clyde Frazier

homecourtloss
RealGM
Posts: 11,513
And1: 18,902
Joined: Dec 29, 2012

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#81 » by homecourtloss » Fri Oct 10, 2025 12:18 am

lessthanjake wrote:
homecourtloss wrote:
lessthanjake wrote:
Single-playoff EPM is definitely more reliable than single-playoff RAPM (or single-playoff on-off), since the whole point of the measure is to use box and tracking data to make the measure much more stable in small samples. That said, playoff EPM is definitely not as reliable as regular-season EPM (or multi-season RAPM or other similar measures), because the sample-size issues are still more of a problem. I’m not even sure that playoff EPM is better than pure box data like playoff BPM. If it is, it’d probably be more because it incorporates tracking data than because the impact component improves accuracy in that small a sample. If there were somehow a much more reliable form of single-playoff data than what we have, then I’d definitely use that instead. But there’s not, because the playoffs are just a small sample and there’s not any way around that. I do think that measures incorporating box data are better for the playoffs than raw impact data, because box data stabilizes things a lot. And I’ll note that this is not unique to the playoffs. The same is true if we wanted to compare these sorts of hybrid measures in a single season compared to one-year RAPM. One-year RAPM is a worse measure than that.

There’s a certain sample size at which RAPM becomes better than box or hybrid data. I’m not sure exactly where that line is, but I don’t think playoff data really gets us over that hump. It certainly doesn’t in single-year or even several-year samples. But I’ll note that unfortunately this is probably true even for career playoff RAPM of players that have high playoff minutes, because the adjustments the measure is making for everyone else are based on overly small samples for most of those other players. If most of the adjustments are just driven by a ton of noise, then playoff RAPM isn’t even particularly better than playoff on-off. After all, the value of RAPM over on-off is the fact that it accounts for everyone else on the court. So if it has only very small samples with which to do that for those players, then any delta between playoff RAPM and playoff on-off is probably going to be largely consumed by noise.

Of course, one might try to say that career playoff RAPM is at least as reliable as something like single-playoff EPM. As an initial matter, I’m actually not sure if that’s true. It would depend on whether the larger sample for career playoffs stabilizes the measure better than the addition of box and tracking data does for EPM. I’m not sure on that. But it’s plausible to me that it could be as reliable or more reliable. The problem for these purposes is that we’re assessing single-year peaks, so career playoff RAPM is not specific to the issue while playoff EPM at least is. A measure that isn’t all that reliable but actually does go to the precise question at hand is certainly more useful than a measure that isn’t all that reliable and doesn’t even go to the precise question at hand.


You’re rejecting multi-year playoff RAPM because of “small samples,” but you’re fine using single-playoff EPM, which literally relies on RAPM-derived priors and even smaller data windows.


You’re ignoring the fact that EPM is stabilized by the use of box and tracking data. Using other data to stabilize small-sample impact data is the whole point of the measure!

But yeah, as I’ve already said, playoff EPM isn’t super reliable either. But this is a single-year peaks project and so naturally people will look at single-playoff data. And, despite not being a very reliable measure, playoff EPM is one of the best measures we have for that purpose. Career playoff RAPM is not only not a very reliable measure but also is not even directed to the relevant question at hand.

What makes this even more inconsistent is that you also accept single playoff season plus/minus data, and even single-series on/off splits, both of which are far noisier than a multi-year RAPM sample, without the usual caveats of "worthless," etc.


I can’t even count how often I’ve cautioned people here about relying on single-playoff on-off data. The idea that I do not provide similar caveats about that sort of data is just complete nonsense. I’ve done it many times! As I’ve said before, if you’re going to criticize me for purportedly being inconsistent, you really should at least familiarize yourself with my posting history, so that you aren’t just beating on a straw man.

If single-series or single-season on/off are acceptable in evaluating, then multi-year RAPM, which reduces that same small-sample variance through aggregation, should be more reliable, not less. Multi-year RAPM exists to address the small-sample problem you’re citing. Regularization already accounts for low-minute players, and aggregating across seasons reduces variance dramatically.


So I’ll leave aside the fact that I do not think single-series or single-playoff on-off are measures we should put virtually any weight on and that I’ve said as much many times on these forums, so this is largely just straw manning.

Regarding the rest of what you say here, I don’t disagree with it. But you’re really handwaving when you say “regularization already accounts for low-minute players.” With career playoff RAPM, the issue I’m talking about isn’t really the treatment of low-minute bench players. It’s that the vast majority of players haven’t actually played very many playoff games, which makes the measure much noisier even for players who have played a lot of playoff games, because the measure doesn’t have a large enough sample to accurately control for the other people on the court. To draw an analogy, multi-season RAPM is much more reliable than single-season RAPM not just because the individual player in question has played a lot of games over the course of multiple seasons, but also because virtually everyone that the measure is adjusting for has also played a lot of games over that timespan. What you have with career playoff RAPM is basically a measure that has an okay-sized sample for a small number of players (but not a great-sized sample, since we’re generally still talking about only like ~2 seasons’ worth of games for guys with lots of playoff experience) and a very small sample for everyone else (often less than 1 season’s worth of games). The result is something that is definitely going to be even less reliable than something like 2-year RS RAPM. Given all that, I’m inclined to think it’s probably roughly akin to pure single-season RAPM in reliability, even for guys at the high end in playoff experience. Maybe it’s slightly better than that, but it’s roughly in the same ballpark IMO. And yeah, I don’t really put much value on single-year RAPM, and have certainly caveated/cautioned about the sample-size issue there.


You’re saying that multi-year playoff RAPM is unreliable because “most players don’t have enough playoff minutes,” but that argument misunderstands what regularization and aggregation are doing (you haven't actually done it so this makes sense).

The entire point of multi-year RAPM is to deal with exactly small-sample noise and interdependence between players. Regularization already accounts for players with limited minutes by shrinking noisy estimates toward the mean, and aggregating across years improves the signal-to-noise ratio. In other words, the “problem” you’re identifying is basically the reason multi-year RAPM exists and why it performs better than single-season or single-playoff samples in predictive validity.
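
To make the "shrinking noisy estimates toward the mean" part concrete, here is a minimal sketch of ridge shrinkage for a single player treated in isolation. This is a simplification: real RAPM fits every player jointly, and the penalty value, sample sizes, and function name below are hypothetical, not taken from any published RAPM.

```python
def ridge_single_player(mean_margin, possessions, lam):
    """Closed-form ridge estimate for one on-court indicator.

    With x = 1 on every possession the player is on the floor, ridge gives
    beta = sum(x*y) / (sum(x*x) + lam) = n * mean / (n + lam).
    """
    return possessions * mean_margin / (possessions + lam)


LAMBDA = 2000  # hypothetical penalty, chosen only for illustration

# Same raw +10 margin, very different sample sizes (both hypothetical).
low_minutes = ridge_single_player(10.0, 500, LAMBDA)
high_minutes = ridge_single_player(10.0, 20000, LAMBDA)

print(round(low_minutes, 2), round(high_minutes, 2))
```

In this toy setup the low-sample estimate is pulled most of the way to zero while the high-sample estimate barely moves. Shrinkage limits how far a noisy estimate can stray, but it does not add information about the low-sample player.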

What makes this even more confusing is that you actually agree with that logic: you say that multi-season RAPM is more reliable than single-season RAPM precisely because everyone in the regression has more data. That’s exactly how career playoff RAPM works. You’re acknowledging the mechanism that improves reliability and then turning around and saying it somehow doesn’t apply in the playoff context. That’s not a different principle; it's the same statistical process operating on a smaller pool. The only way it “wouldn’t work” is if multi-year aggregation somehow made the data less stable, which is impossible.

By contrast, playoff EPM uses box and tracking data that are themselves unstable in small playoff samples. Those stats (like rim deterrence, defensive matchup data, and shot quality) are highly volatile in small samples — and EPM still relies on RAPM-derived priors to stabilize its impact component. So it’s not an independent fix; it’s standing on the same foundation you’re rejecting, just with fewer games and more noise per possession.

The claim that career playoff RAPM “isn’t directed at the relevant question” because this is a “single-year peaks” project doesn’t hold either. If the goal is to estimate true player impact in playoff environments, it makes no sense to privilege a noisier, less-stable metric over one that’s explicitly designed to filter out randomness. Multi-year data doesn’t dilute peak performance — it clarifies whether what looks like a peak was genuine impact or variance.

As for your point about other players’ small samples increasing noise — that’s precisely why multi-year data helps. You’re correct that RAPM’s reliability depends on everyone’s sample size, not just one player’s. But that’s an argument for aggregation, not against it. By pulling multiple years together, you increase the effective sample not only for the player in question but also for every player he shares the floor with. This stabilizes the entire adjustment matrix.

Meanwhile, you continue to give single-playoff EPM, BPM, and even single-series on/off splits some analytical weight — all of which are drastically smaller samples and carry no structural correction for opponent, teammate, or lineup noise. You can’t simultaneously say multi-year RAPM is too noisy to use and then turn around and cite single-year or single-series data as “one of the best measures we have.” That’s internally inconsistent.

The reliability hierarchy is well-established in the literature:
multi-year RAPM > single-year RAPM ≈ hybrid box/impact (like EPM) > raw BPM/on-off in small samples.

So unless there’s actual empirical evidence showing multi-year playoff RAPM performs worse than single-year EPM in out-of-sample prediction or stability tests, dismissing it as “hand-waving” is just selective skepticism.

Bottom line: you’re critiquing multi-year playoff RAPM for the very problem it was built to solve, explicitly agreeing with the statistical logic behind its reliability, and then rejecting it anyway in favor of smaller-sample alternatives that rely on the same priors. That’s not methodological rigor; that’s just arguing in circles.
lessthanjake wrote:Kyrie was extremely impactful without LeBron, and basically had zero impact whatsoever if LeBron was on the court.

lessthanjake wrote: By playing in a way that prevents Kyrie from getting much impact, LeBron ensures that controlling for Kyrie has limited effect…
Doctor MJ
Senior Mod
Senior Mod
Posts: 53,682
And1: 22,631
Joined: Mar 10, 2005
Location: Cali
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#82 » by Doctor MJ » Fri Oct 10, 2025 12:40 am

homecourtloss wrote:
Doctor MJ wrote:
Djoker wrote:
I just noticed this post. Two questions/comments.

1) What is the source of this RAPM as in who created it?

2) The confidence intervals have to be absolutely huge here.


I believe this is Engelmann, who I generally see as a pretty safe bet for doing RAPM pretty appropriately.

With that said, as I've noted before, I've got major concerns about making use of playoff RAPM in my analysis generally.


It's good data to have, especially for the heavy-minutes players. These types of career numbers look pretty good when you then talk about single-year peaks.

Additionally, I see people using single-year playoff EPM without anybody saying "I've got major concerns..."


I'm glad we have the data too, to be clear, I'm just not as gung-ho about it as one would expect simply based on the fact that I use +/- based stats a lot, and I've elaborated on this some in the past.

Here's the essence of the concern as I see it:

In the regular season, basically all the top players we end up talking about are spending their time winning with plenty of off-court time where we see the rest of the players do worse without them on the court.

In the playoffs, when you lose in the first round, you probably spend your time losing while playing a ton of minutes, which means not only that your Off sample is small, but that it tends to be garbage. A player in these circumstances disproportionately is thus going to tend toward mediocre On-Off & RAPM data even if he's not playing any worse than in the regular season, and that's a problem for using this data as a proxy for a player's general value add.
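
The sample-size half of this concern can be put in rough numbers. Below is a minimal sketch, assuming each 100-possession chunk is an independent draw with the same spread; the spread and possession counts are hypothetical, and it does not capture the separate problem that the Off minutes may be unrepresentative garbage time.

```python
import math


def on_off_standard_error(sd_per_100, on_poss, off_poss):
    """Approximate standard error of (on - off) net rating per 100 possessions.

    The variance of a mean over n possessions is taken as sd^2 / (n / 100),
    and the on and off samples are treated as independent.
    """
    var_on = sd_per_100 ** 2 / (on_poss / 100)
    var_off = sd_per_100 ** 2 / (off_poss / 100)
    return math.sqrt(var_on + var_off)


SD = 12.0  # hypothetical spread of net rating per 100 possessions

# Hypothetical star: a full regular season vs. a short first-round exit.
regular_season = on_off_standard_error(SD, on_poss=5000, off_poss=2500)
first_round = on_off_standard_error(SD, on_poss=500, off_poss=100)

print(round(regular_season, 2), round(first_round, 2))
```

Under these assumptions the first-round on-off figure is more than four times as uncertain as the regular-season one, and most of that uncertainty comes from the tiny Off sample.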

To put another way: Much of what makes this data valuable in the regular season is that everyone's playing everyone else and roughly has the same general degree of difficulty, which makes apples-to-apples comparisons plausible.

In the playoffs, even for those having deep playoff runs, they aren't playing the same opponents as other players unless we're talking about their teammates.

But then, even with teammates, the sample makes it fragile. So for example, I'd really like to see Engelmann or someone do another career playoff RAPM after 2025 in part because we're now in a situation where Curry has a higher +/- in fewer minutes than Green. If with this data we see Green & Curry continuing to have roughly the same PS RAPM, that's helpful as it gives an indication that there's enough data for that duo that RAPM isn't going to get jostled very much.

On the other hand, if Curry leaps ahead of Green - after being down at 4.9 while Green was at 6.3 - then that's telling us there's still quite a bit of noise even for guys who've played as much as this.

As I say that, I have to say I'd be a bit weirded out by Green having a higher PS RAPM while having a lower On & On-Off, and so while the stability would say good things about RAPM, I'd want to look more into what's going on there. Every time I compare those two guys I want to seriously entertain the idea that Green is the more valuable of the two, but it's easier to understand how RAPM would say that when the more raw stuff seems to tell a similar story.

Anyway, all this is to say that it's not that I don't want to look at this data, but I have a cautiousness here that isn't simply about "small sample" in quite the same way as it's meant with +/- generally.

Re: "I see people". I believe you, but just keep in mind how toxic it is to assert that everyone is the same in the same bad way.

Re: EPM. I'm not a big EPM proponent, but I will say that the non-+/- part of stats like that makes them more resilient with smaller samples. Not trying to justify all possible contradictions in a poster's behavior, but these types of concerns are why stats like EPM exist.
Getting ready for the RealGM 100 on the PC Board

Come join the WNBA Board if you're a fan!
Doctor MJ
Senior Mod
Senior Mod
Posts: 53,682
And1: 22,631
Joined: Mar 10, 2005
Location: Cali
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#83 » by Doctor MJ » Fri Oct 10, 2025 12:48 am

homecourtloss wrote:What makes this even more confusing is that you actually agree with that logic: you say that multi-season RAPM is more reliable than single-season RAPM precisely because everyone in the regression has more data. That’s exactly how career playoff RAPM works. You’re acknowledging the mechanism that improves reliability and then turning around and saying it somehow doesn’t apply in the playoff context. That’s not a different principle; it's the same statistical process operating on a smaller pool. The only way it “wouldn’t work” is if multi-year aggregation somehow made the data less stable, which is impossible.


So, I appreciated your post generally and want to make sure that's made clear.

I want to make clear that:

a) There is something specifically problematic about treating playoff RAPM as if we should expect all the players in the playoffs to be able to match their RS RAPM simply by playing as well as they normally do.

b) The issue with long sample size in general is that players change. So this is why I recently did studies on both a) career RS RAPM and b) peak 4-year RS RAPM. The former isn't something I'd be looking to use to evaluate peak, while the latter is. The latter isn't perfect either - players can and do change over the course of 4 years - but it does represent a possible "sweet spot" for assessing peak for any player with a stable enough prime.

c) What all this means is that, while in theory if we just get enough sample we'll have a perfect measure, in practice none of this stuff is good enough to be the last step in a good analysis imho.
lessthanjake
Analyst
Posts: 3,477
And1: 3,109
Joined: Apr 13, 2013

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#84 » by lessthanjake » Fri Oct 10, 2025 12:58 am

f4p wrote:
sorry man, but it's too noisy after 150-200 career playoff games is a crazy take.


Again, as I’ve said a bunch of times now, the biggest issue here is that the vast majority of players don’t have that many career playoff games. RAPM basically adjusts for the quality of everyone else who was on the floor. If the amount of data it has about those other people is small, then those adjustments will be really noisy. This will make the measure much more noisy than it would be if everyone had that 150-200 game sample you’re talking about. And even when everyone has that kind of sample, it’s still fairly small for RAPM purposes. The playoff samples for basically everyone you’re talking about are very similar to a two-season sample, and 2-year RAPM is still pretty noisy. But again, this is definitely worse than that, because almost no one in the data set actually even has that size sample. (Note: And yes, this same criticism is very much true of Squared’s partial RAPM as well.)

like every time i try to use a box score stat on this board, i'm treated like a caveman who doesn't understand fire, but then i try to come over to the other side and suddenly the impact stuff is also useless, even on geological time scales.


So the people who post here aren’t a hive mind. I don’t think I’ve treated you like a caveman who doesn’t understand fire when you’ve cited box data. I cite box data too!

I think the issue you’re coming to, though, is that you’re trying to cite each category of data in the situation where they’re probably less reliable than the other type. Again, in large multi-season samples, impact data is better than box data (though I don’t think it’s better enough to just ignore box data). This is because it is getting at everything that happens on the court, while box data inherently is not. But, in small samples, that benefit is outweighed by the negative of impact data being really noisy (and far noisier than box data) in small samples. If you’ve been trying to rely on box data over multi-season samples and impact data in small playoff samples, then you’re kind of just getting it backwards. And that’s not because people arguing with you are being inconsistent. It’d be because the relative usefulness of different types of data genuinely differs depending on how large the sample is.

All that said, I’m not familiar with exactly what you’ve been criticized for by posters here, so if you were criticized for using playoff box data instead of playoff impact data, or for using single-season box data instead of single-season impact data, then I’d disagree with those criticisms.

but of course, in this particular conversation, kevin durant also beats steph in the box score. so it's curious how the guy winning both the stable and noisy parts of the equation is still apparently behind (and don't say this is a peaks project, because we know the same results have and will hold in the Top 100).


So I don’t really think this is exactly right. The basis for your conclusion that Durant beats Steph in playoff impact is a career playoff RAPM measure. But when it comes to box score, Steph has a higher career playoff BPM! And in their years together, Steph had higher playoff RAPM (TheBasketballDatabase gives us three-year playoff RAPM and Steph is ahead of Durant in 2017-2019). On the flip side, Durant is a bit ahead in playoff BPM in those years (and also ahead in the career playoff RAPM you’ve cited to). So basically, if we look at career playoff data, one is ahead in box and the other is ahead in impact. And if we look specifically at the playoffs they played together, the same is true but it is flipped which one is ahead in what. A measure like EPM aims to put box and RAPM together, and it generally has Steph looking a bit better in the playoffs than Durant, though it’s fairly close overall (albeit with Steph’s peak number being a good deal higher).

If that were the only information we had, I think it’d be reasonable to conclude that Durant was generally similarly good in the playoffs as Steph. I think what gets people going more clearly towards Steph is multiple things. First, larger-sample regular-season data strongly indicates Steph is more impactful than Durant, which obviously carries an implication that there’s a good chance the much-smaller-sample and more-noisy playoff impact data underestimates Steph relative to Durant. Second, a lot of people’s eye test prefers Steph in the playoffs due to the gravity stuff and seeing on film how much he opened up things for Durant. Third, Steph has actually won titles without Durant and the opposite is not true. I know you’d say that that’s just due to differences in circumstance/opponents/etc. And that may be partially right (for instance, maybe Durant would have a title outside Golden State if Harden and Kyrie had been healthy in 2021), but I think people do have a view that Durant has played with a lot of talent outside of the Warriors and hasn’t managed to do as much with it as Steph did without Durant.

I think you can get to a more charitable view of Durant in that comparison if you decide (1) you just think the playoffs is different and so Steph being a much more impactful regular season player is irrelevant to playoff performance; (2) your eye test just doesn’t prefer Steph; and (3) you take a particularly positive view of Steph’s supporting casts, a negative view of Durant’s supporting casts, or the opposite about their playoff opponents, such that you conclude that the difference in success without the other is entirely a result of differences in circumstances. If you combine all three beliefs at once, then that’s fine. But I don’t think most people do. To squarely tie this into discussion that’s relevant to this thread (since Steph has long since been voted in), if someone does think like that, then I think that Durant should probably be on their ballot.

or a favorite like nash looks like harden just worse across the board but the best stat BPM is secretly against him. even though there's literally no mathematical reason given why it's so. just the creator of the stat loves nash and says it underrates him. how that's any different than me or you declaring someone to be underrated by a stat is beyond me.


The reason for thinking BPM underestimates Nash is that his large-sample RAPM is dramatically better than his BPM, and large-sample RAPM is a really good stat, so if there’s a big difference between what large-sample RAPM says and what a box metric says, it makes sense to side with the large-sample RAPM. The fact that the creator of BPM feels the same way about it and specifically included it in the About BPM page just shows that even the guy who created the measure thinks that, when they disagree significantly, large-sample RAPM should be believed over his measure.

And since I’ve already explained this to you, I find the statement you’ve made here a bit frustrating. You know full well that what I’ve said is not akin to “me or you declaring someone to be underrated by a stat,” and you also know that the BPM creator’s statement wasn’t “just the creator of the stat lov[ing] nash and say[ing] it underrates him.” This isn’t just people saying the measure has Nash come in too low without any reasoning beyond that they like Nash. It’s that large-sample RAPM is way higher on Nash than BPM is. That’s the “mathematical reason” to think BPM underrates him. And it’s actually a very good reason.

Not only that but it’s actually essentially impossible to have a view of Nash that is consistent with both large-sample RAPM and BPM, so if your view of him is tethered to BPM then I could just as easily say that you think large-sample RAPM is secretly biased in his favor even though there’s no mathematical reason given by you why that’d be the case. In reality, there’s a reason to hold either view of Nash since there’s data supporting either view. The question is just which data we think is more reliable. I’m generally pretty comfortable with thinking large-sample RAPM is more reliable than BPM. And, in this particular case, I’m *even more* comfortable about that, because my eye test is very high on Nash.
OhayoKD wrote:Lebron contributes more to all the phases of play than Messi does. And he is of course a defensive anchor unlike messi.
f4p
Sixth Man
Posts: 1,921
And1: 1,900
Joined: Sep 19, 2021
 

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#85 » by f4p » Fri Oct 10, 2025 1:06 am

Doctor MJ wrote:Re: "I see people". I believe you, but just keep in mind how toxic it is to assert that everyone is the same in the same bad way.


I mean, venting frustration with generic comments like these seems better than naming names, doesn't it? You're mostly in the in-clique of opinion around here so you won't understand, but while this board has a high level of basketball discussion, it's also a small, insular place (only more so over time when we compare recent project results to past ones) prone to what feels like a lot of groupthink, and at some point you get tired of bashing your head against that wall.
homecourtloss
RealGM
Posts: 11,513
And1: 18,902
Joined: Dec 29, 2012

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#86 » by homecourtloss » Fri Oct 10, 2025 2:11 am

lessthanjake wrote:
homecourtloss wrote:
You've cited single playoff EPM numerous times in this project without any of the cautionary "this is basically useless" warnings but you have the most reservations for the multi-year playoffs RAPM. That's called inconsistency to push agendas.


Yeah, instead of proving once again that I was right that you are unable to stop yourself from making rude comments directed at me, you might want to familiarize yourself with my posting history and stop making baseless straw-man attacks. The reality is that I have actually caveated playoff EPM with similar caution, even when it supported my conclusion.

Let’s look at the following, for example, which is a post in which I cited playoff EPM to support my conclusion that Jokic was 2024 POY, but in doing so specifically called out that playoff EPM is reliant on small samples:

https://forums.realgm.com/boards/viewtopic.php?p=113775517#p113775517

Here’s the relevant quote from me: “Playoff EPM bears out Jokic being the best (by a clear margin, though it has Luka slightly above SGA), though obviously that’s reliant on low-sample-size data.”

Here’s another example of me doing a similar thing:

https://forums.realgm.com/boards/viewtopic.php?p=117687856#p117687856

In that post, I provided playoff EPM data but caveated it as follows: “FWIW, while playoff impact data is low-sample-size data that shouldn’t be taken all that seriously even when using box/tracking data to help stabilize it, here’s their playoff EPM in those years.”

It took me all of about 2 minutes to find those posts. You might’ve tried to do the same before you posted rude posts directed at me that were clearly baseless. But you didn’t. And to the extent your criticism is that, even though I have cautioned about playoff EPM when citing to it in the past on these forums, you are upset that I have not specifically done that when I’ve mentioned it in this project, I’d say that that is pretty clearly just a silly criticism. And I’d also say that this is a single-year peaks project, so to some degree it just goes without saying that data relating to a specific playoff run is not super reliable, because that’s just the nature of the beast with single-playoff data but we can’t avoid single-playoff data in this project. The same isn’t really true of career playoff RAPM which I don’t think is very reliable but is also something we can easily avoid using because it is not specific to a given peak year.


I don't want to derail this any further, but now you are trying to excuse what's already been exposed. You have not "caveated" the use of single season EPM the way that you did for multi-year playoffs RAPM even though the latter is more stabilized and more robust.

"though obviously that’s reliant on low-sample-size data" is not anywhere close to what you wrote about multi year RAPM, which you dismissed straight away, while you use noisier single playoff season EPM numbers to evaluate players (even though as I have pointed out, the impact measures from EPM come from RAPM anyway).

lessthanjake wrote:Given how tiny playoff off samples are, I think this is mostly extremely noisy to the point of being not very useful, except perhaps to look at the small number of people with tons of playoff games in this data set (guys like LeBron, Steph, Draymond, Duncan, and Ginobili), and even some of those wouldn’t be much less noisy than single-season RAPM (which is super noisy). And it’s actually probably even worse than just looking at the off minutes would lead you to believe, because, even for those players with lots of playoff minutes, the model will be trying to control for the presence of other players that have a very small amount of playoff data. So, basically, the adjustments the model makes compared to on-off will be mostly based on garbage.
Caneman786
Sophomore
Posts: 218
And1: 207
Joined: Dec 27, 2024
 

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#87 » by Caneman786 » Fri Oct 10, 2025 2:17 am

homecourtloss wrote:The reliability hierarchy is well-established in the literature:
multi-year RAPM > single-year RAPM ≈ hybrid box/impact (like EPM) > raw BPM/on-off in small samples.


Hello homecourtloss,

This is an interesting discussion to me. Where can I read more about this hierarchy in the literature and the advantages of RAPM vs EPM?
lessthanjake
Analyst
Posts: 3,477
And1: 3,109
Joined: Apr 13, 2013

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#88 » by lessthanjake » Fri Oct 10, 2025 2:17 am

homecourtloss wrote:
You’re saying that multi-year playoff RAPM is unreliable because “most players don’t have enough playoff minutes,” but that argument misunderstands what regularization and aggregation are doing (you haven't actually done it so this makes sense).

The entire point of multi-year RAPM is to deal with exactly small-sample noise and interdependence between players. Regularization already accounts for players with limited minutes by shrinking noisy estimates toward the mean, and aggregating across years improves the signal-to-noise ratio. In other words, the “problem” you’re identifying is basically the reason multi-year RAPM exists and why it performs better than single-season or single-playoff samples in predictive validity.

What makes this even more confusing is that you actually agree with that logic: you say that multi-season RAPM is more reliable than single-season RAPM precisely because everyone in the regression has more data. That’s exactly how career playoff RAPM works. You’re acknowledging the mechanism that improves reliability and then turning around and saying it somehow doesn’t apply in the playoff context. That’s not a different principle it's the same statistical process operating on a smaller pool. The only way it “wouldn’t work” is if multi-year aggregation somehow made the data less stable, which is impossible.


This feels like you’re purporting to explain something to me that I already understand, while totally missing the point I was making. You say “That’s not a different principle it’s the same statistical process operating on a smaller pool.” The fact that it’s a smaller pool than multi-season RAPM is exactly the point! Nothing in what I’ve said is suggesting “multi-year aggregation somehow made the data less stable.” Nor am I suggesting that regularization doesn’t shrink noisy estimates towards the mean. What I am saying is that there’s very little data for most players because most players simply haven’t played that many playoff games in their career, so when career playoff RAPM adjusts for others who were on the court in the playoffs, it’s largely adjusting based on noisy data. And this is the case even for players who themselves have actually played a decent number of playoff games. EDIT: I see below that you actually acknowledge that this point is right, so I guess we are in agreement on this.
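
For anyone following along who hasn't worked with RAPM directly, here is a minimal sketch of the shrinkage being argued about. It assumes RAPM is fit as a plain ridge regression on stint-level data. The three stints, the two players A and B, the margins in `y`, and the penalty `lam` are made-up values for illustration, not anything taken from an actual playoff data set.

```python
def ridge(X, y, lam):
    """Solve (X'X + lam*I) beta = X'y with Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented normal equations: [X'X + lam*I | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
         for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for c in range(p):
        # Partial pivoting for numerical safety.
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical stints: columns are players A and B (1 = on the floor).
# A appears in all three stints; B only ever appears alongside A.
X = [[1, 1], [1, 1], [1, 0]]
y = [2.0, 2.0, 1.0]  # made-up net margin per stint

unregularized = ridge(X, y, lam=0.0)  # plain least squares (design is full rank)
shrunk = ridge(X, y, lam=1.0)         # ridge penalty pulls estimates toward zero

print(unregularized)
print(shrunk)
```

In this toy case the unregularized fit credits both players with +1.0, while with `lam=1.0` the estimates are pulled toward zero, to 0.875 for A and 0.75 for B, so the player with fewer stints is shrunk more. Whether shrunk values for thin-sample players are informative enough to serve as controls is exactly what's in dispute above; the sketch only shows the mechanism.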

And, by the way, this is also a significant issue with Squared’s partial RAPM. There are players in Squared’s multi-year RAPM that have large samples, but the samples for their teammates and particularly their opponents are often much smaller, which makes the data much less reliable than it would be if everyone’s sample was much larger.

I don’t think this should be a very controversial point. The only question is exactly how significant it is. I’m not certain exactly how big the issue is, but when even the large playoff samples are mostly ~2 seasons of data, and even normal 2-season RAPM is too noisy for my tastes, I’m pretty confident that career playoff RAPM has a serious noise issue.

I should also note that none of this is even getting to the other big problem with career playoff RAPM, which is that it has the same major problem that Engelmann’s 29-year RAPM has. And that is that doing RAPM over that long a time period doesn’t account for the fact that there are often big differences in how good players are over time. For instance, if you’re on old Shaq’s teams, career RAPM doesn’t understand that he’s not all that good (and conversely, if you’re on peak Shaq’s teams, career RAPM will not quite realize exactly how good he was, because it is lumping in old Shaq’s data in assessing how good he was). This issue is why people tend to use five-year samples for RAPM instead of going longer than that. If you go longer than that, you’re just handwaving a lot of changes in player quality, which introduces a lot of error. So this is an issue that adds even more noise/issues to the adjustments made in 29-year career playoff RAPM. To be fair, I’m not sure whether Engelmann did anything to try to mitigate this issue, but my understanding is that he did not do any aging curve, so I don’t think he did.

By contrast, playoff EPM uses box and tracking data that are themselves unstable in small playoff samples. Those stats (like rim deterrence, defensive matchup data, and shot quality) are highly volatile in small samples — and EPM still relies on RAPM-derived priors to stabilize its impact component. So it’s not an independent fix; it’s standing on the same foundation you’re rejecting, just with fewer games and more noise per possession.


That other data is not as noisy in small samples as RAPM is. That’s precisely the point of hybrid measures! They seek to estimate impact in small samples by taking noisy small-sample RAPM and stabilizing it with less noisy box data. The fact that even box data isn’t super stable in the small sample of the playoffs is part of why there’s really no great measure to assess single-playoff performance. But we’re in a single-year peaks project, so there’s kind of no way around that issue. We do actually have to discuss and try to assess/compare how well players played in single playoff runs. There’s no amazing way to do that because the sample size is small. Playoff EPM is one of the least bad options to do that (as is playoff BPM), and that’s why I’ve been using it. In contrast, a single-year peaks project doesn’t actually require us to use career playoff RAPM despite its flaws.
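
A rough way to see why blending a noisy impact estimate with a steadier box estimate can help: if you treat the two as independent, unbiased estimates of the same thing, inverse-variance weighting gives a combined estimate with lower variance than either input. This is only a sketch of that general idea. It is not a description of how EPM is actually built, and the numbers and the function name `blend` are made up.

```python
def blend(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted average of two independent estimates.

    Returns (combined_estimate, combined_variance).
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    combined = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    return combined, 1.0 / (w_a + w_b)

# Hypothetical single-playoff numbers for one player:
impact_est, impact_var = 8.0, 16.0  # noisy impact-style estimate
box_est, box_var = 4.0, 4.0         # steadier box-style estimate

hybrid_est, hybrid_var = blend(impact_est, impact_var, box_est, box_var)
print(hybrid_est, hybrid_var)
```

Here the blend lands at about 4.8 with a variance of about 3.2, below both 16 and 4. The catch is the unbiasedness assumption: if the box component is systematically off for a particular player, the blend inherits part of that bias.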

The claim that career playoff RAPM “isn’t directed at the relevant question” because this is a “single-year peaks” project doesn’t hold either. If the goal is to estimate true player impact in playoff environments, it makes no sense to privilege a noisier, less-stable metric over one that’s explicitly designed to filter out randomness. Multi-year data doesn’t dilute peak performance — it clarifies whether what looks like a peak was genuine impact or variance.


Okay, so I just definitely don’t agree that career playoff RAPM is more than tangentially relevant to assessing how well a player played in one particular playoff run. It’d be like using Engelmann’s 29-year RAPM to assess how good someone was in their peak year. It’s so non-specific and so dependent on data from years where the players often were significantly less good than they were in their peak year that it’s just not super relevant.

You say “If the goal is to estimate true player impact in playoff environments,” but that isn’t the exercise here. If we were trying to assess how good a player was as a playoff player in general in their career, then yeah, I agree that career playoff RAPM would be on point (but still a flawed measure, of course). But we’re essentially aiming to assess how well a player played in the playoffs in one particular playoff run, not to generally assess their “impact in playoff environments.” For the exercise here, the most directly on-point thing would be data specifically about the one playoff run in question. One might reasonably look at what happened in other playoff runs to expand out the sample, but even with that approach it’s pretty extreme to expand out the sample to every playoff game the player ever played (regardless of whether they were even in their prime years). It’s not *completely* irrelevant, but it’s definitely tangential.

As for your point about other players’ small samples increasing noise — that’s precisely why multi-year data helps. You’re correct that RAPM’s reliability depends on everyone’s sample size, not just one player’s. But that’s an argument for aggregation, not against it. By pulling multiple years together, you increase the effective sample not only for the player in question but also for every player he shares the floor with. This stabilizes the entire adjustment matrix.


Okay, so I see here that you’re agreeing with my point about other players’ small samples increasing noise. Which makes me confused as to what you’re arguing with me about.

If the question were whether single-playoff RAPM is more or less noisy than career playoff RAPM, then the answer is definitely that single-playoff RAPM is more noisy. No question about it. But I’m not entirely certain that a single-playoff hybrid measure is more noisy than career playoff RAPM. A single-playoff hybrid measure has a very noisy RAPM component and stabilizes it with less noisy box data. Does that box data stabilize the measure more or less than the aggregation of more data does for career playoff RAPM? I’m not sure. And, beyond the question of noise specifically, how much of a problem is introduced by the fact that career playoff RAPM handwaves changes in player quality over time? I’m not sure exactly how big a problem that is either, but it does seem significant! Ultimately, I think both playoff EPM and career playoff RAPM are subject to a good deal of serious issues. And, as you should be aware of now, I’ve called out those issues multiple times regarding each measure. But, to repeat myself, the reason I’d use playoff EPM in this project and not career playoff RAPM is that playoff EPM is a flawed measure that is directly on point to a peaks project, while career playoff RAPM is a flawed measure that is not actually directly on point.

Meanwhile, you continue to give single-playoff EPM, BPM, and even single-series on/off splits some analytical weight — all of which are drastically smaller samples and carry no structural correction for opponent, teammate, or lineup noise. You can’t simultaneously say multi-year RAPM is too noisy to use and then turn around and cite single-year or single-series data as “one of the best measures we have.” That’s internally inconsistent.


Yeah, so I think the way you’re using that quote of me is really misleading. Here’s the full sentence that that came from: “But if the question is how a player performed in a particular playoffs (which is certainly the most relevant question in a peaks project), then it’s one of the best measures we have for that, despite not being all that great.” You’re criticizing me for saying certain single-year data is “one of the best measures we have” because you say it’s better to have multi-year data, but I was explicitly saying it is one of the best measures we have specifically for making a *single-year* assessment!

Saying playoff EPM is one of the best measures we have for assessing how a player performed *in a particular playoffs* while also saying that it’s not all that great a measure is definitely not inconsistent with saying that career playoff RAPM is really noisy. Because, after all, career playoff RAPM is absolutely *not* a measure for assessing how a player performed in a particular playoffs. I feel like if you did the work to pull out the quote of that exact phrase, you should’ve been aware that you were deleting extremely important context that made your use of the quote misleading.

The reliability hierarchy is well-established in the literature:
multi-year RAPM > single-year RAPM ≈ hybrid box/impact (like EPM) > raw BPM/on-off in small samples.


Umm…I definitely don’t think single-year RAPM is more reliable than single-season hybrid data. The fact that single-year RAPM is incredibly noisy is precisely why hybrid measures exist! Maybe I’m misunderstanding you here, but that just seems like quite a hot take.

Bottom line: you’re critiquing multi-year playoff RAPM for the very problem it was built to solve, explicitly agreeing with the statistical logic behind its reliability, and then rejecting it anyway in favor of smaller-sample alternatives that rely on the same priors. That’s not methodological rigor — that’s just arguing in circles


Nope, I’m critiquing career playoff RAPM for having a problem that cannot be solved (playoffs are just a small sample for virtually every player) and refusing to use it for a question that it is not at all aimed at assessing (i.e. how well a player played in the playoffs in their peak year), and I have used data that is also flawed but is at least actually aimed at assessing the relevant question (and, despite being flawed, is one of the best measures we have to get at that specific question).
OhayoKD wrote:Lebron contributes more to all the phases of play than Messi does. And he is of course a defensive anchor unlike messi.
lessthanjake
Analyst
Posts: 3,477
And1: 3,109
Joined: Apr 13, 2013

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#89 » by lessthanjake » Fri Oct 10, 2025 2:23 am

homecourtloss wrote:
lessthanjake wrote:
homecourtloss wrote:
You've cited single playoff EPM numerous times in this project without any of the cautionary "this is basically useless" warnings but you have the most reservations for the multi-year playoffs RAPM. That's called inconsistency to push agendas.


Yeah, instead of proving once again that I was right that you are unable to stop yourself from making rude comments directed at me, you might want to familiarize yourself with my posting history and stop making baseless straw-man attacks. The reality is that I have actually caveated playoff EPM with similar caution, even when it supported my conclusion.

Let’s look at the following, for example, which is a post in which I cited playoff EPM to support my conclusion that Jokic was 2024 POY, but in doing so specifically called out that playoff EPM is reliant on small samples:

https://forums.realgm.com/boards/viewtopic.php?p=113775517#p113775517

Here’s the relevant quote from me: “Playoff EPM bears out Jokic being the best (by a clear margin, though it has Luka slightly above SGA), though obviously that’s reliant on low-sample-size data.”

Here’s another example of me doing a similar thing:

https://forums.realgm.com/boards/viewtopic.php?p=117687856#p117687856

In that post, I provided playoff EPM data but caveated it as follows: “FWIW, while playoff impact data is low-sample-size data that shouldn’t be taken all that seriously even when using box/tracking data to help stabilize it, here’s their playoff EPM in those years.”

It took me all of about 2 minutes to find those posts. You might’ve tried to do the same before you posted rude posts directed at me that were clearly baseless. But you didn’t. And to the extent your criticism is that, even though I have cautioned about playoff EPM when citing to it in the past on these forums, you are upset that I have not specifically done that when I’ve mentioned it in this project, I’d say that that is pretty clearly just a silly criticism. And I’d also say that this is a single-year peaks project, so to some degree it just goes without saying that data relating to a specific playoff run is not super reliable, because that’s just the nature of the beast with single-playoff data but we can’t avoid single-playoff data in this project. The same isn’t really true of career playoff RAPM which I don’t think is very reliable but is also something we can easily avoid using because it is not specific to a given peak year.


I don't want to derail this any further, but now you are trying to excuse what's already been exposed. You have not "caveated" the use of single season EPM the way that you did for multi-year playoffs RAPM even though the latter is more stabilized and more robust.

"though obviously that’s reliant on low-sample-size data" is not anywhere close to what you wrote about multi year RAPM, which you dismissed straight away, while you use noisier single playoff season EPM numbers to evaluate players (even though as I have pointed out, the impact measures from EPM come from RAPM anyway).

lessthanjake wrote:Given how tiny playoff off samples are, I think this is mostly extremely noisy to the point of being not very useful, except perhaps to look at the small number of people with tons of playoff games in this data set (guys like LeBron, Steph, Draymond, Duncan, and Ginobili), and even some of those wouldn’t be much less noisy than single-season RAPM (which is super noisy). And it’s actually probably even worse than just looking at the off minutes would lead you to believe, because, even for those players with lots of playoff minutes, the model will be trying to control for the presence of other players that have a very small amount of playoff data. So, basically, the adjustments the model makes compared to on-off will be mostly based on garbage.


Okay, I’m sorry but this is just crazy. First you assert that I haven’t provided caveats or cautions regarding playoff EPM. Then, when confronted with the fact that that was completely untrue, you not only don’t apologize but instead move on to policing the exact wording I used to caveat different types of data? And apparently you’ve concluded that I should be criticized for not adequately caveating playoff EPM when I have provided the following strongly-worded caution about playoff EPM: “playoff impact data is low-sample-size data that shouldn’t be taken all that seriously even when using box/tracking data to help stabilize it”??? I think this is very clearly the behavior of someone who is just reaching in whatever way he can to criticize me. Just wild stuff, to be honest. I think it may be worth taking a step back and really assessing whether you honestly think you’re making fair points here (and also whether constantly chiming in to criticize a particular poster’s approach—often in rude terms—is a good use of your time).
Cavsfansince84
RealGM
Posts: 15,219
And1: 11,619
Joined: Jun 13, 2017
   

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#90 » by Cavsfansince84 » Fri Oct 10, 2025 2:24 am

f4p wrote:
the methodology is a huge part of it, though. if we're just going by who won a title, that's going to give different results than who played better. if we're throwing out playoff impact data, that's going to give a different result than using playoff impact data.

what i'm not open minded to is arguments that eric gordon and clint capela are each better than draymond green.


I mean I think it's fine to discuss methodology to some degree and I do value years in which guys win titles but a. it's a pretty small group who are the best player on title teams to begin with and b. I tend to try and use all available data. That's why I voted for non-title years for both LeBron and Kawhi. We all have our own idea(s) of what makes a season stand out and seem better than another or one player better than another.
eminence
RealGM
Posts: 17,123
And1: 11,909
Joined: Mar 07, 2015

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#91 » by eminence » Fri Oct 10, 2025 2:26 am

Does reliability not mean reliability any more?

The entire point of the xRAPM variants is that they smack the more pure apm/rapms in reliability.
I bought a boat.
Doctor MJ
Senior Mod
Senior Mod
Posts: 53,682
And1: 22,631
Joined: Mar 10, 2005
Location: Cali
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#92 » by Doctor MJ » Fri Oct 10, 2025 3:57 am

f4p wrote:
Doctor MJ wrote:Re: "I see people". I believe you, but just keep in mind how toxic it is to assert that everyone is the same in the same bad way.


I mean venting frustration with generic comments like these seems better than naming names doesn't it? You're mostly in the in-clique of opinion around here so you won't understand, but while this board has a high level of basketball discussion, it's also a small, insular place (only moreso over time when we compare recent project results to past ones) prone to what feels like a lot of groupthink and at some point you get tired of bashing your head against that wall.


Well, I do understand frustration, as well as the frustration of estrangement within a community. Hard to stay positive.
Getting ready for the RealGM 100 on the PC Board

Come join the WNBA Board if you're a fan!
Doctor MJ
Senior Mod
Senior Mod
Posts: 53,682
And1: 22,631
Joined: Mar 10, 2005
Location: Cali
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#93 » by Doctor MJ » Fri Oct 10, 2025 4:04 am

eminence wrote:Does reliability not mean reliability any more?

The entire point of the xRAPM variants is that they smack the more pure apm/rapms in reliability.


Yeah seems like a good moment to bust out the old archery diagram:

[Image: archery diagram]

By validity: APM > RAPM > XRAPM > Box
By reliability: Box > XRAPM > RAPM > APM
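
One way to put numbers on the archery picture, assuming validity is read as low bias and reliability as low variance (that mapping is an assumption for this sketch, not something stated above): expected squared error is bias squared plus variance, so a less valid but steadier measure can still come out ahead in a small sample. The bias and standard deviation values below are invented purely for illustration.

```python
def mse(bias, sd):
    """Expected squared error of an estimator: bias^2 + variance."""
    return bias ** 2 + sd ** 2

def sd_at(sd_one_unit, n_units):
    """Standard deviation after averaging n independent units of data."""
    return sd_one_unit / n_units ** 0.5

# Hypothetical metrics: (bias, sd with one unit of data)
valid_but_noisy = (0.0, 4.0)
reliable_but_biased = (2.0, 1.0)

for n in (1, 4, 16, 64):
    a = mse(valid_but_noisy[0], sd_at(valid_but_noisy[1], n))
    b = mse(reliable_but_biased[0], sd_at(reliable_but_biased[1], n))
    print(n, round(a, 3), round(b, 3))
```

With these made-up numbers the biased-but-steady measure has the lower error on one unit of data (5 vs 16), and the unbiased-but-noisy one overtakes it by four units (4 vs 4.25). Where that crossover sits for real metrics is an empirical question the sketch can't answer.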

I miss long-sample classic, un-regularized, un-Xploited APM studies.
f4p
Sixth Man
Posts: 1,921
And1: 1,900
Joined: Sep 19, 2021
 

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#94 » by f4p » Fri Oct 10, 2025 5:33 am

lessthanjake wrote:If you’ve been trying to rely on box data over multi-season samples and impact data in small playoff samples, then you’re kind of just getting it backwards.


I've literally been arguing for the stability of box score data for like 3 years here. It never seems to break through to most people.


So I don’t really think this is exactly right. The basis for your conclusion that Durant beats Steph in playoff impact is a career playoff RAPM measure. But when it comes to box score, Steph has a higher career playoff BPM!


The box score isn't just BPM. That's literally just another stat ending in PM. If all the stats end in PM, they're probably mostly the same thing.

Either way, Durant's best stretch (age 22 to 32) is 25 PER, 0.205 WS48, 7.4 BPM while Steph's (age 26 to 36) is 23.5 PER, 0.195 WS48, 7.4 BPM. So slight advantages for KD. And as I mentioned in this thread, in terms of the higher end, KDs 6th best PER exceeds Steph's 2nd best, his 7th best WS48 beats Steph's 3rd, and his 6th best BPM is basically tied with Steph's 3rd and same with TS%. So i guess KD must be giving you some lower end playoffs as well but more high end ones in general, including beating Steph in all 3 stats in all 3 of their seasons together (by fairly huge margins in 2019).


Second, a lot of people’s eye test prefers Steph in the playoffs due to the gravity stuff and seeing on film how much he opened up things for Durant.


Well yes, Steph fans do tend to use gravity to explain everything. Since it's a somewhat vague quality that can be what we want it to be. But again it's weird that Steph is the one who opened it up for Durant when Steph is the one who saw by far his best playoff stats in 2017 and even has that considered his peak year. It's like KD playing well next to Steph is because of Steph but Steph playing his best ever next to KD is...also because of Steph.

Third, Steph has actually won titles without Durant and the opposite is not true. I know you’d say that that’s just due to differences in circumstance/opponents/etc. And that may be partially right (for instance, maybe Durant would have a title outside Golden State if Harden and Kyrie had been healthy in 2021), but I think people do have a view that Durant has played with a lot of talent outside of the Warriors and hasn’t managed to do as much with it as Steph did without Durant.


A bunch of unhealthy or non prime talent. And no injured opponents I can think of either. I mean we can't just be making rings arguments on this board. Steph got klay and dray (and KD) during their prime years and they never missed playoff games and the one time someone did (KD) they didn't win (or I guess Draymond missing all of 1 game also made them not win). Not only do we have the Kyrie/love and harden/Kyrie injuries helping Steph/hurting KD but I didn't even mention 2013, when the thunder were a +9 team and then Westbrook got hurt 4 games into the playoffs. Another great chance down the drain. Having basically all of the best non-warriors years for KD being ruined by injury kind of explains a lot. Steph didn't have any serious chances ruined by other people's injuries.

I think you can get to a more charitable view of Durant in that comparison if you decide (1) you just think the playoffs is different and so Steph being a much more impactful regular season player is irrelevant to playoff performance; (2) your eye test just doesn’t prefer Steph; and (3) you take a particularly positive view of Steph’s supporting casts


And I do. And so should everyone. We can't be talking about voting Draymond as a top 20 peak over some guys who actually won MVPs and then see guys like klay or a high impact guy like Iggy on the warriors and conclude that the warriors supporting cast was anything but amazing. Or seeing that the 2015 to 2019 warriors had a better playoff winning percentage and better postseason SRS WITHOUT Steph than with him.


a negative view of Durant’s supporting casts, or the opposite about their playoff opponents, such that you conclude that the difference in success without the other is entirely a result of differences in circumstances. If you combine all three beliefs at once, then that’s fine. But I don’t think most people do.


Well yes, because Steph is a generationally-liked athlete, probably having a higher Q score or whatever it's called than any NBA player since Jordan, and KD gets a ton of hate. I mean people literally call KD soft for wanting an easy title with the warriors and then don't say the same thing about Steph, LIKE THEY DIDN'T PLAY ON THE SAME TEAM! Like that's some crazy mass cognitive dissonance.


or a favorite like nash looks like harden just worse across the board but the best stat BPM is secretly against him. even though there's literally no mathematical reason given why it's so. just the creator of the stat loves nash and says it underrates him. how that's any different than me or you declaring someone to be underrated by a stat is beyond me.


The reason for thinking BPM underestimates Nash is that his large-sample RAPM is dramatically better than his BPM, and large-sample RAPM is a really good stat, so if there’s a big difference between what large-sample RAPM says and what a box metric says, it makes sense to side with the large-sample RAPM. The fact that the creator of BPM feels the same way about it and specifically included it in the About BPM page just shows that even the guy who created the measure thinks that, when they disagree significantly, large-sample RAPM should be believed over his measure.


Then why quote BPM at all? You can't create a stat and then say "well if it doesn't line up with this other stat, just use the other stat." That'd be like one of those hurricane model people being like "Wait, Model A said something different, oh yeah then just ignore ours, you guys are gonna be fine." Presumably BPM is supposed to add to the conversation, not just copy it or be ignored when convenient. BPM was made how it was made to predict/model what it could predict/model. It likes some people and doesn't like others. It doesn't like Nash, end of story. Don't throw it out.

And since I’ve already explained this to you, I find the statement you’ve made here a bit frustrating. You know full well that what I’ve said is not akin to “me or you declaring someone to be underrated by a stat,” and you also know that the BPM creator’s statement wasn’t “just the creator of the stat lov[ing] nash and say[ing] it underrates him.” This isn’t just people saying the measure has Nash come in too low without any reasoning beyond that they like Nash. It’s that large-sample RAPM is way higher on Nash than BPM is. That’s the “mathematical reason” to think BPM underrates him. And it’s actually a very good reason.


Then BPM is just RAPM. And so is EPM. And most of the others. If everything valuable is apparently just trying to mimic RAPM, then the whole analysis is just RAPM. Which is also only a regular season stat. It's too limited.
lessthanjake
Analyst
Posts: 3,477
And1: 3,109
Joined: Apr 13, 2013

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#95 » by lessthanjake » Fri Oct 10, 2025 1:36 pm

f4p wrote:

So I don’t really think this is exactly right. The basis for your conclusion that Durant beats Steph in playoff impact is a career playoff RAPM measure. But when it comes to box score, Steph has a higher career playoff BPM!


The box score isn't just BPM. That's literally just another stat ending in PM. If all the stats end in PM, they're probably mostly the same thing.

Either way, Durant's best stretch (age 22 to 32) is 25 PER, 0.205 WS48, 7.4 BPM while Steph's (age 26 to 36) is 23.5 PER, 0.195 WS48, 7.4 BPM. So slight advantages for KD. And as I mentioned in this thread, in terms of the higher end, KDs 6th best PER exceeds Steph's 2nd best, his 7th best WS48 beats Steph's 3rd, and his 6th best BPM is basically tied with Steph's 3rd and same with TS%. So i guess KD must be giving you some lower end playoffs as well but more high end ones in general, including beating Steph in all 3 stats in all 3 of their seasons together (by fairly huge margins in 2019).


Okay, so you’re again changing the timeframes depending on what you’re talking about. You’re saying Durant is better in playoff impact and playoff box but using a career playoff impact number while not using career box numbers. Which seems pretty relevant, since Steph is actually ahead in two of the box measures you list if you’d actually taken their entire playoff career.

And yeah, box score isn’t just BPM, but it’s actually widely considered a better measure than WS or PER (primarily since it is aimed at correlating with large-sample RAPM rather than just having completely arbitrary weights). Also, from what I can see (which is everything but 2024 and 2025), Steph also looks slightly better in Thinking Basketball’s playoff BPM over the course of their playoff careers (and with a peak year that is higher by a minuscule looks-the-same-after-rounding amount). That is another box measure that is widely considered better than WS or PER. I wouldn’t say WS or PER are totally useless, but those other measures are better. In any event, though, all these measures have their box numbers very close either way, so the idea that Durant is materially ahead in playoff box data is just wrong. And ultimately, in order to say Durant looks better in playoff impact and playoff box measures, you’re having to do some sleights of hand to talk about different timeframes for each, because, as I said, they basically flip who is ahead in what depending on the timeframe used.

Second, a lot of people’s eye test prefers Steph in the playoffs due to the gravity stuff and seeing on film how much he opened up things for Durant.


Well yes, Steph fans do tend to use gravity to explain everything. Since it's a somewhat vague quality that can be what we want it to be. But again it's weird that Steph is the one who opened it up for Durant when Steph is the one who saw by far his best playoff stats in 2017 and even has that considered his peak year. It's like KD playing well next to Steph is because of Steph but Steph playing his best ever next to KD is...also because of Steph.


Yeah, I think Durant being on the team made things easier for Steph. It’s less manifestly obvious on the film than Steph making things easier for Durant, but it’s surely true. This is reflective of the synergy that they had together. And that kind of synergy between two great players is really not guaranteed and should improve our assessment of *both* guys. For present purposes (since Steph has rightly long since been voted in), I think that being able to take an amazing team and make them even better (particularly in 2017) through having good synergy with the other major superstar on the team is definitely a feather in Durant’s cap. And it’s a particularly relevant feather in his cap for a peaks project, because 2017 is probably the year most people will identify as his peak.

Third, Steph has actually won titles without Durant and the opposite is not true. I know you’d say that that’s just due to differences in circumstance/opponents/etc. And that may be partially right (for instance, maybe Durant would have a title outside Golden State if Harden and Kyrie had been healthy in 2021), but I think people do have a view that Durant has played with a lot of talent outside of the Warriors and hasn’t managed to do as much with it as Steph did without Durant.


A bunch of unhealthy or non prime talent. And no injured opponents I can think of either. I mean we can't just be making rings arguments on this board. Steph got klay and dray (and KD) during their prime years and they never missed playoff games and the one time someone did (KD) they didn't win (or I guess Draymond missing all of 1 game also made them not win). Not only do we have the Kyrie/love and harden/Kyrie injuries helping Steph/hurting KD but I didn't even mention 2013, when the thunder were a +9 team and then Westbrook got hurt 4 games into the playoffs. Another great chance down the drain. Having basically all of the best non-warriors years for KD being ruined by injury kind of explains a lot. Steph didn't have any serious chances ruined by other people's injuries.


The bolded is just completely insane. Like, Steph obviously had a very serious chance ruined by Durant getting injured in the 2019 playoffs. And if you want to discount that because we’re talking about what they did without the other, then you’re still confronted with the obvious fact that Klay Thompson got injured in those finals too (and, in the games Klay played, the finals were 2-2 and the Warriors ahead in another game before Klay went out—so we have good reason to think they could’ve won that year with Durant injured if Klay hadn’t gotten injured too). That’s not even mentioning that Durant being injured is obviously worse than Durant not being on the team, because the team’s depth had been eviscerated due to the salaries of the top guys including Durant. So yeah, the bolded is a completely insane assertion that I think should cause you to step back and think about whether you’re making fair points or just being a hater.

That’s also not even getting into the fact that Klay Thompson did not play in 2021—the year before they won another title. Nor is it mentioning that Klay was never the same after the injury. Of course, Steph doesn’t actually need to use that as an excuse, though, because he won a title anyways, despite one of his best teammates turning into a shell of himself after awful injuries.

But yeah, to get on topic and away from discussion rooted in gratuitous Steph Curry hating despite Steph already being voted in, I do think it’s fair to point out that some of Durant’s years have been marred by injuries to other players on his team. That’s certainly true in 2013 and 2021, which are both years where it’s plausible his team could’ve won a title if healthy. He has actually had some other chances on some genuinely talented teams outside Golden State, though, and unlike other players, he did not convert on them. If you want to say that he just faced abnormally tough opponents in those years, then that’s not a crazy point. His teams did lose to the title-winner in a bunch of years. But you only play the title winner if you lose to them, and most people will naturally prefer winning titles over repeatedly losing to really good teams, even if the latter isn’t really a black mark.

or a favorite like nash looks like harden just worse across the board but the best stat BPM is secretly against him. even though there's literally no mathematical reason given why it's so. just the creator of the stat loves nash and says it underrates him. how that's any different than me or you declaring someone to be underrated by a stat is beyond me.


The reason for thinking BPM underestimates Nash is that his large-sample RAPM is dramatically better than his BPM, and large-sample RAPM is a really good stat, so if there’s a big difference between what large-sample RAPM says and what a box metric says, it makes sense to side with the large-sample RAPM. The fact that the creator of BPM feels the same way about it and specifically included it in the About BPM page just shows that even the guy who created the measure thinks that, when they disagree significantly, large-sample RAPM should be believed over his measure.


Then why quote BPM at all? You can't create a stat and then say "well if it doesn't line up with this other stat, just use the other stat." That'd be like one of those hurricane model people being like "Wait, Model A said something different, oh yeah then just ignore ours, you guys are gonna be fine." Presumably BPM is supposed to add to the conversation, not just copy it or be ignored when convenient. BPM was made how it was made to predict/model what it could predict/model. It likes some people and doesn't like others. It doesn't like Nash, end of story. Don't throw it out.

And since I’ve already explained this to you, I find the statement you’ve made here a bit frustrating. You know full well that what I’ve said is not akin to “me or you declaring someone to be underrated by a stat,” and you also know that the BPM creator’s statement wasn’t “just the creator of the stat lov[ing] nash and say[ing] it underrates him.” This isn’t just people saying the measure has Nash come in too low without any reasoning beyond that they like Nash. It’s that large-sample RAPM is way higher on Nash than BPM is. That’s the “mathematical reason” to think BPM underrates him. And it’s actually a very good reason.


Then BPM is just RAPM. And so is EPM. And most of the others. If everything valuable is apparently just trying to mimic RAPM, then the whole analysis is just RAPM. Which is also only a regular season stat. It's too limited.


I think you’re actually close to getting it. BPM is not just RAPM, but it is specifically designed to approximate/correlate with large-sample RAPM. It is designed that way because large-sample RAPM is really good. You’re essentially asking what the point of BPM is if it’s just meant to mimic RAPM. Well, the answer to that is that it can be used in situations where RAPM is either unavailable or cannot get to the precise question without being too noisy. So, for instance, if we’re trying to assess how well a player played in one particular season, large-sample RAPM is not specific to that question, and single-season RAPM is really noisy. So RAPM isn’t all that great a measure to answer that question. Therefore, it might make sense to instead use a measure that’s not nearly as noisy as RAPM in that sample size and that we know correlates with large-sample RAPM. That’s what BPM provides. Furthermore, BPM’s biggest use case over RAPM is probably the fact that it exists prior to the play-by-play era. RAPM doesn’t exist prior to 1997. So if we are trying to assess the impact of a player prior to 1997, there’s no option to use RAPM but what we can do is use a box measure that we know correlates well with large-sample RAPM. Again, that’s what BPM provides. So yeah, even if large-sample RAPM is better than BPM, that does not mean that there’s no point in BPM.

To circle back to the actual subject we were talking about, I think the best response you could give here is to say that you had been talking about Nash’s playoff BPM, and the playoffs are a scenario where we don’t have a large enough sample for RAPM so isn’t BPM actually the best thing to use to assess Nash’s playoff performance? I think that’d be a fair point. The issue I raised is that we basically can be pretty certain that BPM underestimates Nash in large samples, so it seems reasonable to think it underestimates him in smaller samples too. It’s possible it doesn’t though! It’s possible that his impact went down a lot in the playoffs without his BPM going down (note: his RS and playoff BPM in the 2005 year I’ve voted for was 4.7 for both, and his BPM on the Suns in the playoffs from 2005-2010 was 4.3 while it was 4.4 in the regular season in that timeframe). In that case, he wouldn’t be underestimated by BPM as much in the playoffs as he was in the regular season. But, for me, that’s where the eye test comes in—since I watched every single playoff game Nash played for the Suns, and what I saw from him was amazing. The other thing is that the Suns offense in the playoffs was objectively incredible—even better than it was in the regular season. Indeed, the Suns rORTG with Nash on the floor in the playoffs was a ridiculous +12.58 while it was a lower-but-still-incredible +10.65 in the regular season. And in the 2005 playoffs specifically, Nash had a better on-court rORTG in every single playoff series than he had had in the regular season (which is made all the more remarkable by the fact that Nash’s on-court rORTG in the 2005 regular season is the highest on record). Given that Nash had a ton of control over the Suns offense, the Suns getting even better offensively in the playoffs does not seem consistent with Nash actually having his impact go down in the playoffs. 
So yeah, I look at this and think that the most likely thing going on here is that playoff BPM is underestimating Nash just like regular-season BPM pretty clearly does in larger samples.
OhayoKD wrote:Lebron contributes more to all the phases of play than Messi does. And he is of course a defensive anchor unlike messi.
Doctor MJ
Senior Mod
Posts: 53,682
And1: 22,631
Joined: Mar 10, 2005
Location: Cali
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#96 » by Doctor MJ » Fri Oct 10, 2025 2:23 pm

f4p wrote:Then BPM is just RAPM. And so is EPM. And most of the others. If everything valuable is apparently just trying to mimic RAPM, then the whole analysis is just RAPM. Which is also only a regular season stat. It's too limited.


If you look at my archery post, note that the stats that are most valid are the least reliable and vice versa, which has everything to do with why we can't use just one.

I also want to note that I see BPM as in the same category as PER, WS, WP, and even PIPM. All of this is basically just attaching weights to things that are either in the traditional box score, or could be attached to a player's box score as kinds of production. Doesn't mean they are equally good - Ben did a retrodiction study years back where he pitted stats like this against something that just factored in team SRS & individual MP, and PER & WP were actually worse than the dummy stat - but in the end, they have the same core concept and are just using different methods for determining the weights.

And I'll say that in the long term, having a PIPM based on detailed player tracking is one of the things I see the most potential for. It can never be more valid than APM, but it can not only approximate that while having much greater reliability, it can also be used to develop sub-holistic component values which will be excellent for evaluating player combinations when considering player acquisitions.
Getting ready for the RealGM 100 on the PC Board

Come join the WNBA Board if you're a fan!
lessthanjake
Analyst
Posts: 3,477
And1: 3,109
Joined: Apr 13, 2013

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#97 » by lessthanjake » Fri Oct 10, 2025 2:58 pm

Doctor MJ wrote:
f4p wrote:Then BPM is just RAPM. And so is EPM. And most of the others. If everything valuable is apparently just trying to mimic RAPM, then the whole analysis is just RAPM. Which is also only a regular season stat. It's too limited.


If you look at my archery post, note that the stats that are most valid are the least reliable and vice versa, which has everything to do with why we can't use just one.


Yeah, ultimately the choice of what types of measures to rely most on in different situations comes down to an assessment of the tradeoff between reliability and validity in that particular scenario.

The way I conceptualize it is basically that box stats have an advantage in reliability and a disadvantage in validity, but their reliability advantage gets smaller and smaller the larger the sample. So, at a certain sample size, the advantage is outweighed by the disadvantage. This leads to an intuition that APM-style stats are better in larger samples, while box stats are better in smaller samples. This gets a bit complicated by the fact that I do think that APM-style stats start losing a bit of validity the larger the sample is (because they start handwaving more and more changes in player quality over time). I’m not really sure how big that effect is, but it does give reason to not prefer the largest possible samples for APM-style stats, because at a certain point the increase in reliability by expanding the sample is outweighed by the decrease in validity. Whether this decrease in validity in really large samples is enough to actually make box stats better than APM-style stats when the samples get super large is a valid question that I don’t feel strongly about. The final thing I’d note is that, with APM-style stats, we can also turn up the reliability and turn down the validity by using a sample that covers a larger timeframe than the question at hand calls for (for instance, we might prefer using three-year RAPM to assess single-year impact, if we think the increase in reliability from the larger sample outweighs the decrease in validity from measuring based on a couple years that aren’t actually at issue). We can do this with box stats too, but it’s less useful a tool, since the increase in reliability isn’t as significant when you increase the sample for box stats.
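
To make the tradeoff concrete, here's a rough toy sketch (not how any real metric is computed). It treats a "box" stat as having a fixed bias but little noise, and an "APM" stat as unbiased but very noisy, and compares expected squared error as the sample grows. Every number and name in it is made up for illustration, and it leaves out the point about validity eroding in very large samples.

```python
# Toy model only: all constants below are hypothetical, chosen for illustration.

def mse(bias: float, noise_var: float, n: int) -> float:
    """Expected squared error of an average over n observations: bias^2 + variance/n."""
    return bias ** 2 + noise_var / n

BOX_BIAS, BOX_VAR = 1.0, 100.0     # hypothetical box stat: biased, low noise
APM_BIAS, APM_VAR = 0.0, 2500.0    # hypothetical APM-style stat: unbiased, high noise

def crossover_n() -> float:
    """Sample size at which the two expected squared errors are equal."""
    return (APM_VAR - BOX_VAR) / (BOX_BIAS ** 2 - APM_BIAS ** 2)

for n in (100, 1000, 5000, 10000):
    box = mse(BOX_BIAS, BOX_VAR, n)
    apm = mse(APM_BIAS, APM_VAR, n)
    better = "box" if box < apm else "APM"
    print(f"n={n:>6}: box MSE={box:.2f}, APM MSE={apm:.2f} -> {better}")

print("crossover sample size:", crossover_n())
```

With these invented inputs the box stat wins in the small samples and the APM-style stat wins once the sample passes the crossover point. Where that line sits for real metrics depends entirely on the actual bias and noise, which this sketch does not estimate.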

As it relates to this question of career playoff RAPM vs. playoff EPM, I think they both have serious reliability issues. Playoff EPM’s impact component is definitely quite unreliable, but its use of box stats improves its reliability. Career playoff RAPM has a larger sample so it is more reliable than playoff EPM’s impact component, but the sample is still small and it does not have box data improving its reliability. So I’m not sure which one is more reliable. However, career playoff RAPM is also the most extreme version of both: (1) turning down the validity by using a larger timeframe than the question at issue; and (2) validity decreasing due to super large timeframes handwaving differences in player quality over time. So I think there are some really serious validity issues with career playoff RAPM, particularly when it comes to a peaks project. To be fair, playoff EPM inherently has validity issues too, since its use of a box component decreases its validity. On balance, though, I think in the context of a single-year peaks project, you’re getting a lot more validity from playoff EPM without necessarily having worse reliability. So I’ve cited playoff EPM in this project and don’t think career playoff RAPM is particularly useful here.
eminence
RealGM
Posts: 17,123
And1: 11,909
Joined: Mar 07, 2015

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#98 » by eminence » Fri Oct 10, 2025 3:14 pm

Doctor MJ wrote:By validity: APM > RAPM > XRAPM > Box
By reliability: Box > XRAPM > RAPM > APM

I miss long-sample classic, un-regularized, un-Xploited APM studies.


I'd add raw plus/minus and on/off on the outside of APM even, as the most valid stats, but suffering from extreme volatility.
I bought a boat.
Doctor MJ
Senior Mod
Posts: 53,682
And1: 22,631
Joined: Mar 10, 2005
Location: Cali
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#99 » by Doctor MJ » Fri Oct 10, 2025 3:25 pm

eminence wrote:
Doctor MJ wrote:By validity: APM > RAPM > XRAPM > Box
By reliability: Box > XRAPM > RAPM > APM

I miss long-sample classic, un-regularized, un-Xploited APM studies.


I'd add raw plus/minus and on/off on the outside of APM even, as the most valid stats, but suffering from extreme volatility.


Reasonable stuff to bring up. I wouldn't say those stats are more valid than APM as I don't think there's anything inherently invalidating about applying vanilla regression, but I wouldn't object to seeing them as equally valid to APM with considerably worse reliability.
ShotCreator
Assistant Coach
Posts: 3,836
And1: 2,545
Joined: May 18, 2014
Location: CF
     

Re: Top 25 peaks of the 2001-25: #13-#14 Spots 

Post#100 » by ShotCreator » Fri Oct 10, 2025 3:53 pm

f4p wrote:
Cavsfansince84 wrote:
lessthanjake wrote:
Yeah, so I’d say Embiid was definitely far worse in the 2022 and 2023 playoffs than the best years for the players you mentioned. Like not even close. Voting for those years requires voting for a worse playoff performance than I’d vote for any other player.

The same isn’t actually true of 2024 IMO. Embiid actually performed pretty well in the playoffs that year despite health issues (I can’t even remember what was wrong with him that year, but I’m sure there was something). And that’s the year he was incredible when he played in the RS. But he also only played 39 regular season games and his team lost in the first round of the playoffs. I guess *maybe* it’s a viable year to vote for because he actually was genuinely great when he played that year, but not sure I think it makes sense to vote for it for 39 regular season games and a first-round series, no matter how well the guy played in those games.

As I said, maybe 2021 is a potential year too. It’s his only prime year that doesn’t have a super glaring problem. But even then he missed a lot of regular season games and lost in the second round to a weak team. So it’s not exactly a banner year that I’d be voting for soon.


I think 2021 has to be the year for Embiid despite missing 21 games in the rs which is prob slightly more in a full 82 game season. It's another 2nd-round loss but he actually sort of played like his rs self in both series and then loses in 7 to a pretty pedestrian Hawks team. After looking at it more closely I'm now inclined to have him in my group next after the 6 who I think are most deserving. Which would have him at something like 19-22 most likely. I think you can definitely make a case for Embiid over Tatum given that he was actually sort of competing with Jokic and Giannis for bpitw status in the 21-23 years while Tatum was never really in that kind of convo, even after winning a ring and leading an atg team. Not that I'll have him over Tatum but there's an argument.


i'm digging through embiid's stats in a way i haven't before. in 2019, embiid had a +20.5 ON COURT playoff plus/minus. and didn't get out of the 2nd round! while playing 30 mpg. i mean lots of people have crazy on/offs when their team is minus a million with them off the court, but this is a +20 on court! the list of people with a +20 can't be that long and the list of those who didn't win a title must be incredibly short. anyone have a way of looking that up?

edit: i haven't even posted this but i'm going to already add an edit. in the final 5 games against the raptors that year, embiid has a game where he was +17 on court, and lost! and in game 7, he was famously +10 in 45:12 of game time, but lost because philly was -12 in 2:48 with him off the court. and all of that pales in comparison to game 6, where embiid was +40 (!!) in 35 minutes and the sixers were -29 in 13 minutes!

in the final 5 games, embiid was +82 on the court and his team lost 3 times! like literally, has that ever happened to anyone else? he's +82 in 35 mpg with him on, -93 in 13 mpg with him off. an incredible +22.5 per 48 on and -68.7 per 48 off, for a +91.2 on/off in probably the 5 most important playoff games of his career. he is snakebit beyond belief.
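
For anyone who wants to check the per-48 math, here's a quick sketch. It assumes the 35 and 13 minutes per game quoted above are close enough to exact across all 5 games (they're rounded, so treat the output as approximate), and the function name is just made up for this example.

```python
# Per-48 on/off from raw plus/minus and minutes, using the figures quoted in the post.

def per_48(plus_minus: float, minutes: float) -> float:
    """Scale a raw plus/minus total to a per-48-minute rate."""
    return plus_minus / minutes * 48

GAMES = 5
on_minutes = 35 * GAMES     # ~35 mpg with Embiid on the court (rounded figure)
off_minutes = 13 * GAMES    # ~13 mpg with Embiid off the court (rounded figure)

on_rate = per_48(82, on_minutes)      # team was +82 with him on
off_rate = per_48(-93, off_minutes)   # team was -93 with him off
on_off = on_rate - off_rate           # on/off is simply the difference

print(f"on: {on_rate:+.1f}, off: {off_rate:+.1f}, on/off: {on_off:+.1f}")
```

The off-court rate looks so extreme partly because it is extrapolated from only about 65 total minutes, which is worth keeping in mind given the sample-size discussion earlier in the thread.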

IIRC Kawhi was scoring around 45 points per 36 on more than 10% above league average TS% when Embiid was off the court.

On the flip side, his production was much more in line with his RS averages when Embiid was on court against him, and obviously, Philly was dominating when Embiid was on the court.

Saw the same extreme splits with Curry vs LeBron in the 16 finals. Curry killed Cleveland when LeBron was off-court.

Defense scales very, very high in this sport. Much higher than offense on an individual level. From Bill Russell, to D-Rob in the late 90’s to late prime KG, to Embiid in 2019.

That’s why there’s no way in hell a guy like peak Garnett is ever realistically gonna be worse than any guard in a peak comparison to me.

There’s a realistic scenario where KG is all-star guard level on offense, and figures out an opposing offense to this borderline mythical level in a playoff series. A guard can’t touch that type of impact.
Swinging for the fences.
