Offensive primes using OBPM (Reggie Miller #3 all time)

Moderators: penbeast0, trex_8063, PaulieWal, Doctor MJ, Clyde Frazier

Jim Naismith
Lead Assistant
Posts: 5,221
And1: 1,973
Joined: Apr 17, 2013

Post#1 » by Jim Naismith » Mon Mar 20, 2017 3:19 am

Here is the number of elite offensive seasons each player has had (where I define an elite offensive season as one with OBPM >= 5.0):

13 LeBron James   
10 Michael Jordan   
 9 Reggie Miller   
 8 Kobe Bryant   
 8 Chris Paul
 8 John Stockton
 7 Ray Allen   
 7 Charles Barkley      
 7 Magic Johnson   
 6 Clyde Drexler   
 6 James Harden   
 6 Shaquille O'Neal
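
For anyone who wants to reproduce this kind of count, here is a minimal sketch of the tally. The season rows are made-up placeholders ("Player A" etc.), not real OBPM values, and the function name is just something I picked for illustration.

```python
from collections import Counter

ELITE_OBPM = 5.0  # threshold used in the post above


def count_elite_seasons(seasons, threshold=ELITE_OBPM):
    """Count seasons at or above the OBPM threshold for each player.

    `seasons` is an iterable of (player, obpm) pairs, one pair per season.
    Returns (player, count) pairs sorted from most to fewest elite seasons.
    """
    counts = Counter(player for player, obpm in seasons if obpm >= threshold)
    return counts.most_common()


# Hypothetical rows, purely for illustration (not real data).
sample_seasons = [
    ("Player A", 6.1),
    ("Player A", 5.0),  # exactly 5.0 still counts because the rule is >=
    ("Player A", 4.9),
    ("Player B", 7.3),
    ("Player C", 3.2),
]

leaderboard = count_elite_seasons(sample_seasons)
for player, n in leaderboard:
    print(f"{n:2d} {player}")
```

Players with no qualifying season simply do not appear in the output.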


The high placement of Reggie Miller and Ray Allen really surprises me.

Also the top of the leaderboard doesn't have Bird (4), Wade (3), Dirk (2).

See http://bkref.com/tiny/eGA5j
Joao Saraiva
RealGM
Posts: 13,007
And1: 5,809
Joined: Feb 09, 2011
   

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#2 » by Joao Saraiva » Mon Mar 20, 2017 3:51 am

I've never studied BPM much, and I don't remember how it's calculated. So I'm interested in hearing from somebody who is high on that stat to see what's up with the formula that gives these results.
“These guys have been criticized the last few years for not getting to where we’re going, but I’ve always said that the most important thing in sports is to keep trying. Let this be an example of what it means to say it’s never over.” - Jerry Sloan
Bad Gatorade
Senior
Posts: 701
And1: 1,815
Joined: Aug 23, 2016
Location: Australia
   

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#3 » by Bad Gatorade » Mon Mar 20, 2017 9:03 am

Joao Saraiva wrote:I've never studied BPM much, and I don't remember how it's calculated. So I'm interested in hearing from somebody who is high on that stat to see what's up with the formula that gives these results.


Okay, so, a brief rundown of the stat can be found here:

http://www.basketball-reference.com/about/bpm.html

Basically, BPM is an attempt at regressing historical box score stats onto a large RAPM sample in order to predict RAPM.

It's a pretty good attempt at a stat, but it's essential to understand its nooks and crannies in order to see why some players might appear higher/lower than they probably should:

The formula, as shown in the article, is this:

a*ReMPG + b*ORB% + c*DRB% + d*STL% + e*BLK% + f*AST% - g*USG%*TO% + h*USG%*(1-TO%)*[2*(TS% - TmTS%) + i*AST% + j*(3PAr - Lg3PAr) - k] + l*sqrt(AST%*TRB%)

And the values for the offensive portion are this:

Coeff.   Term   O/D BPM Value
a   Regr. MPG   0.064448
b   ORB%   0.211125
c   DRB%   -0.107545
d   STL%   0.346513
e   BLK%   -0.052476
f   AST%   -0.041787
g   TO%*USG%   0.932965
h   Scoring   0.687359
i     AST Interaction   0.007952
j     3PAr Interaction   0.374706
k     Threshold Scoring   -0.181891
l   sqrt(AST%*TRB%)   0.239862
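
To make the formula concrete, here is a rough sketch that plugs the offensive coefficients above into it. Assumptions on my part: the function and argument names are invented, the unit convention for each input (whole-number percentages vs. fractions) has to follow whatever the linked article uses, and this only evaluates the formula as written, so no team adjustment is applied.

```python
import math

# Offensive coefficients copied from the table above.
OBPM_COEFFS = {
    "a": 0.064448, "b": 0.211125, "c": -0.107545, "d": 0.346513,
    "e": -0.052476, "f": -0.041787, "g": 0.932965, "h": 0.687359,
    "i": 0.007952, "j": 0.374706, "k": -0.181891, "l": 0.239862,
}


def raw_obpm(*, re_mpg=0.0, orb=0.0, drb=0.0, stl=0.0, blk=0.0, ast=0.0,
             usg=0.0, to=0.0, ts=0.0, tm_ts=0.0, three_par=0.0,
             lg_three_par=0.0, trb=0.0, coeffs=OBPM_COEFFS):
    """Evaluate the formula term by term (hypothetical helper, units assumed)."""
    c = coeffs
    # Bracketed scoring piece: efficiency vs. team, assist and 3PAr
    # interactions, minus the threshold k.
    bracket = (2.0 * (ts - tm_ts)
               + c["i"] * ast
               + c["j"] * (three_par - lg_three_par)
               - c["k"])
    scoring = c["h"] * usg * (1.0 - to) * bracket
    linear = (c["a"] * re_mpg + c["b"] * orb + c["c"] * drb
              + c["d"] * stl + c["e"] * blk + c["f"] * ast)
    turnovers = c["g"] * usg * to
    interaction = c["l"] * math.sqrt(ast * trb)
    return linear - turnovers + scoring + interaction


# Isolating single terms makes the signs easy to check by hand.
only_orb = raw_obpm(orb=10.0)                 # b * 10
ast_and_trb = raw_obpm(ast=25.0, trb=4.0)     # f * 25 + l * sqrt(100)
print(round(only_orb, 5), round(ast_and_trb, 6))
```

Note that k is negative in the offensive column, so "- k" actually adds to the bracket.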


What are the issues with OBPM?

First and foremost, OBPM is based on regression, and unfortunately this means it can produce some seriously biased results. Regression is really only reliable for interpolating within the range of the data it was fit on, rather than extrapolating, as we technically don't know how the data behaves outside of the sample range. So once a player starts approaching outlier status in any of the individual terms, oddball things can happen, and OBPM can severely over/underestimate him.

One must consider that extrapolation happens quite a lot - the regression sample covers 2001-2014, so any season before 2001 or after 2014 is prone to extrapolation issues.
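
A toy illustration of the interpolation vs. extrapolation point, as a sketch only: the curved "true" relationship below is invented and has nothing to do with the actual BPM data. A straight line fit on a limited range tracks the curve fine inside that range and drifts badly outside it.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def true_value(x):
    # Hypothetical curved relationship, chosen only for the demonstration.
    return 10.0 * x ** 0.5


# "Sample range": x from 10 to 30 in steps of 0.5.
train_xs = [10.0 + 0.5 * i for i in range(41)]
train_ys = [true_value(x) for x in train_xs]
slope, intercept = fit_line(train_xs, train_ys)


def abs_error(x):
    return abs((slope * x + intercept) - true_value(x))


inside_error = max(abs_error(x) for x in train_xs)  # worst miss within range
outside_error = abs_error(60.0)                     # far outside the range
print(f"worst error inside range: {inside_error:.2f}")
print(f"error at x=60 (outlier):  {outside_error:.2f}")
```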

I'll talk about some variables of interest -

3PAr Interaction - the regression was fit on the 2001-2014 NBA landscape, where the 3 point shot was gaining prominence. A player who attempted a lot of 3 pointers before 3 pointers became a thing might be heavily boosted (e.g. Antoine Walker), and a player who doesn't attempt many 3s in the post-2014 era might be unfairly handicapped.

It's also worth noting that 3PAr is partially meant to act as a proxy for spacing, so players that are excellent floor spacers without shooting many 3s are likely to be underrated. This is especially true considering that floor spacers get fewer offensive boards. A guy like Walker would get boosted more for "floor spacing" than a guy like, say, Nowitzki or Aldridge, and any sane person will tell you that is not the case.
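
Here is a small sketch of how just the 3PAr piece of the scoring term, h*USG%*(1-TO%)*j*(3PAr - Lg3PAr), moves with the league average. The player line and the two league 3PAr values are hypothetical, and I'm assuming USG% as a whole number with TO% and 3PAr as fractions, which may not match the article's convention.

```python
H_SCORING = 0.687359  # coefficient h from the table
J_3PAR = 0.374706     # coefficient j from the table


def three_par_contribution(usg, to, three_par, lg_three_par):
    """Only the 3PAr interaction piece of the scoring term (units assumed)."""
    return H_SCORING * usg * (1.0 - to) * J_3PAR * (three_par - lg_three_par)


# Same hypothetical shooter dropped into two hypothetical league environments.
shooter = {"usg": 25.0, "to": 0.12, "three_par": 0.35}
in_low_3pt_league = three_par_contribution(lg_three_par=0.10, **shooter)
in_high_3pt_league = three_par_contribution(lg_three_par=0.30, **shooter)

# A non-shooter in the high-volume league comes out negative on this piece.
non_shooter = three_par_contribution(usg=25.0, to=0.12,
                                     three_par=0.05, lg_three_par=0.30)

print(round(in_low_3pt_league, 3), round(in_high_3pt_league, 3),
      round(non_shooter, 3))
```

The identical shot profile is worth more when the league shoots fewer 3s, which is the Antoine Walker effect described above.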

USG*AST and AST*REB - so, these terms are direct multiplication terms, and as a result, extreme results here can throw the regression completely out of whack. The AST*REB term is especially problematic here - I don't have much of an issue with the USG*AST term, because it's not quite as large, and there's more pure merit behind an offensive player being a significant scorer and playmaker.

It's also worth noting that the USG% and AST% terms aren't actually wholly independent - a player taking more shots not only increases his USG%, but the same number of raw assists will also produce a higher AST%, even if his playmaking hasn't actually improved.
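
A quick sketch of why that happens, assuming the usual Basketball-Reference style definition AST% = 100 * AST / (((MP / (TmMP / 5)) * TmFG) - FG). I'm writing that definition from memory, so treat it as an assumption, and the season totals below are invented.

```python
def ast_pct(ast, mp, tm_mp, tm_fg, fg):
    """Assist percentage under the assumed definition.

    The denominator estimates teammate field goals made while the player
    was on the floor, so the player's own makes are subtracted out.
    """
    share_of_minutes = mp / (tm_mp / 5.0)
    teammate_fg_on_floor = share_of_minutes * tm_fg - fg
    return 100.0 * ast / teammate_fg_on_floor


# Hypothetical season: identical assists, minutes and team makes,
# but the player makes 200 more of those shots himself.
base = dict(ast=500, mp=2952, tm_mp=19680, tm_fg=3000)
as_lower_volume_scorer = ast_pct(fg=500, **base)
as_higher_volume_scorer = ast_pct(fg=700, **base)

print(round(as_lower_volume_scorer, 1), round(as_higher_volume_scorer, 1))
```

Same 500 assists, but the higher-volume scorer gets the bigger AST% because fewer teammate makes are left in the denominator.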

The AST*REB term has a couple of problems -

The DRB term is actually negative, so players that accrue a lot of defensive rebounds are likely to get punished if they don't get a lot of assists too. On the other hand, players that do get a lot of assists can be rewarded if they usurp a lot of their teammates' rebounds (please, no arguments about particular players here - that's not the purpose of this post/thread), and players that don't rebound quite as much can be unfairly punished too (e.g. Steve Nash).

In other words, as an interaction term, sometimes accruing more of one statistic actually leads to a lower OBPM, even when that statistic may not directly correlate with offensive play. And when the magnitude of the stat is SO high, this can lead to some silly outliers in both directions.
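
To put rough numbers on that, here is a sketch of what one extra point of DRB% does to the two terms it touches, c*DRB% and l*sqrt(AST%*TRB%). Assumptions: TRB% is approximated as the average of ORB% and DRB% (not exact), all inputs are whole-number percentages, and both player profiles are made up.

```python
import math

C_DRB = -0.107545     # coefficient c from the table
L_AST_TRB = 0.239862  # coefficient l from the table


def drb_related_terms(ast, orb, drb):
    """Sum of the two offensive terms that change when DRB% changes."""
    trb = (orb + drb) / 2.0  # rough approximation, an assumption of this sketch
    return C_DRB * drb + L_AST_TRB * math.sqrt(ast * trb)


def gain_from_one_more_drb_point(ast, orb, drb):
    """Change in those terms from adding one point of DRB%."""
    return (drb_related_terms(ast, orb, drb + 1.0)
            - drb_related_terms(ast, orb, drb))


# Hypothetical low-assist big man vs. hypothetical high-assist guard.
low_assist_big = gain_from_one_more_drb_point(ast=5.0, orb=5.0, drb=20.0)
high_assist_guard = gain_from_one_more_drb_point(ast=45.0, orb=3.0, drb=15.0)

print(f"low-assist big:    {low_assist_big:+.3f}")
print(f"high-assist guard: {high_assist_guard:+.3f}")
```

Under these assumptions the same extra defensive rebound costs the low-assist big and helps the high-assist guard, which is the asymmetry described above.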

This is especially prominent now, because Russ & Harden are basically historic outliers here, and this props their OBPM up quite a lot.

Also, much like 3PAr, things such as assist% and rebound% have changed league wide over time, which may slightly blur historical comparisons. I don't think assist% and rebound% era changes alter BPM that much though, just worth mentioning on a technical level.

Threshold scoring - a player who is incredible at improving his teammates' efficiency raises his team's TS%, but depending on the other variables + the player's own TS%, this might cause silly things to happen to the regression once again. Steve Nash is also probably handicapped here, IMO.

So in a nutshell, what BPM does is regress statistics against RAPM in order to create an optimal fit, and for this fit to be most "accurate" for as many players as possible, it requires some terms that might cause odd results elsewhere, e.g. the interaction terms with defensive rebounding.

Interactions between terms that are only implicitly involved in BPM - a player who takes a lot of long 2s will almost definitely give up ORB% in exchange for spacing, but only the lost ORB% is discerned in BPM. Passes that lead to layups/dunks carry more turnover risk than passes to jump shooters, but BPM doesn't differentiate between assist types. Contested and uncontested rebounds are treated the same (in particular, some poor defensive perimeter players might be overrated due to high rebound*assist numbers on DBPM, but that's for another day).

Perhaps some of this is better explained by taking proper examples. So I'll bring up some of the players asked about -

Reggie Miller - he was an incredibly efficient scorer (led the league in TS% a couple of times) and bombed 3s at a ridiculous rate. So he's probably a bit of a 3 point/TS% outlier and this may have inflated his value. Of course, we don't have prime RAPM for Reggie Miller, so I can't speak with utmost confidence. It's worth mentioning that Ray Allen's OBPM was 4.3 over the 14 year timeframe, and his ORAPM was 5.1, and given that Miller is often seen as a 90s era analogue to Ray Allen, perhaps Miller has simply been underrated? Who knows.

Larry Bird - he was primarily a floor spacer, and didn't really start taking any meaningful number of 3s until 1986, so a lot of his floor spacing value probably wasn't captured. Interestingly enough, Bird always had a very high DBPM, so I wonder if some of the regression favoured stats (e.g. defensive rebounding) that likely inflated his DBPM wound up marginalising his OBPM.

Dirk - Dirk is one of the poster boys for being underrated by BPM. He spaces the floor without shooting many 3s (so his ORB% drops too), he was a fairly good defensive rebounder but didn't accrue many assists (and the assists handicap two of Dirk's interaction terms) and his gravity can only really be measured by team TS% and the team adjustment - so it's possible that Dirk's team TS% actually marginalises the impact that Dirk's personal TS% would normally have.

There weren't really any good examples in that leaderboard list, but Stephon Marbury was an example of a guy heavily overrated by OBPM based on his assists. He almost never passed to the interior, so his TOV% wasn't too high, but his assists (which he got heavy credit for) were nigh on worthless on a team success level.

Honestly, this could take all night to do properly, so I think this might be a good idea: In the article that I linked at the start of this post, feel free to look at the Tableau graphic and observe what players seem to be overrated/underrated to you, and try and make some judgments on what might be the case in your opinion. If you're struggling (or if you want verification on your thoughts), feel free to PM me and I'll throw my 2c in on the matter.

Hope that some of this helped!
I use a lot of parentheses when I post (it's a bad habit)
Ryoga Hibiki
RealGM
Posts: 11,134
And1: 6,508
Joined: Nov 14, 2001
Location: Warszawa now, but from Northern Italy

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#4 » by Ryoga Hibiki » Mon Mar 20, 2017 10:17 am

I'm really against this kind of advanced all-in-one box-score-based stat.
They're just trying to overanalyse numbers that are too noisy to come out with anything significant.
Слава Украине!
Joao Saraiva
RealGM
Posts: 13,007
And1: 5,809
Joined: Feb 09, 2011
   

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#5 » by Joao Saraiva » Mon Mar 20, 2017 2:05 pm

Bad Gatorade wrote:
Hope that some of this helped!


Yes it did! Thanks man, it was great to read.
Texas Chuck
Senior Mod - NBA TnT Forum
Posts: 85,511
And1: 88,346
Joined: May 19, 2012
Location: Purgatory
   

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#6 » by Texas Chuck » Mon Mar 20, 2017 3:04 pm

Can we waive the 5 year mandatory HoF waiting period for Bad Gatorade?
ThunderBolt wrote:I’m going to let some of you in on a little secret I learned on realgm. If you don’t like a thread, not only do you not have to comment but you don’t even have to open it and read it. You’re welcome.
mischievous
General Manager
Posts: 7,675
And1: 3,482
Joined: Apr 18, 2015

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#7 » by mischievous » Mon Mar 20, 2017 3:13 pm

This seems quite arbitrary, box score offensive stats don't do guys like Nash and Dirk justice.
colts18
Head Coach
Posts: 7,428
And1: 3,237
Joined: Jun 29, 2009

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#8 » by colts18 » Mon Mar 20, 2017 3:15 pm

If you look at the 2001-2014 RAPM which doesn't use box score, Ray Allen is ranked #10. Peja is ranked #11. It's not crazy to think that Miller was a really good offensive player.


http://public.tableau.com/views/14YearRAPM/14YearRAPM?:embed=y&:showVizHome=no
Texas Chuck
Senior Mod - NBA TnT Forum
Posts: 85,511
And1: 88,346
Joined: May 19, 2012
Location: Purgatory
   

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#9 » by Texas Chuck » Mon Mar 20, 2017 3:19 pm

colts18 wrote:It's not crazy to think that Miller was a really good offensive player.





I'm almost certain no one is disputing Reggie Miller as being a very good offensive player. I think rightfully so some are questioning how valuable OBPM is in telling us that tho.
Senior
Sixth Man
Posts: 1,819
And1: 3,668
Joined: Jan 29, 2013

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#10 » by Senior » Mon Mar 20, 2017 3:20 pm

Bad Gatorade wrote:Okay, so, a brief rundown of the stat can be found here:

http://www.basketball-reference.com/about/bpm.html

Basically, BPM is an attempt at regressing historical box score stats onto a large RAPM sample in order to predict RAPM.



In addition to all of this, DBPM is just overall BPM - OBPM, so if someone's OBPM is thrown off then their DBPM will be inaccurate as well. In the end, BPM doesn't take into account non box-score stuff, so even using it to estimate defensive impact is just missing the point.
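
A tiny sketch of that knock-on effect, using made-up numbers: since DBPM is defined as the remainder, any error in OBPM shows up in DBPM with the opposite sign while total BPM stays put.

```python
import math


def dbpm_from(bpm, obpm):
    """DBPM as the remainder of BPM after removing OBPM."""
    return bpm - obpm


# Hypothetical player: total BPM is fixed, but OBPM is overstated by 1.5.
total_bpm = 4.0
accurate_obpm = 3.0
obpm_error = 1.5

dbpm_if_accurate = dbpm_from(total_bpm, accurate_obpm)
dbpm_if_inflated = dbpm_from(total_bpm, accurate_obpm + obpm_error)

# The offensive overestimate reappears as a defensive underestimate.
shift = dbpm_if_inflated - dbpm_if_accurate
print(dbpm_if_accurate, dbpm_if_inflated, shift)
```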

Stuff like lots of assumptions (like the spacing one) or arbitrary weights usually make me back off a stat - if you're using a stat with arbitrary weights like WS or PER you better have a good justification for those weights. Understanding what goes into box score derived stats like PER/WS/VORP/whatever is critical, as well as deciding which stats make sense to use and which don't. Does your stat actually measure what it proposes to? How good is it at measuring the stat? Does anything in the stat seem weird and how much does it affect the numbers it spits out? And in the end - just how good is the box-score at measuring how good a player is or what he's doing for his team? If the box-score doesn't even measure that much of a player's ability (which I personally don't think it does) then adding arbitrary weights to a stat is going to make the problem worse.

edit: here's a few posts on Hakeem's 1995 Playoffs OBPM viewtopic.php?f=64&t=1456683
BenoUdrihFTL
RealGM
Posts: 10,701
And1: 23,486
Joined: Feb 20, 2013
Location: Papa John's
 

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#11 » by BenoUdrihFTL » Mon Mar 20, 2017 4:55 pm

Not surprised to see Reggie so high (career 120+ ORtg) but I'm pleasantly surprised to see Drexler top 10. 2 of my 3 favorite players all-time
1.61803398874989484820458683436563811772030917980576286
2135448622705260462818902449707207
204189391137484754088
0753868917521
26633862
22353
693
sp6r=underrated
RealGM
Posts: 17,197
And1: 8,517
Joined: Jan 20, 2007
 

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#12 » by sp6r=underrated » Mon Mar 20, 2017 5:17 pm

Bad Gatorade wrote:
Joao Saraiva wrote:I've never studied BPM a lot I don't remember how it was calculated. So I'm interested in reading somebody who is high on that stat to see what's up with the formula that gives these results.


Okay, so, a brief rundown of the stat can be found here:

http://www.basketball-reference.com/about/bpm.html

Basically, BPM is an attempt at regressing historical box score stats onto a large RAPM sample in order to predict RAPM.

It's a pretty good attempt at a stat, but it's integral to understand the nooks and crannies of the stat in order to understand why some players might appear higher/lower than they probably should:

The formula, as shown in the article is this:

a*ReMPG + b*ORB% + c*DRB% + d*STL% + e*BLK% + f*AST% - g*USG%*TO% + h*USG%*(1-TO%)*[2*(TS% - TmTS%) + i*AST% + j*(3PAr - Lg3PAr) - k] + l*sqrt(AST%*TRB%)

And the values for the offensive portion are this:

Code: Select all

Coeff.   Term   O/D BPM Value
a   Regr. MPG   0.064448
b   ORB%   0.211125
c   DRB%   -0.107545
d   STL%   0.346513
e   BLK%   -0.052476
f   AST%   -0.041787
g   TO%*USG%   0.932965
h   Scoring   0.687359
i     AST Interaction   0.007952
j     3PAr Interaction   0.374706
k     Threshold Scoring   -0.181891
l   sqrt(AST%*TRB%)   0.239862


What are the issues with OBPM?

First and foremost, OBPM is based on regression, and unfortunately, this means that the regression can lead to some seriously biased results. Regression is really designed to only interpolate data, rather than extrapolate, as we technically don't know the distribution of data outside of the sample range. So automatically, once people start approaching outlier status in any of the individual terms, oddball things can happen, and OBPM can be severely over/underestimated.

One must consider that extrapolation happens quite a lot - the sample size is from 2001-2014, and any data pre-2000 or post-2014 is prone to extrapolation issues.

I'll talk about some variables of interest -

3PAr Interaction - this dataset was compiled for the 2001-2014 NBA landscape, where the use of the 3 point shot gained more prominence. A player who attempts a lot of 3 pointers before 3 pointers became a thing might be heavily boosted (e.g. Antoine Walker) and a player that doesn't attempt many 3s in the post 2014 era might be unfairly handicapped.

It's also worth noting that 3PAr is also partially meant to act as a proxy for spacing, so players that are excellent floor spacers without shooting many 3s are likely to be underrated. This is especially true considering that floor spacers get less offensive boards. A guy like Walker would get boosted more for "floor spacing" than a guy like say, Nowitzki or Aldridge, and any sane person will tell you that is not the case.

USG*AST and AST*REB - so, these terms are direct multiplication terms, and as a result, extreme results here can throw the regression completely out of whack. The AST*REB term is especially problematic here - I don't have much of an issue with the USG*AST term, because it's not quite as large, and there's more pure merit behind an offensive player being a significant scorer and playmaker.

It's also worth noting that the USG% and AST% terms aren't actually wholly independent - a player taking more shots is not only going to increase his USG%, but the same amount of raw assists will also increase his AST% even if his playmaking hasn't actually improved.

The AST*REB term has a couple of problems -

The DRB term is actually negative, so players that accrue a lot of defensive rebounds are likely to get punished if they don't get a lot of assists too. On the other hand, players that do get a lot of assists can be rewarded if they usurp a lot of their teammates' rebounds (please, no arguments about particular players here - that's not the purpose of this post/thread), and players that don't rebound quite as much can be unfairly punished too (e.g. Steve Nash).

In other words, as an interaction term, sometimes accruing more of one statistic actually leads to a lower OBPM, even when that statistic may not directly correlate with offensive play. And when the magnitude of the stat is SO high, this can lead to some silly outliers in both directions.
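Here is a small sketch of that effect, using coefficient c (DRB%) and coefficient l (the sqrt(AST%*TRB%) term) from the offensive coefficient table. It estimates what one extra point of DRB% does to the raw offensive number for a low-assist and a high-assist player. The conversion `trb_per_drb=0.5` is my own rough assumption (one point of DRB% adding about half a point of TRB%), and the players are hypothetical.

```python
from math import sqrt

C_DRB = -0.107545      # coefficient c (DRB%) from the post's table
L_AST_TRB = 0.239862   # coefficient l (sqrt(AST% * TRB%)) from the post's table


def obpm_change_from_extra_drb(ast_pct, trb_pct, drb_gain=1.0, trb_per_drb=0.5):
    """Change in the raw offensive value when DRB% rises by `drb_gain` points.

    Everything else is held fixed. `trb_per_drb` is an assumed conversion
    from DRB% points to TRB% points, not something taken from the article.
    """
    direct = C_DRB * drb_gain                       # negative linear DRB% term
    new_trb = trb_pct + trb_per_drb * drb_gain
    interaction = L_AST_TRB * (sqrt(ast_pct * new_trb) - sqrt(ast_pct * trb_pct))
    return direct + interaction


low_assist = obpm_change_from_extra_drb(ast_pct=5.0, trb_pct=10.0)
high_assist = obpm_change_from_extra_drb(ast_pct=40.0, trb_pct=10.0)
print(f"low-assist player:  {low_assist:+.3f}")
print(f"high-assist player: {high_assist:+.3f}")
```

Under these assumptions the same extra defensive rebound is a net minus for the low-assist player and a net plus for the high-assist one, which is the sign flip described above.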

This is especially prominent now, because Russ and Harden are basically historic outliers here, and this props their OBPM up quite a lot.

Also, much like 3PAr, things such as assist% and rebound% have changed league wide over time, which may slightly blur historical comparisons. I don't think assist% and rebound% era changes alter BPM that much though, just worth mentioning on a technical level.

Threshold scoring - a player who is incredible at improving his teammates' efficiency raises his team's TS%, which shrinks his own (TS% - TmTS%) gap in the scoring term. Depending on the other variables and the player's own TS%, this might cause silly things to happen to the regression once again. Steve Nash is also probably handicapped here, IMO.

So in a nutshell, what BPM does is regress statistics against RAPM in order to create an optimal fit, and for this fit to be most "accurate" for as many players as possible, it requires some terms that might cause odd results elsewhere, e.g. the interaction terms with defensive rebounding.

Interactions between terms that are only implicitly involved in BPM - a player who takes a lot of long 2s will almost definitely post a lower ORB% in exchange for providing spacing, but only the ORB% drop is discerned in BPM. Assists to layups/dunks cause more turnovers than assists to jump shooters, but BPM doesn't differentiate between assist types. Contested and uncontested rebounds are treated the same (in particular, some poor defensive perimeter players might be overrated due to high rebound*assist numbers on DBPM, but that's for another day).

Perhaps some of this is better explained by taking proper examples. So I'll bring up some of the players asked about -

Reggie Miller - he was an incredibly efficient scorer (led the league in TS% a couple of times) and bombed 3s at a ridiculous rate. So he's probably a bit of a 3 point/TS% outlier and this may have inflated his value. Of course, we don't have prime RAPM for Reggie Miller, so I can't speak with utmost confidence. It's worth mentioning that Ray Allen's OBPM was 4.3 over the 14 year timeframe, and his ORAPM was 5.1, and given that Miller is often seen as a 90s era analogue to Allen, perhaps Miller has simply been underrated? Who knows.

Larry Bird - he was primarily a floor spacer, and didn't really start taking any meaningful number of 3s until 1986, so a lot of his floor spacing value probably wasn't captured. Interestingly enough, Bird always had a very high DBPM, so I wonder if some of the regression favoured stats (e.g. defensive rebounding) that likely inflated his DBPM wound up marginalising his OBPM.

Dirk - Dirk is one of the poster boys for being underrated by BPM. He spaces the floor without shooting many 3s (so his ORB% drops too), he was a fairly good defensive rebounder but didn't accrue many assists (and the assists handicap two of Dirk's interaction terms) and his gravity can only really be measured by team TS% and the team adjustment - so it's possible that Dirk's team TS% actually marginalises the impact that Dirk's personal TS% would normally have.

There weren't really any good examples in that leaderboard list, but Stephon Marbury was an example of a guy heavily overrated by OBPM based on his assists. He almost never passed to the interior, so his TOV% wasn't too high, but his assists (which he got heavy credit for) were nigh on worthless on a team success level.

Honestly, this could take all night to do properly, so I think this might be a good idea: In the article that I linked at the start of this post, feel free to look at the Tableau graphic and observe what players seem to be overrated/underrated to you, and try and make some judgments on what might be the case in your opinion. If you're struggling (or if you want verification on your thoughts), feel free to PM me and I'll throw my 2c in on the matter.

Hope that some of this helped!



Does RealGM offer salaries for posters? If so, time to break out the big RealGM bucks for Bad Gatorade.
penbeast0
Senior Mod - NBA Player Comparisons
Posts: 28,313
And1: 8,584
Joined: Aug 14, 2004
Location: South Florida
 

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#13 » by penbeast0 » Mon Mar 20, 2017 10:18 pm

Harden having 6 elite seasons already actually shocked me more than Reggie ranking so high in a stat that so heavily values 3 point volume. Almost surprised that Jordan has that many too. Almost.
“Most people use statistics like a drunk man uses a lamppost; more for support than illumination,” Andrew Lang.
HeartBreakKid
RealGM
Posts: 22,395
And1: 18,813
Joined: Mar 08, 2012
     

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#14 » by HeartBreakKid » Mon Mar 20, 2017 10:24 pm

Is there any offensive advanced stat that Reggie Miller doesn't place very well in?
penbeast0
Senior Mod - NBA Player Comparisons
Posts: 28,313
And1: 8,584
Joined: Aug 14, 2004
Location: South Florida
 

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#15 » by penbeast0 » Mon Mar 20, 2017 10:25 pm

OReb %
Colbinii
RealGM
Posts: 31,383
And1: 19,568
Joined: Feb 13, 2013

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#16 » by Colbinii » Mon Mar 20, 2017 11:14 pm

HeartBreakKid wrote:Is there any offensive advanced stat that Reggie Miller doesn't place very well in?


AST%
tsherkin wrote:Locked due to absence of adult conversation.

penbeast0 wrote:Guys, if you don't have anything to say, don't post.


Circa 2018
E-Balla wrote:LeBron is Jeff George.


Circa 2022
G35 wrote:Lebron is not that far off from WB in trade value.
homecourtloss
RealGM
Posts: 10,668
And1: 17,568
Joined: Dec 29, 2012

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#17 » by homecourtloss » Tue Mar 21, 2017 1:24 am

Bad Gatorade wrote:
Joao Saraiva wrote:I've never studied BPM a lot I don't remember how it was calculated. So I'm interested in reading somebody who is high on that stat to see what's up with the formula that gives these results.


Okay, so, a brief rundown of the stat can be found here:

http://www.basketball-reference.com/about/bpm.html

Basically, BPM is an attempt at regressing historical box score stats onto a large RAPM sample in order to predict RAPM.

It's a pretty good attempt at a stat, but it's integral to understand the nooks and crannies of the stat in order to understand why some players might appear higher/lower than they probably should:

The formula, as shown in the article is this:

a*ReMPG + b*ORB% + c*DRB% + d*STL% + e*BLK% + f*AST% - g*USG%*TO% + h*USG%*(1-TO%)*[2*(TS% - TmTS%) + i*AST% + j*(3PAr - Lg3PAr) - k] + l*sqrt(AST%*TRB%)

And the values for the offensive portion are this:

Code: Select all

Coeff.  Term                    O/D BPM Value
a       Regr. MPG                0.064448
b       ORB%                     0.211125
c       DRB%                    -0.107545
d       STL%                     0.346513
e       BLK%                    -0.052476
f       AST%                    -0.041787
g       TO%*USG%                 0.932965
h       Scoring                  0.687359
i         AST Interaction        0.007952
j         3PAr Interaction       0.374706
k         Threshold Scoring     -0.181891
l       sqrt(AST%*TRB%)          0.239862
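
For anyone who wants to poke at the formula, here is a minimal sketch that plugs the offensive coefficients above into the formula as written. The units are my assumption (rate stats such as ORB%, AST% and USG% on a 0-100 scale, TO%, TS% and 3PAr as fractions), the `BoxLine` class and the example player are hypothetical, and this leaves out the team adjustment, so the output has not been checked against Basketball-Reference numbers.

```python
from dataclasses import dataclass, replace
from math import sqrt

# Offensive coefficients copied from the table in the post.
COEF = {
    "a": 0.064448, "b": 0.211125, "c": -0.107545, "d": 0.346513,
    "e": -0.052476, "f": -0.041787, "g": 0.932965, "h": 0.687359,
    "i": 0.007952, "j": 0.374706, "k": -0.181891, "l": 0.239862,
}


@dataclass(frozen=True)
class BoxLine:
    """Hypothetical container for one player-season (assumed units in comments)."""
    re_mpg: float = 0.0        # regressed minutes per game
    orb: float = 0.0           # ORB%, 0-100
    drb: float = 0.0           # DRB%, 0-100
    trb: float = 0.0           # TRB%, 0-100
    stl: float = 0.0           # STL%, 0-100
    blk: float = 0.0           # BLK%, 0-100
    ast: float = 0.0           # AST%, 0-100
    usg: float = 0.0           # USG%, 0-100
    tov: float = 0.0           # TO%, fraction
    ts: float = 0.0            # TS%, fraction
    team_ts: float = 0.0       # TmTS%, fraction
    three_par: float = 0.0     # 3PAr, fraction
    lg_three_par: float = 0.0  # Lg3PAr, fraction


def raw_obpm(p):
    """Evaluate the posted formula term by term, before any team adjustment."""
    c = COEF
    scoring = c["h"] * p.usg * (1.0 - p.tov) * (
        2.0 * (p.ts - p.team_ts)
        + c["i"] * p.ast
        + c["j"] * (p.three_par - p.lg_three_par)
        - c["k"]
    )
    return (
        c["a"] * p.re_mpg + c["b"] * p.orb + c["c"] * p.drb
        + c["d"] * p.stl + c["e"] * p.blk + c["f"] * p.ast
        - c["g"] * p.usg * p.tov
        + scoring
        + c["l"] * sqrt(p.ast * p.trb)
    )


# Hypothetical player, only used to show how a single input moves the result.
base = BoxLine(re_mpg=36.0, orb=3.0, drb=12.0, trb=7.5, stl=1.5, blk=0.5,
               ast=20.0, usg=25.0, tov=0.12, ts=0.60, team_ts=0.55,
               three_par=0.40, lg_three_par=0.20)
more_boards = replace(base, orb=base.orb + 1.0)
print(f"effect of +1 ORB%: {raw_obpm(more_boards) - raw_obpm(base):+.3f}")
```

Changing one field at a time with `replace` is an easy way to see which terms are linear (ORB%, STL%) and which ones depend on the rest of the stat line (anything inside the scoring bracket or the square root).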



BG should be a moderator
lessthanjake wrote:Kyrie was extremely impactful without LeBron, and basically had zero impact whatsoever if LeBron was on the court.

lessthanjake wrote: By playing in a way that prevents Kyrie from getting much impact, LeBron ensures that controlling for Kyrie has limited effect…
rcontador
Junior
Posts: 349
And1: 165
Joined: May 08, 2012

Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#18 » by rcontador » Tue Mar 21, 2017 5:17 am

Bad Gatorade wrote:snip


Thanks for this terrific post.

I have to say, my main response to learning about BPM is, "Wow, BPM is an unbelievably garbage stat."
Colbinii
RealGM
Posts: 31,383
And1: 19,568
Joined: Feb 13, 2013

Re: RE: Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#19 » by Colbinii » Tue Mar 21, 2017 12:10 pm

rcontador wrote:
Bad Gatorade wrote:snip


Thanks for this terrific post.

I have to say, my main response to learning about BPM is, "Wow, BPM is an unbelievably garbage stat."


It isn't. Like any stat, it shouldn't be used alone. Like any stat, we should know what it means when using it.
Ryoga Hibiki
RealGM
Posts: 11,134
And1: 6,508
Joined: Nov 14, 2001
Location: Warszawa now, but from Northern Italy

Re: RE: Re: Offensive primes using OBPM (Reggie Miller #3 all time) 

Post#20 » by Ryoga Hibiki » Tue Mar 21, 2017 6:42 pm

Colbinii wrote: It isn't. Like any stat, it shouldn't be used alone. Like any stat, we should know what it means when using it.

The problem is that this kind of all-in-one stat, like PER, is not measuring anything defined, and that makes it too noisy.
I know, for instance, what TS% is telling me, and I can integrate it with the eye test and other stats.
Once I get an all-in-one stat like OBPM with so many things adding up, what should I do to integrate it with other info? I would probably need to break it down the way Bad Gatorade did, but then it wouldn't be worth the effort.
Glory to Ukraine!
