2017-18 RAPM/RPM/etc. Thread

Moderators: Clyde Frazier, Doctor MJ, trex_8063, penbeast0, PaulieWal

SideshowBob
General Manager
Posts: 9,061
And1: 6,262
Joined: Jul 16, 2010
Location: Washington DC
 

2017-18 RAPM/RPM/etc. Thread 

Post#1 » by SideshowBob » Fri Dec 1, 2017 1:24 pm

2017 - 2018 RAPM | APBR / shadow | 1-Year / Zero Prior

2017 - 2018 RPM | ESPN / J.E. | 1-Year / Box-Score Prior

------------------------------------------------------------------------

The user shadow on APBRmetrics will be frequently updating their RAPM spreadsheet throughout the year and will include the date of the most recent update.

The parameters for shadow's RAPM are:

    Single Year
    No/Zero Prior

shadow also mentioned the following:

Yes, it's single-year, zero prior. I guess the term most people used in the past was 'vanilla RAPM'.

Total RAPM is just ORAPM - DRAPM to identify who are the best over or under players when it comes to projecting total points scored. LeBron ranks number one in Total RAPM at the moment as the best 'over' player because he's 5th in offense and 443rd on defense.

I probably will only update the file once every few weeks. I'll add a field indicating the date of the most recent update.


------------------------------------------------------------------------

+/- Family Primer

Spoiler:
[#1]APM is simple OLS. Set up every 5-on-5 matchup as a row, set it equal to the scoring margin for that matchup, and solve for every player across the league at once (I've run it for a few years; it's around 65,000 lines of 5-on-5 matchups). The resulting coefficient on each player is that player's APM value. This needs a very large sample to say anything meaningful: a single-year APM has large error terms on each coefficient, so multi-year (usually 2-year) studies are preferred.
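
To make the row setup concrete, here is a minimal Python sketch of how each 5-on-5 matchup could be encoded. The +1/-1 convention (home five positive, away five negative), the player labels, and the margins are illustrative assumptions, not taken from any actual APM run.

```python
# Toy sketch: encode 5-on-5 stints as rows of an APM design matrix.
# Player labels, stint data, and the +1/-1 encoding are illustrative assumptions.

def stint_row(home_five, away_five, players):
    """Return one design-matrix row: +1 for home players, -1 for away, 0 otherwise."""
    home, away = set(home_five), set(away_five)
    return [1 if p in home else -1 if p in away else 0 for p in players]

players = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"]

# Each stint: (home five, away five, home scoring margin for that matchup)
stints = [
    (["A", "B", "C", "D", "E"], ["G", "H", "I", "J", "K"], 4.0),
    (["A", "B", "C", "D", "F"], ["G", "H", "I", "J", "L"], -2.5),
]

X = [stint_row(home, away, players) for home, away, _ in stints]
y = [margin for _, _, margin in stints]

for row, margin in zip(X, y):
    print(row, "->", margin)
```

Stacking thousands of such rows and running least squares on them would give one coefficient per player. Under this encoding every row sums to zero, so the columns are linearly dependent, which is one reason plain OLS needs extra handling and a lot of data before the coefficients settle down.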

RAPM is essentially the same regression with one exception: it is regularized (ridge regression rather than plain OLS). It introduces what we'll call a "reference matrix": each player is given a baseline value toward which their coefficient is pulled. The point of this is to reduce the multicollinearity problem.

In [#2A]vanilla/basic RAPM, every value in the reference matrix is set to 0. The more games played, the less weight that reference of 0 has. It is almost the same as APM [#1], but the regression toward 0 in theory reduces the error within a single-season set. It's still fairly volatile, but it's better than APM [#1] within a single year. There is also [#2B]multi-year RAPM, which uses a larger number of seasons, with the most weight given to the current year and progressively less given to earlier years, again with a reference matrix of 0s.
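
For anyone who wants to see the pull toward the reference in action, below is a rough Python sketch of ridge regression shrinking toward a reference vector. It is written in pure Python for self-containment; the 2-on-2 toy stints, the penalty `lam`, and the helper names `solve` and `ridge` are all assumptions made for illustration, and this is not how shadow or anyone else actually computes RAPM.

```python
# Minimal sketch of ridge regression shrinking toward a reference vector.
# Toy data, `lam`, and helper names are illustrative assumptions.

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ridge(X, y, lam, ref=None):
    """Minimise ||y - X b||^2 + lam * ||b - ref||^2 (ref defaults to zeros)."""
    p = len(X[0])
    ref = ref or [0.0] * p
    # Normal equations: (X'X + lam I) b = X'y + lam ref
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(X, y)) + lam * ref[i] for i in range(p)]
    return solve(A, rhs)

X = [[1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]  # toy 2-on-2 stints
y = [6.0, 2.0, 4.0]                                   # home scoring margins

few = ridge(X, y, lam=2.0)            # three stints, reference of zeros
many = ridge(X * 3, y * 3, lam=2.0)   # same pattern with three times the data
print([round(b, 2) for b in few])
print([round(b, 2) for b in many])
```

With the same penalty, the coefficients from the larger sample sit further from zero than those from the smaller one, which is the "more games, less weight on the reference" behaviour in miniature.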

[#3]Prior-informed RAPM is essentially the best version of this family (in terms of out-of-sample prediction) that doesn't introduce the box score. It's built the same way as RAPM, but the reference matrix uses RAPM values from the previous year instead of setting every player to 0. Again, as the sample size of the season grows, the reference value holds less and less weight. Obviously this only works once we have multiple years of data. In the 1st year, there is no prior. In the 2nd, we have a prior, but it is vanilla RAPM [#2A]. By the 3rd year, we can use the previous year's PI RAPM to inform the current year.
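
A quick way to see the prior losing weight as the season goes on is a one-player simplification, sketched below in Python. It assumes we ignore teammates and opponents entirely, so the ridge estimate collapses to a weighted average of the observed value and the prior. The numbers for last year's value, this year's raw margin, and `lam` are made up, and in the real multi-player regression the weights do not reduce to a single clean fraction like this.

```python
# One-player simplification (assumption): each stint is a direct, noisy
# observation of the player's impact, with x = 1 on every stint.
# Minimising sum((y_i - b)^2) + lam * (b - prior)^2 gives the formula below.

def shrunk_estimate(observed_mean, n_stints, prior, lam):
    """Ridge estimate for a single coefficient pulled toward `prior`."""
    return (n_stints * observed_mean + lam * prior) / (n_stints + lam)

prior_from_last_year = 3.0   # hypothetical previous-season RAPM
observed_this_year = -1.0    # hypothetical raw margin so far this season
lam = 50.0                   # hypothetical penalty strength

for n in (10, 50, 500, 5000):
    est = shrunk_estimate(observed_this_year, n, prior_from_last_year, lam)
    prior_weight = lam / (n + lam)
    print(f"n={n:5d}  estimate={est:+.2f}  weight on prior={prior_weight:.2f}")
```

Early in the season the estimate sits close to last year's value; by the end it is driven almost entirely by the current-year data.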

[#4A]RPM is RAPM, but the reference matrix is made up of SPM values (SPM, or statistical plus-minus, is a regression of box-score metrics onto a multi-year non-box-score model such as RAPM). There is also [#4B]multi-year RPM, which is the same as multi-year RAPM, except that it presumably uses a reference matrix of multi-year SPM values.

There is also [#5]prior-informed RPM. Again, it's the same idea as PI RAPM [#3]: single-year, with a reference matrix of the prior year's RPM values.
But in his home dwelling...the hi-top faded warrior is revered. *Smack!* The sound of his palm blocking the basketball... the sound of thousands rising, roaring... the sound of "get that sugar honey iced tea outta here!"
K_chile22
RealGM
Posts: 16,354
And1: 8,333
Joined: Jul 15, 2015
   

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#2 » by K_chile22 » Fri Dec 1, 2017 3:41 pm

Either the sample is way too small or the number is just wonky. Ariza has the 3rd highest ORPM with quite a bit of cushion, OG Anunoby is fifth, Taj is 11th, Galloway has the second best DRPM and Eric Gordon the fourth
Zeitgeister
General Manager
Posts: 8,547
And1: 6,870
Joined: Nov 11, 2008
   

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#3 » by Zeitgeister » Fri Dec 1, 2017 3:56 pm

K_chile22 wrote:Either the sample is way too small or the number is just wonky. Ariza has the 3rd highest ORPM with quite a bit of cushion, OG Anunoby is fifth, Taj is 11th, Galloway has the second best DRPM and Eric Gordon the fourth


It's going to be a very noisy sample this early, but I'm not at all surprised by where Taj Gibson is; he's been the best player on the Wolves.
Lenin wrote: All over the world, wherever there are capitalists, freedom of the press means freedom to buy up newspapers, to buy writers, to bribe, buy and fake "public opinion" for the benefit of the bourgeoisie.
yoyoboy
RealGM
Posts: 15,825
And1: 19,041
Joined: Jan 29, 2015
     

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#4 » by yoyoboy » Fri Dec 1, 2017 5:17 pm

LeBron's defense. :(

443rd out of 446 players on defense...with a -1.57, only better than JR Smith, De'Aaron Fox, and Wesley Matthews.

For comparison, just 2 seasons ago (2015-16) in single-year RAPM, he finished 4th out of 480 players on defense with a +2.89, only behind Draymond Green, Tim Duncan, Kawhi Leonard, and Tony Snell.

Then he was 4th from the top. And now he's 4th from the bottom.
WoogieBoo
Ballboy
Posts: 36
And1: 23
Joined: Mar 05, 2017
   

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#5 » by WoogieBoo » Fri Dec 1, 2017 5:22 pm

The file was last updated November 21st; Bron will rise considerably when shadow decides to update.
Krodis
Lead Assistant
Posts: 4,876
And1: 599
Joined: Nov 28, 2009

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#6 » by Krodis » Fri Dec 1, 2017 5:33 pm

Harden has a pretty enormous lead in RPM so far.

SideshowBob
General Manager
Posts: 9,061
And1: 6,262
Joined: Jul 16, 2010
Location: Washington DC
 

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#7 » by SideshowBob » Fri Dec 1, 2017 5:48 pm

Krodis wrote:Harden has a pretty enormous lead in RPM so far.



Here's 2017 RPM at 25% of the regular season and the full season (RS + PS):

Dec 8th, 2016

April 15, 2017 (End of Regular Season)

RPM after Finals
ThePersianFreak
Suspended
Posts: 1,533
And1: 1,072
Joined: Nov 02, 2012

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#8 » by ThePersianFreak » Fri Dec 1, 2017 7:37 pm

yoyoboy wrote:LeBron's defense. :(

443rd out of 446 players on defense...with a -1.57, only better than JR Smith, De'Aaron Fox, and Wesley Matthews.

For comparison, just 2 seasons ago (2015-16) in single-year RAPM, he finished 4th out of 480 players on defense with a +2.89, only behind Draymond Green, Tim Duncan, Kawhi Leonard, and Tony Snell.

Then he was 4th from the top. And now he's 4th from the bottom.


His updated rank is 251.
dhsilv2
RealGM
Posts: 48,285
And1: 25,836
Joined: Oct 04, 2015

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#9 » by dhsilv2 » Sat Dec 2, 2017 1:01 am

Wish we had a RAPM with a RAPM prior instead of the box score prior.
thekdog34
Starter
Posts: 2,354
And1: 782
Joined: Jul 13, 2009
     

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#10 » by thekdog34 » Sat Dec 2, 2017 1:08 am

dhsilv2 wrote:Wish we had a RAPM with a RAPM prior instead of the box score prior.


Isn't that just multi-year RAPM?
dhsilv2
RealGM
Posts: 48,285
And1: 25,836
Joined: Oct 04, 2015

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#11 » by dhsilv2 » Sat Dec 2, 2017 1:25 am

thekdog34 wrote:
dhsilv2 wrote:Wish we had a RAPM with a RAPM prior instead of the box score prior.


Isn't that just multi-year RAPM?


I dunno anymore. I keep hearing different responses based on what I've read so I'll just not try anymore. I'm pretty sure if I hear no prior there's nothing else, but maybe that's wrong too.
SideshowBob
General Manager
Posts: 9,061
And1: 6,262
Joined: Jul 16, 2010
Location: Washington DC
 

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#12 » by SideshowBob » Sat Dec 2, 2017 1:48 am

thekdog34 wrote:
dhsilv2 wrote:Wish we had a RAPM with a RAPM prior instead of the box score prior.


Isn't that just multi-year RAPM?


No they are distinct. Read my primer above.
Jaivl
Head Coach
Posts: 6,958
And1: 6,601
Joined: Jan 28, 2014
Location: A Coruña, Spain
Contact:
   

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#13 » by Jaivl » Sat Dec 2, 2017 3:10 am

SideshowBob wrote:
thekdog34 wrote:
dhsilv2 wrote:Wish we had a RAPM with a RAPM prior instead of the box score prior.


Isn't that just multi-year RAPM?


No they are distinct. Read my primer above.

I thought it was clear that, for practical purposes, they are mostly equivalent.
This place is a cesspool of mindless ineptitude, mental decrepitude, and intellectual lassitude. I refuse to be sucked any deeper into this whirlpool of groupthink sewage. My opinions have been expressed. I'm going to go take a shower.
dhsilv2
RealGM
Posts: 48,285
And1: 25,836
Joined: Oct 04, 2015

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#14 » by dhsilv2 » Sat Dec 2, 2017 3:33 am

Jaivl wrote:
SideshowBob wrote:
thekdog34 wrote:
Isn't that just multi-year RAPM?


No they are distinct. Read my primer above.

I thought it was clear that, for practical purposes, they are mostly equivalent.


Prior-informed RAPM is supposed to be better. Clearly it has flaws though.
SideshowBob
General Manager
Posts: 9,061
And1: 6,262
Joined: Jul 16, 2010
Location: Washington DC
 

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#15 » by SideshowBob » Sat Dec 2, 2017 3:40 am

Jaivl wrote:
SideshowBob wrote:
thekdog34 wrote:
Isn't that just multi-year RAPM?


No they are distinct. Read my primer above.

I thought it was clear that, for practical purposes, they are mostly equivalent.


That is what I have been told, but I still see them as distinct.

One is single-year data with a reversion to last year's values of the same stat (with the reversion factor or lambda determined by the individual).

The other is multi-year data with a reversion to zero, but here, in addition to the lambda, the weights of the prior years are also being selected by the individual running the data.

They can be set up to have similar/identical predictive power, but I would not call them the same.

Also, with regards to the lambda, I say determined, but a more proper term would be derived. It is not arbitrarily chosen.
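
As a rough illustration of what "derived" could look like, here is a small Python sketch that picks lambda by k-fold cross-validation. To be clear, cross-validation is my assumption about the derivation method, and the one-coefficient ridge, the toy data, and the candidate lambdas are stand-ins for the full player regression.

```python
# Sketch: pick lambda by k-fold cross-validation rather than by hand.
# Assumptions: cross-validation is the derivation method, and a
# one-coefficient ridge stands in for the full player regression.

def fit_ridge_1d(xs, ys, lam, prior=0.0):
    """Minimise sum((y - b*x)^2) + lam*(b - prior)^2 for a single coefficient b."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (sxy + lam * prior) / (sxx + lam)

def cv_error(xs, ys, lam, k=4):
    """Mean squared out-of-fold prediction error for one lambda."""
    total = 0.0
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        b = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        total += sum((ys[i] - b * xs[i]) ** 2 for i in test)
    return total / len(xs)

def derive_lambda(xs, ys, candidates, k=4):
    """Return the candidate lambda with the lowest cross-validated error."""
    return min(candidates, key=lambda lam: cv_error(xs, ys, lam, k))

# Toy data: x is +1/-1 (which side the player was on), y is the stint margin.
xs = [1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1]
ys = [5.0, 1.0, -3.0, 8.0, -6.0, 2.0, 4.0, -7.0, 0.0, 3.0, 6.0, -2.0]
candidates = [0.0, 1.0, 5.0, 20.0, 100.0]

best = derive_lambda(xs, ys, candidates)
print("lambda chosen by cross-validation:", best)
```

The same loop applies whether the reversion target is zero or last year's values; only the `prior` passed to the fit would change.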
Doctor MJ
Senior Mod
Posts: 52,396
And1: 21,339
Joined: Mar 10, 2005
Location: Cali
     

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#16 » by Doctor MJ » Sat Dec 2, 2017 6:51 pm

dhsilv2 wrote:
Jaivl wrote:
SideshowBob wrote:
No they are distinct. Read my primer above.

I thought it was clear that, for practical purposes, they are mostly equivalent.


Prior-informed RAPM is supposed to be better. Clearly it has flaws though.


My take:

The issue with box-score priors is that they give you a result that has ALREADY factored in the box score. While that's a good thing if you're trying to make "one stat to rule them all", it makes the stat less useful to an analyst trying to use many different tools to gain a more complete picture of what's happening.

To put a ranking to it:

Used in isolation:
1. "anything goes"-prior RAPM (RPM being the industry leader on that)
2. "prior-informed" RAPM (by which we mean a prior made using +/- stats)
3. Non-prior-informed RAPM
4. APM (the precursor to RAPM, does not use regularization)
5. Less sophisticated variations of +/- stats.

But when used in conjunction with all the other tools available to us, #2, 3 & 4 all easily leap ahead of #1 in my book. Publicly available +/- stats took a really damaging turn when they focused too much on the idea of independent optimization. The value of a stat was always dependent on the ability of the analyst to take meaning from the stats which could then be used to perform research on the questions it brought up, and when you muck up a stat like they've done, you turn it into a black box.
Getting ready for the RealGM 100 on the PC Board

Come join the WNBA Board if you're a fan!
dhsilv2
RealGM
Posts: 48,285
And1: 25,836
Joined: Oct 04, 2015

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#17 » by dhsilv2 » Sat Dec 2, 2017 7:34 pm

I think this is dead on. The litmus test for this stat has traditionally been whether it is predictive of future results. Do you think that should be the objective?
Doctor MJ
Senior Mod
Posts: 52,396
And1: 21,339
Joined: Mar 10, 2005
Location: Cali
     

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#18 » by Doctor MJ » Sat Dec 2, 2017 7:45 pm

Nope, and I've been shouting this everywhere I can ever since the statistician in question (Engelmann) started using box-score priors with what he called XRAPM, before it eventually morphed into Real Plus-Minus when ESPN decided to make it the de facto +/- stat. Up until that point it didn't really matter much whether a stat-maker was more focused on optimization than on true utility, because improvements in the former could at least be argued to be improvements in the latter. From that moment on, though, it's been clear that a lot of important thinkers in the basketball analytics community had never really understood why these stats mattered in the first place.

But, of course, they've only become more important since then, and I'm just some dude. Whaddya gonna do? :dontknow:

I'm still very optimistic about the future at least in terms of what the NBA is using. I have no doubt franchises are going to use player tracking data along with techniques like regression (that all the APM stats are based on) and come up with ever superior proprietary tools. The question is about what the public is going to get.
dhsilv2
RealGM
Posts: 48,285
And1: 25,836
Joined: Oct 04, 2015

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#19 » by dhsilv2 » Sat Dec 2, 2017 7:54 pm

The fact that the NBA seems to be tracking a bigger box score than is being published has been rather striking to me in the last few years. Things like deflections don't seem to come from the player tracking data set, which I assume means they're being tracked and not listed in box scores. This all leads me to assume there will be GREAT analytics in the future. Of course we'll never get past the constant complaints about using numbers to explain a sport....because people hate math (and I'm talking the elementary school stuff, not even the high school stuff).

Anyway, off the soap box. Why is predicting future results not the goal of these metrics? Or would you rather they use this metric with other data set(s) and be transparent about how they are constructing them?

I ask as I'd *think* PPG is a reasonable metric for future points per game on a team. It's not ideal, but if I put 5 guys who scored 20 a game together, I'd expect to get more points per game than a team of 10 point per game scorers. With RAPM if I put a group of guys who are high in this metric, I'd expect a team that wins more than with lower guys. Without some way to test if the results we get are reasonable how do we test that we have meaningful results? Do you have a better test?

Agree that xRAPM is just double counting the box scores btw. It sorta creates a more expected data set while ignoring what I think is the power of RAPM in that it can identify guys who play roles with huge value that doesn't really register on the box score.
Doctor MJ
Senior Mod
Posts: 52,396
And1: 21,339
Joined: Mar 10, 2005
Location: Cali
     

Re: 2017-18 RAPM/RPM/etc. Thread 

Post#20 » by Doctor MJ » Sat Dec 2, 2017 8:32 pm

dhsilv2 wrote:Anyway, off the soap box. Why is predicting future results not the goal of these metrics? Or would you rather they use this metric with other data set(s) and be transparent about how they are constructing them?


When people talk about prediction, they talk about using the stat in a vacuum...but no one uses the stats in a vacuum. I'm saying that the goal always needs to be about utility.

Now, it's easier to measure in a vacuum, and that means I actually like seeing analyses of how well metrics do in a vacuum. It's worth something, it's just not everything.

dhsilv2 wrote:I ask as I'd *think* PPG is a reasonable metric for future points per game on a team. It's not ideal, but if I put 5 guys who scored 20 a game together, I'd expect to get more points per game than a team of 10 point per game scorers. With RAPM if I put a group of guys who are high in this metric, I'd expect a team that wins more than with lower guys. Without some way to test if the results we get are reasonable how do we test that we have meaningful results? Do you have a better test?


I'm all for using that test, I just won't grant that test supremacy when I make my judgments. It's something, it's not everything.

dhsilv2 wrote:Agree that xRAPM is just double counting the box scores btw. It sorta creates a more expected data set while ignoring what I think is the power of RAPM in that it can identify guys who play roles with huge value that doesn't really register on the box score.


What's maddening about it to me is that, given the nature of priors and regressions, you can't say that a prior is getting X% of the overall weight; this isn't some simple arithmetic equation. Even if you could for a given player, different players in the set would have their scores affected by the prior differently. There's just no way to adjust for it, so the whole thing becomes a black box that doesn't mix well with other tools.
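
Here's a small Python sketch of that point, under toy assumptions. It runs a ridge regression with a prior on three players in made-up 1-on-1 stints, then nudges one player's prior at a time to see how far his own estimate moves. The data, `lam`, the helper names, and the "own-prior weight" label are all mine for illustration; nothing here reflects how RPM is actually computed.

```python
# Sketch: how strongly each player's estimate leans on his own prior value.
# Toy 1-on-1 stints, `lam`, and helper names are illustrative assumptions.

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ridge(X, y, lam, ref):
    """Minimise ||y - X b||^2 + lam * ||b - ref||^2 via the normal equations."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(X, y)) + lam * ref[i] for i in range(p)]
    return solve(A, rhs)

# Columns are players P0, P1, P2; +1 = home side, -1 = away side, 0 = off court.
X = [[1, -1, 0], [1, -1, 0], [1, -1, 0], [1, 0, -1]]
y = [3.0, 1.0, 2.0, 6.0]
lam = 1.0
ref = [0.0, 0.0, 0.0]
base = ridge(X, y, lam, ref)

weights = []
for i in range(3):
    bumped = ref[:]
    bumped[i] += 1.0  # raise only player i's prior by one point
    weights.append(ridge(X, y, lam, bumped)[i] - base[i])
    print(f"P{i}: own-prior weight = {weights[i]:.3f}")
```

Even with one shared lambda, the three players lean on their priors by different amounts, and the player with the fewest stints leans hardest. Bumping one player's prior also shifts the other players' estimates, which this sketch does not print.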
