Normalized & Scaled RAPM Chronology Spreadsheet

Moderator: Doctor MJ

Doctor MJ
Senior Mod
Senior Mod
Posts: 45,346
And1: 14,305
Joined: Mar 10, 2005
Location: Cali
     

Normalized & Scaled RAPM Chronology Spreadsheet 

Post#1 » by Doctor MJ » Wed Apr 2, 2014 3:26 am

Upon request I've put up the normalized RAPM data I made using Engelmann's and acrossthecourt's data (thanks also to colts18 and whoever else I'm forgetting). It is here:

https://docs.google.com/spreadsheet/ccc ... qaFE#gid=0

Things to note:

-"Normalized" means I divided each run's values by that run's standard deviation, to adjust for the year-to-year differences in scale each time the algorithm was run.

-I was motivated to put it all in a more readable format so I made sheets where each column represents data from a different year. So you can now see the overall RAPM data for a player for every year we have all on one row of one sheet. Same for offense, same for defense.

-Now that I've done this I'll probably add to it with some other things. One of which will be a scaled version where I de-normalize the data using the standard deviation from an APM study (Ilardi's 6 probably). Open to suggestions about better ways to do it that are still simple.

-Data is sorted by the last column, "dumb sum", which simply adds up each player's numbers for all the years. I did this primarily so I could see all the heavy hitters at the top and note any data errors (I'm sure I didn't catch them all). It's a flawed way of ranking the players certainly, though I do think it's interesting to look at.

-I think the most useful way to look at data like this is to look for the general standard a player could regularly reach, rather than trying to go by peaks, or by tallying totals.

-Finally, some of this data no longer matches the numbers Engelmann has up, and I've held off on any '13 or '14 data simply because I haven't paid close enough attention to have confidence in them yet. I'm open to your feedback on updating the data. Obviously I got paranoid when Engelmann started on his xRAPM bender.

Let me know your thoughts.

Cheers,
Doc
Hey: With what's going on in the world, my fuse is shorter than it used to be, and it's leading me to lose my cool and then go on self-imposed breaks from things (such as RealGM). Please try to keep it civil, and I'll be looking to do the same.
Gregoire
Veteran
Posts: 2,921
And1: 360
Joined: Jul 29, 2012

Re: Normalized RAPM Chronology Spreadsheet 

Post#2 » by Gregoire » Wed Apr 2, 2014 9:06 am

Impressed with MJ's 1998 RAPM: 3.51. It's not a typo? That was his last year in Chicago, as a 34-year-old, and... Duncan topped it once over his career, Shaq twice, LeBron twice... I wonder what RAPM MJ had from '87-'96?
nate33 wrote:

Yeah, whenever I make all time comparisons, I pretty much ignore the pre-3PT-line era. The game was so different then. It's apples and oranges. Those guys may be better or may be worse, we're never really going to know.

Re: Normalized RAPM Chronology Spreadsheet 

Post#3 » by Doctor MJ » Thu Apr 3, 2014 4:42 am

Update: I fixed some errors with the same player being split based on the format of his name.

Additionally, I added columns to the right with each player's 5 best years, and now the data is sorted based on summing those years instead of summing all the years. Still not a player ranking metric I'd care to defend, but I think it's closer to what can reasonably be taken from this data.

Some observations:

Since I would presume someone would care, here's how the players end up sorted (imperfect as it is of course):

Overall: Garnett, LeBron, Shaq, Duncan, Dirk
Offense: Nash, LeBron, Wade, Shaq, Kobe
Defense: Mutombo, Garnett, Duncan, Robinson, Collins

Of those, the Defense clearly stands out to me.

1st, Mutombo has the lead here even though we've only covered 1 of his 4 DPOY seasons. Crazy.

2nd, Collins is one of those "um, have I mentioned we realize RAPM is imperfect" things. RAPM data continually shows Collins as a very nice defender, but he's got one crazy high RAPM year and it distorts a not-so-smart sorting like this.
colts18
Head Coach
Posts: 7,226
And1: 2,997
Joined: Jun 29, 2009

Re: Normalized RAPM Chronology Spreadsheet 

Post#4 » by colts18 » Thu Apr 3, 2014 11:51 pm

Doctor MJ wrote:2nd, Collins is one of those "um, have I mentioned we realize RAPM is imperfect" things. RAPM data continually shows Collins as a very nice defender, but he's got one crazy high RAPM year and it distorts a not-so-smart sorting like this.

Why is Jason Collins ranking high an indictment of RAPM? In fact it proves that it's a valuable stat, because it picks up his contributions. He is a top-flight man defender and does all the little things that the box score doesn't pick up.

Comparison between Collins, Duncan, and Garnett since 2001:
On court defensive rating:
Collins 99.6
Duncan 99.2
Garnett 101.9

Difference between on court D rating and off court D rating (negative is good):
Collins -6.7
Duncan -4.6
Garnett -4.3


Collins is noted for having a low rebound rate, but he is one of the best at producing rebounds. Here is the list of players from 2001-2014 sorted by best on court defensive rebounding%. Jason Collins is #1 on the list (Duncan is 8th):

http://www.basketball-reference.com/pla ... by=drb_pct

In 2005, Collins put up an absurd defensive season. He had the best prior-informed and non-prior-informed defensive RAPM ever recorded:

https://sites.google.com/site/rapmstats ... nsive-rapm

In that season, the Nets had a 98.7 D rating with him on the court. With him off, they had a 110.5 D rating. That's a -11.8 difference. They went from the equivalent of the best defense in the league with him on the court to the 3rd worst defense with him off.

Re: Normalized RAPM Chronology Spreadsheet 

Post#5 » by Doctor MJ » Fri Apr 4, 2014 1:16 am

colts18 wrote:
Doctor MJ wrote:2nd, Collins is one of those "um, have I mentioned we realize RAPM is imperfect" things. RAPM data continually shows Collins as a very nice defender, but he's got one crazy high RAPM year and it distorts a not-so-smart sorting like this.

Why is Jason Collins ranking high an indictment of RAPM? In fact it proves that it's a valuable stat, because it picks up his contributions. He is a top-flight man defender and does all the little things that the box score doesn't pick up.

Comparison between Collins, Duncan, and Garnett since 2001:
On court defensive rating:
Collins 99.6
Duncan 99.2
Garnett 101.9

Difference between on court D rating and off court D rating (negative is good):
Collins -6.7
Duncan -4.6
Garnett -4.3


Collins is noted for having a low rebound rate, but he is one of the best at producing rebounds. Here is the list of players from 2001-2014 sorted by best on court defensive rebounding%. Jason Collins is #1 on the list (Duncan is 8th):

http://www.basketball-reference.com/pla ... by=drb_pct

In 2005, Collins put up an absurd defensive season. He had the best prior-informed and non-prior-informed defensive RAPM ever recorded:

https://sites.google.com/site/rapmstats ... nsive-rapm

In that season, the Nets had a 98.7 D rating with him on the court. With him off, they had a 110.5 D rating. That's a -11.8 difference. They went from the equivalent of the best defense in the league with him on the court to the 3rd worst defense with him off.


I'm not ashamed that a non-superstar scored so high, I'm just skeptical whenever I see one year aberrations. That's the level where to me RAPM is basically telling us "Yeah, you know how the odds are low that you'll get a distorted value in any given year for a player? Well, low doesn't mean zero."

Three things:

1) We've seen what happens when guys make mega leaps from one year to the next in their play in this metric. They get horribly underrated.

2) We've seen what happens when guys have huge falloffs from one year to the next in their play in this metric. They get horribly overrated.

3) We've seen various guys with this metric have one mega year that no one who watches the game thinks is quite that good. Inevitably they come back to earth again.

The remainder of Collins' top years paint him as an excellent defender, and I 100% believe in them. But when you have a non-superstar defender see his DRAPM more than double in one year and result in the 2nd best single-year rating we've ever seen by anyone, you've got to hold on to your wallet.

Re: Normalized RAPM Chronology Spreadsheet 

Post#6 » by Doctor MJ » Sat Apr 5, 2014 12:25 am

Gregoire wrote:Impressed with MJ's 1998 RAPM: 3.51. It's not a typo? That was his last year in Chicago, as a 34-year-old, and... Duncan topped it once over his career, Shaq twice, LeBron twice... I wonder what RAPM MJ had from '87-'96?


No typo that I know of. It is indeed impressive and reassuring. I think one of the things that's interesting about it is that while it's offense-dominated, the defense is part of it too. The offensive rating there is right about where peak Kobe is, but old man Jordan has a near 1 standard deviation edge on the basis of him still having very solid impact on defense. The kind of solid impact, interestingly, that we basically never see from Kobe in his entire career by these metrics.

I think it's fun looking at the '90s peak guys to see how they are doing in their old age.

Jordan's +3.5 in his mid-30s is pretty jaw dropping. It is worth noting in the other direction how meh he was in Washington.

It's also worth noting that Robinson, noted for his lack of longevity as a megastar because of his late start and his eventual injury, remains a +2.0 level player all the way to the bitter end when he's 37 years old.

Have to mention Stockton of course. His numbers are very scattered, but if his first two numbers represent his prime, then he has indeed been underrated by some such as myself.

Along with him is Malone, and it's very interesting to contrast. To me Malone has always had a solid reputation as a defender, but what we seem to be seeing is Stockton ranking ahead of Malone here, and not because he's "the real force behind the offense". No, Malone is clearly the offensive MVP, but Stockton's defensive edge over his co-star is much bigger than I would have expected.

Sabonis' regular +2.0 ratings in his 30s, long after his injuries, are eye-opening.

Ewing is interesting. We've long had Ewing Theory and it exists because other guys were at least temporarily able to replace Ewing. It's a bit of a given that no one else on that team could replace his defense, and the numbers here say just that. But his offensive numbers make him look utterly neutral in impact on offense while being his team's 20+ PPG guy. (And that's crazy. I'm known for knocking Melo & AI and the numbers here show why they deserve it, but it's not because they aren't great offensive players, they just aren't the superstars people assume they are. That's very different from actually being a neutral presence out there.) That I think really shows how clear it is that you don't want to simply assume that you should build your offense around your big. Ewing was much more talented than most of our all-star bigs, and yet for most of his career, he probably shouldn't have been shooting enough to score 20 PPG.

Hakeem and Barkley end up sorted right next to each other, which is doubly interesting because they were playing as teammates. It's rather astonishing that even when Barkley sacrifices his offense to play around Hakeem, he's still getting rated far higher than Hakeem on offense. Hakeem's defensive edge is clearly enough to make up for it, but with Barkley it really appears his offense is as good as any imagined and his defense is as bad.

Comparing Hakeem to other bigs of his era, he doesn't come off that amazing here. It's not peak vs peak of course, but both Robinson and Ewing seem to be maintaining their defense more with age than Hakeem did, and that's before we consider that Mutombo at least to this point is clearly setting the bar on that end of the court. Hakeem's defensive impact had much to do with near unprecedented agility and coordination for his size, so perhaps he didn't age as well on that front as the bigger guys. It wouldn't necessarily be an indictment on his prime.
drza
Analyst
Posts: 3,506
And1: 1,819
Joined: May 22, 2001

Re: Normalized RAPM Chronology Spreadsheet 

Post#7 » by drza » Sat Apr 5, 2014 2:38 am

Doctor MJ wrote:
Gregoire wrote:Impressed with MJ's 1998 RAPM: 3.51. It's not a typo? That was his last year in Chicago, as a 34-year-old, and... Duncan topped it once over his career, Shaq twice, LeBron twice... I wonder what RAPM MJ had from '87-'96?


No typo that I know of. It is indeed impressive and reassuring. I think one of the things that's interesting about it is that while it's offense-dominated, the defense is part of it too. The offensive rating there is right about where peak Kobe is, but old man Jordan has a near 1 standard deviation edge on the basis of him still having very solid impact on defense. The kind of solid impact, interestingly, that we basically never see from Kobe in his entire career by these metrics.


This data is full of stories like this. Some of my initial impressions:

1) I am overjoyed with the strength of the "sniff test" results here. In all three facets, the players that you would expect to be among the top are among the top. The order might be different than expected in some instances, and there are a couple of surprises, but for the most part it's amazing that a stat with no boxscore influence and no user input was able to output consistent lists of who we thought were the best players.

2) I also love the Jordan example you mentioned, and the comparison with Kobe at a similar age vis-a-vis defensive impact and game completeness. And that his score in his walk-off year puts him in the top-10 peaks is outstanding.

3) That LeBron and Shaq had the two highest 1-year peaks is nice and fitting.

4) As I mentioned in the other thread, now having at least some RAPM values for effectively 4 generations of NBA players really helps solidify the stat.

5) Manu Ginobili, Rasheed Wallace and Ron Artest show up as players whose scores mark them as higher impact than even their All Star status would have suggested.

6) Really interesting to see KG's and Duncan's early scores. I remember when the feeling was that the timing of us getting APM data starting in 2003 might have hit KG's peak and missed Duncan's, and thus made KG look better in the comparison, and that if we'd had the early data it would have tilted the comparison more in Duncan's favor. But the early numbers have KG ahead of Duncan even back in the 90s, which I don't think most expected.

7) On the flip side, I feel a bit justified in making a similar argument about Kidd and Pierce that the numbers do bear out...that Pierce was a strong contributor, but when the numbers from Kidd's prime are included he clearly measures out better.

8) On the whole, I can't help but think that if KG and Kobe were swapped in the standings, the amount of resistance that RAPM faces in getting traction around here would completely disappear.

Fun stuff.
Creator of the Hoops Lab: tinyurl.com/mpo2brj
Contributor to NylonCalculusDOTcom
Contributor to TYTSports: https://www.youtube.com/playlist?list=PLTbFEVCpx9shKEsZl7FcRHzpGO1dPoimk
Follow on Twitter: @ProfessorDrz

Re: Normalized RAPM Chronology Spreadsheet 

Post#8 » by Doctor MJ » Sat Apr 5, 2014 5:02 am

drza wrote:On the whole, I can't help but think that if KG and Kobe were swapped in the standings, the amount of resistance that RAPM faces in getting traction around here would completely disappear.


Yup. So much of this is about the 'sniff test'. Data too far away from our expectations causes cognitive dissonance. I personally tend to justify my hesitance in terms of a recognition of the existence of noise, but one could argue it's not much different than the reluctance of a typical homer.

Folks like us don't have an issue with a stat telling us Garnett was better than Kobe because we have a better sense for what that particular "ballpark" looks like and a recognition of what the limitations in precision were previously. Others don't realize this, and hence it doesn't take much for them to reach a point where acceptance of RAPM data makes their schema fall apart like a house of cards, and they understandably resist this. Of course to me you don't get analysis at all - whether it's about basketball or anything else - if you let yourself be that fragile. It smacks of just running with conclusions without any sense of the appropriate amount of confidence, and hence when you're proven wrong having to admit you're not wrong about 1 thing, but wrong about everything you so vehemently spoke on.

For me the guy I most worry about being a homer about is Nash, so I approach this data trying to ask myself, "What assumptions was I making before that I now need to check?" I'll be proud of myself if I rank him lower as a result of learning more from evidence, and ashamed if I cling to tired views based on emotional inertia.

Of course as I say this, I don't want to change any opinions with too much haste. So when I look at this new data, I'm cautious in how I use it to change my mind. While I've always loved me some Mookie (best name ever), I'm not quite sold on what the data is telling us here yet.

Re: Normalized RAPM Chronology Spreadsheet 

Post#9 » by Doctor MJ » Sun Apr 13, 2014 9:02 pm

I have updated the spreadsheet with 3 additional sheets called "Scaled". Here's what they are:

The original sheets normalize the different RAPM studies by dividing by the standard deviation of each study. Because RAPM loses its direct relationship to points on a scoreboard as a consequence of the machine learning process, no two studies have the same relationship to the scoreboard or to each other. If we can assume though that the relative variance across NBA players remains basically the same from year to year, then normalizing the data based on standard deviation seems like a good way to get an approximation of actual apples-to-apples comparisons.

Pure APM on the other hand never loses this relationship to the scoreboard, so if we simply take a good APM study's standard deviation and then apply it to the normalized data, that should give us a decent approximation of how much lift a player is truly giving relative to average.

I used Ilardi's 6-year study, which gave a standard deviation of roughly 2.97. Here's the updated spreadsheet with the scaled overall RAPM metric highlighted:

https://docs.google.com/spreadsheet/ccc ... qaFE#gid=3
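
For anyone who wants to replicate the two steps outside the spreadsheet, here is a minimal Python sketch of what's described above: divide each study by its own standard deviation, then multiply by the APM study's standard deviation. The player names and ratings are made up, the function names are just illustrative, and using the population standard deviation (`pstdev`) is an assumption, since that choice isn't spelled out here.

```python
from statistics import pstdev

ILARDI_SD = 2.97  # standard deviation of Ilardi's 6-year APM study, as quoted above


def normalize(study):
    """Divide every rating in one RAPM run by that run's standard deviation."""
    # Assumption: population SD; a sample SD would also be defensible.
    sd = pstdev(list(study.values()))
    return {player: value / sd for player, value in study.items()}


def scale(normalized, apm_sd=ILARDI_SD):
    """Put normalized ratings back on an APM-like points scale."""
    return {player: value * apm_sd for player, value in normalized.items()}


# Made-up ratings for two hypothetical runs with very different spreads.
run_a = {"Player A": 4.0, "Player B": 0.0, "Player C": -4.0}
run_b = {"Player A": 2.0, "Player B": 0.0, "Player C": -2.0}

norm_a, norm_b = normalize(run_a), normalize(run_b)
scaled_a = scale(norm_a)
```

Both hypothetical runs normalize to the same values even though one has twice the raw spread, which is the point of the normalization step.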

And just so we have something down for the record here, I'll put some data down. I've got guys currently sorted by their 3rd best season, simply because I'm seeing some guys with two big seasons in the early data that I'm not as sure about. There's nothing magic about this choice of mine, but sorted this way here's what the leaderboard looks like for overall, offense and defense, and I'm putting down the guy's best year as well:

Code: Select all

Overall Scaled RAPM Best  3rd
1. Kevin Garnett    12.65 10.75
2. Shaquille O'Neal 12.82  9.67
3. LeBron James     13.74  9.45
4. Tim Duncan       11.39  9.16
5. Manu Ginobili     9.78  8.82

Offense Scaled RAPM Best  3rd
1. Steve Nash       10.22  8.83
2. LeBron James      9.95  8.76
3. Kobe Bryant       7.92  7.70
4. Shaquille O'Neal  8.59  7.68
5. Dwyane Wade      10.65  7.57

Defense Scaled RAPM Best  3rd
1. Dikembe Mutombo   9.74  8.11
2. Kevin Garnett     7.44  6.90
3. Tim Duncan        6.78  6.38
4. David Robinson    6.78  6.18
5. Bo Outlaw         6.09  5.69
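
As a rough sketch of the sorting described above (by 3rd best season), next to the earlier 5-best-years sum, something like the following Python would do it. The ratings are invented for illustration, and dropping players with fewer than three seasons is an assumption rather than necessarily how the spreadsheet handles them.

```python
def nth_best(seasons, n):
    """Return the n-th best season value (n=1 is the peak), or None if too few seasons."""
    ranked = sorted(seasons, reverse=True)
    return ranked[n - 1] if len(ranked) >= n else None


def top_k_sum(seasons, k=5):
    """Sum of the k best seasons (or of all seasons if there are fewer than k)."""
    return sum(sorted(seasons, reverse=True)[:k])


# Hypothetical scaled ratings, not taken from the spreadsheet.
players = {
    "Player A": [12.0, 3.0, 2.5, 2.0],  # one monster year, little else
    "Player B": [9.0, 8.5, 8.0, 7.5],   # consistently high
    "Player C": [10.0, 9.5],            # only two seasons on record
}

# Assumption: players without a 3rd season are left out of the ranking.
by_third = sorted(
    (name for name, s in players.items() if nth_best(s, 3) is not None),
    key=lambda name: nth_best(players[name], 3),
    reverse=True,
)
```

Sorting on the 3rd best season puts the consistent Player B ahead of Player A, whose single huge year no longer carries the ranking.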

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#10 » by Gregoire » Tue Sep 16, 2014 12:37 pm

Could anybody adjust and integrate the RAPM of '94-'97 into Doctor MJ's Normalized & Scaled RAPM Chronology Spreadsheet?
dice
RealGM
Posts: 38,765
And1: 10,801
Joined: Jun 30, 2003
Location: chicago

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#11 » by dice » Fri Sep 19, 2014 3:13 pm

simple analysis that shows single year RAPM is to be taken with a super-sized grain of salt:

the pacers and bulls were 1/2 in defense last season. however, when you sum individual DRAPM*possessions for each team, the pacers total dwarfs the bulls total
"'tupelo honey' has always existed. van was the vessel and the earthly vehicle for it" - bob dylan
blabla
Sophomore
Posts: 145
And1: 73
Joined: May 23, 2012

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#12 » by blabla » Fri Sep 19, 2014 7:39 pm

dice wrote:simple analysis that shows single year RAPM is to be taken with a super-sized grain of salt:

the pacers and bulls were 1/2 in defense last season. however, when you sum individual DRAPM*possessions for each team, the pacers total dwarfs the bulls total

Which RAPM source are you using?
(and, just in case, did you make sure not to use minute *totals* for players that were traded mid-season, like Evan Turner?)

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#13 » by dice » Fri Sep 19, 2014 7:46 pm

blabla wrote:
dice wrote:simple analysis that shows single year RAPM is to be taken with a super-sized grain of salt:

the pacers and bulls were 1/2 in defense last season. however, when you sum individual DRAPM*possessions for each team, the pacers total dwarfs the bulls total

Which RAPM source are you using?
(and, just in case, did you make sure not to use minute *totals* for players that were traded mid-season, like Evan Turner?)

used engelmann. both 'pure RAPM' and xRAPM. similar results. didn't include partial seasons of deng and turner. nor scrubs, though i can't imagine those factors would significantly close the gap

pure RAPM (columns: DRAPM, possessions, player, DRAPM*possessions):

bulls:
2.3 6338 taj 14577.4
0.6 1062 rose 637.2
1.1 6799 mdj 7478.9
1.3 7295 noah 9483.5
2.2 5463 kirk 12018.6
0.8 6272 jimmy 5017.6
-1 5975 booz -5975
-2.5 4901 dj -12252.5
-2.1 3785 snell -7948.5
sum = 23037.2

pacers:
2.3 8326 george 19149.8
0.2 8047 stephenson 1609.4
2.3 6918 hibbert 15911.4
1.3 7014 west 9118.2
1.7 6559 hill 11150.3
1.3 4016 scola 5220.8
1.9 3753 mahinmi 7130.7
0.2 4310 watson 862
sum = 70152.6

xRAPM (same columns):

bulls:
2.5 7087 taj 17717.5
-1 1062 rose -1062
0.9 7763 mdj 6986.7
2.5 8371 noah 20927.5
1.3 6310 kirk 8203
0.4 7488 jimmy 2995.2
-0.7 6771 booz -4739.7
-4.8 5791 dj -27796.8
-1.9 4045 snell -7685.5
sum = 15545.9

pacers:
1.7 9357 george 15906.9
-0.4 9079 stephenson -3631.6
3.2 7776 hibbert 24883.2
2.4 7979 west 19149.6
0.3 7620 hill 2286
0.6 4476 scola 2685.6
4.2 4062 mahinmi 17060.4
-1.2 4341 watson -5209.2
sum = 73130.9

theoretically, dj augustin alone should have caused a significant drop in the team defense rankings for the bulls. he didn't
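
The totals above can be reproduced with a few lines of Python. This sketch uses the pure RAPM figures exactly as listed; the variable and function names are just illustrative.

```python
# (DRAPM, possessions) pairs copied from the pure RAPM lists above.
bulls = {
    "taj": (2.3, 6338), "rose": (0.6, 1062), "mdj": (1.1, 6799),
    "noah": (1.3, 7295), "kirk": (2.2, 5463), "jimmy": (0.8, 6272),
    "booz": (-1.0, 5975), "dj": (-2.5, 4901), "snell": (-2.1, 3785),
}
pacers = {
    "george": (2.3, 8326), "stephenson": (0.2, 8047), "hibbert": (2.3, 6918),
    "west": (1.3, 7014), "hill": (1.7, 6559), "scola": (1.3, 4016),
    "mahinmi": (1.9, 3753), "watson": (0.2, 4310),
}


def weighted_total(team):
    """Sum of DRAPM * possessions over every listed player."""
    return sum(drapm * poss for drapm, poss in team.values())


bulls_total = weighted_total(bulls)
pacers_total = weighted_total(pacers)
ratio = pacers_total / bulls_total
```

On these numbers the Pacers total comes out at roughly three times the Bulls total.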
"'tupelo honey' has always existed. van was the vessel and the earthly vehicle for it" - bob dylan

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#14 » by blabla » Sat Sep 20, 2014 5:37 pm

It's possible that a lot of the defensive value is tied up in Thibodeau. xRAPM is coach-adjusted and Thibs is rated significantly higher on defense than Vogel.

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#15 » by Doctor MJ » Sun Sep 21, 2014 10:13 pm

Gregoire wrote:Could anybody adjust and integrate the RAPM of '94-'97 into Doctor MJ's Normalized & Scaled RAPM Chronology Spreadsheet?


Has to be noted that that earlier data is not actually RAPM. It's at best an approximation based on raw +/-. I'm glad we have it, but I think it might do more harm than good to put them in the same spreadsheet.

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#16 » by Doctor MJ » Sun Sep 21, 2014 10:17 pm

dice wrote:simple analysis that shows single year RAPM is to be taken with a super-sized grain of salt:

the pacers and bulls were 1/2 in defense last season. however, when you sum individual DRAPM*possessions for each team, the pacers total dwarfs the bulls total


The reason I find all this so meaningful is that we have quite a few years of data. That combined with using the PI version of RAPM means we're getting this data's noise damped down quite a bit. So if a player has several years up in a certain stratosphere, it's almost certainly legit.

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#17 » by dice » Mon Sep 22, 2014 1:01 am

Doctor MJ wrote:
dice wrote:simple analysis that shows single year RAPM is to be taken with a super-sized grain of salt:

the pacers and bulls were 1/2 in defense last season. however, when you sum individual DRAPM*possessions for each team, the pacers total dwarfs the bulls total


The reason I find all this so meaningful is that we have quite a few years of data. That combined with using the PI version of RAPM means we're getting this data's noise damped down quite a bit. So if a player has several years up in a certain stratosphere, it's almost certainly legit.

how exactly does PI work? does it mean that, for example, the current year's data is modified by previous data? what if i just want a particular year's number as-is, noise and all, untainted by prior seasons?
"'tupelo honey' has always existed. van was the vessel and the earthly vehicle for it" - bob dylan

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#18 » by Doctor MJ » Mon Sep 22, 2014 1:28 am

dice wrote:
Doctor MJ wrote:
dice wrote:simple analysis that shows single year RAPM is to be taken with a super-sized grain of salt:

the pacers and bulls were 1/2 in defense last season. however, when you sum individual DRAPM*possessions for each team, the pacers total dwarfs the bulls total


The reason I find all this so meaningful is that we have quite a few years of data. That combined with using the PI version of RAPM means we're getting this data's noise damped down quite a bit. So if a player has several years up in a certain stratosphere, it's almost certainly legit.

how exactly does PI work? does it mean that, for example, the current year's data is modified by previous data? what if i just want a particular year's number as-is, noise and all, untainted by prior seasons?


It means that instead of every player starting out from 0, they start out with some other number based on "informing" factors. Most typically, it would just be the player's RAPM from the previous season.

If you just want the year's data informed by nothing else you would use the NPI data, which stands for non-prior informed. It's available, and some prefer using it. Frankly if I'm analyzing the current season, I often prefer it as I don't want that bias in there. However, if we're looking over a large swath of years, just on principle it's better to remove the noise to see where a guy typically is.

There's also something even bigger actually. An issue with RAPM is that it essentially penalizes a player for outlier results of small sample size based on them likely being shaped by randomness. But what if the data is legit? The prior-informed version lets the consistency between the years smooth out those bad assumptions.

This gets concrete in a hurry when you compare Duncan to Garnett. There's a recurring theme where Duncan wins by NPI while Garnett wins by PI. As in, in several years in a row, the same thing happens, which means you're basically talking about consistency between seasons.
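
To make the PI vs. NPI distinction concrete, here is a toy one-player sketch in Python. It assumes the usual ridge-style penalty, where an estimate is pulled toward a prior value in proportion to a penalty weight; real RAPM fits all players at once from lineup data, and the penalty of 50 and the ratings here are made up purely for illustration.

```python
def shrunk_estimate(observations, prior=0.0, penalty=50.0):
    """Toy one-player ridge estimate.

    Minimizes sum((y - b) ** 2 for y in observations) + penalty * (b - prior) ** 2,
    whose closed-form solution is (sum(y) + penalty * prior) / (n + penalty).
    """
    n = len(observations)
    return (sum(observations) + penalty * prior) / (n + penalty)


# Hypothetical player: fifty observations, all at +6.0.
this_season = [6.0] * 50

npi = shrunk_estimate(this_season, prior=0.0)  # non-prior informed: pulled toward 0
pi = shrunk_estimate(this_season, prior=5.0)   # prior informed: pulled toward last year's +5.0
```

With the prior at 0 (NPI), the fifty observations at +6.0 get pulled halfway back to 0; with last season's +5.0 as the prior (PI), the same data lands much closer to what the player actually did.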

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#19 » by dice » Mon Sep 22, 2014 1:46 am

Doctor MJ wrote:It means that instead of every player starting out from 0, they start out with some other number based on "informing" factors. Most typically, it would just be the player's RAPM from the previous season.

why not just look at the unadulterated year-by-year career data in NPI form and form one's own conclusions rather than trust the validity of blended one size fits all data?

where do i get NPI data?

The prior-informed version lets the consistency between the years smooth out those bad assumptions.

does it treat all seasons as equivalent regardless of games played? that would be a major flaw

This gets concrete in a hurry when you compare Duncan to Garnett. There's a recurring theme where Duncan wins by NPI while Garnett wins by PI.

how is that possible? obviously one guy could win the occasional season NPI with the other guy winning PI every year due to smoothing, but i don't understand how a player could be consistently winning one and losing the other
"'tupelo honey' has always existed. van was the vessel and the earthly vehicle for it" - bob dylan

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#20 » by colts18 » Mon Sep 22, 2014 1:50 am

dice wrote:where do i get NPI data?

http://shutupandjam.net/nba-ncaa-stats/npi-rapm/

how is that possible? obviously one guy could win the occasional season NPI with the other guy winning PI every year due to smoothing, but i don't understand how a player could be consistently winning one and losing the other


It's due to a flaw in the PI RAPM stats that Dr. MJ uses. The PI RAPM he uses had KG ahead of Duncan because a few of those years were missing like 30% of the data. In the 100% complete NPI RAPM data, Duncan finishes ahead of KG, so he would have likely finished ahead of him in the PI RAPM too.
