Normalized & Scaled RAPM Chronology Spreadsheet

Moderator: Doctor MJ

ceiling raiser
Lead Assistant
Posts: 4,501
And1: 3,728
Joined: Jan 27, 2013

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#41 » by ceiling raiser » Wed Oct 8, 2014 4:57 pm

lorak wrote:^
Before any updates we should discuss the methodology Doc used. For example, is using "0" instead of the real sample mean the right thing to do? If so, what about also adjusting the "true standard deviation"? I would also like to see how "scaled RAPM" is calculated.

There are also some issues with the source data Doc used. Obviously the 1998 sample wasn't complete, but basically every year done by Engelmann also has some small noise (some players are listed twice, and some listed players didn't play in the given year). I actually cleaned all that data, but before I make a "normalized" update I would like to know the answers to the questions listed above.

1) I don't know if we have possession data for all seasons.
2) Some years are incomplete, from my understanding. colts (and I believe J.E. himself on the APBRmetrics board) noted that the 01 play-by-play is incomplete. I'm not sure if 02 is as well, but the prior-informed RAPM would have an incomplete prior if some of the data from 01 is missing.
SideshowBob
General Manager
Posts: 9,056
And1: 6,253
Joined: Jul 16, 2010
Location: Washington DC
 

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#42 » by SideshowBob » Wed Oct 8, 2014 7:19 pm

Yes, if I recall correctly, he said the majority of the playoff data was missing for 2001.
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#43 » by Doctor MJ » Mon Oct 13, 2014 7:44 am

So to respond:

1. I'll certainly update the spreadsheets, but I'm a bit busy right now with other things.

2. If in the meantime someone else makes an improved version and that becomes the new standard, that's cool.

3. lorak and I discussed the normalization process, and where I left off was feeling that one way to improve it would be a more sophisticated process for measuring variance than my approach of just applying SD to the set of values. Weighting by minutes played could be done with different methods and would certainly move us in the right direction.

4. While we're at it, it's worth looking into what's truly the best data set to use as our "standard standard deviation". Currently I'm using Ilardi's 6-year APM data, with the rationale that we want it to be APM not RAPM, and that more years means less noise. To be honest though, it would seem more fair to use 1-year studies given that the spreadsheet is based on 1-year RAPMs. I could see an approach where we just take a half dozen (or more) years where we're confident in the data used for APM, find the SD for each, and then average them.
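
To make points 3 and 4 concrete, here is a minimal Python sketch. It is not what the spreadsheet currently does. The function names, the choice to weight squared deviations by minutes, and the sample numbers in the usage lines are all assumptions made for illustration.

```python
import math
import statistics


def weighted_sd(values, minutes, center=None):
    """Minutes-weighted standard deviation of RAPM values.

    center=None measures spread around the minutes-weighted mean;
    passing center=0.0 measures spread around zero instead.
    """
    total = sum(minutes)
    if center is None:
        center = sum(m * v for v, m in zip(values, minutes)) / total
    var = sum(m * (v - center) ** 2 for v, m in zip(values, minutes)) / total
    return math.sqrt(var)


def mean_of_yearly_sds(values_by_year):
    """Average the (unweighted, population) SD of several 1-year studies."""
    sds = [statistics.pstdev(vals) for vals in values_by_year.values()]
    return sum(sds) / len(sds)


# Hypothetical numbers, only to show the calls.
print(weighted_sd([0.0, 10.0], [3000, 1000]))
print(mean_of_yearly_sds({2003: [-1.0, 1.0], 2004: [-2.0, 2.0]}))
```

Weighting by minutes is only one option; weighting by possessions would work the same way if possession counts are available for every season.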
Getting ready for the RealGM 100 on the PC Board

Come join the WNBA Board if you're a fan!
lorak
Head Coach
Posts: 6,317
And1: 2,231
Joined: Nov 23, 2009

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#44 » by lorak » Sun Oct 19, 2014 7:30 am

Doc,
I still can't replicate your numbers. For example using your equation from the other thread (Scaled RAPM = Raw RAPM / Study's SD * APM SD) for 2005 Manu I've got 11.4 scaled RAPM, while your spreadsheet says 9.78.

Manu's raw RAPM in 2005 = 6.4
study's SD in 2005 = 1.95
Ilardi's 6 years APM study SD = 3.46

Which of these values are different in your spreadsheet?
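
For anyone who wants to check the arithmetic, here is a short Python sketch of the equation quoted above. The function name `scaled_rapm` is just an illustrative label; the three inputs are the values listed above. The last two lines back out which single input would have to change to land on 9.78, assuming the other two are held fixed.

```python
def scaled_rapm(raw_rapm, study_sd, apm_sd):
    """Scaled RAPM = Raw RAPM / Study's SD * APM SD (equation as quoted)."""
    return raw_rapm / study_sd * apm_sd


manu_2005 = scaled_rapm(6.4, 1.95, 3.46)
print(round(manu_2005, 1))  # 11.4

# Which single input would reproduce the spreadsheet's 9.78?
implied_study_sd = 6.4 * 3.46 / 9.78   # if raw RAPM and APM SD are right
implied_apm_sd = 9.78 * 1.95 / 6.4     # if raw RAPM and study SD are right
print(round(implied_study_sd, 2), round(implied_apm_sd, 2))
```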
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#45 » by Doctor MJ » Wed Oct 22, 2014 12:07 am

lorak wrote:Doc,
I still can't replicate your numbers. For example using your equation from the other thread (Scaled RAPM = Raw RAPM / Study's SD * APM SD) for 2005 Manu I've got 11.4 scaled RAPM, while your spreadsheet says 9.78.

Manu's raw RAPM in 2005 = 6.4
study's SD in 2005 = 1.95
Ilardi's 6 years APM study SD = 3.46

Which of these values are different in your spreadsheet?


Ugh. Okay so 2 things:

1) You're only using the first sample, which is of guys with big enough minutes. Keep scrolling down to get the rest.

2) However, the file I was using doesn't have everything on the sheet I just Googled for that was Ilardi's 6 seasons. I don't know why. And if I use the entire data set, I get a different SD from before.

So most glaringly: We once again have an issue where I'm missing some of the data, and it's affecting my SD. That's a real problem.
Reservoirdawgs
Starter
Posts: 2,013
And1: 965
Joined: Dec 21, 2004
Location: Stuck in the middle with you.
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#46 » by Reservoirdawgs » Wed Dec 9, 2015 7:46 pm

Doctor MJ wrote:Upon request I've put the normalized RAPM data I made using Engelmann's and acrossthecourt's data (thanks also colts18 and whoever else I'm forgetting). It is here:

https://docs.google.com/spreadsheet/ccc?key=0Am56xyn_resAdHV2M185TGhoazBTcm5WMFV0eXZqaFE#gid=0


Doc, this has been really helpful, and I go back and look at this every so often when I think of a random player. Do you have any idea when this will be updated with more recent numbers?
So when is this plane going down? I'll ride it til' it hits the ground!
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#47 » by Doctor MJ » Fri Dec 11, 2015 7:12 am

Reservoirdawgs wrote:
Doctor MJ wrote:Upon request I've put the normalized RAPM data I made using Engelmann's and acrossthecourt's data (thanks also colts18 and whoever else I'm forgetting). It is here:

https://docs.google.com/spreadsheet/ccc?key=0Am56xyn_resAdHV2M185TGhoazBTcm5WMFV0eXZqaFE#gid=0


Doc, this has been really helpful, and I go back and look at this every so often when I think of a random player. Do you have any idea when this will be updated with more recent numbers?


I really lost faith in Engelmann's new numbers so I don't really expect to update my spreadsheet with new things from him. What I'd actually like to do is run my own numbers. So yeah, nothing immediate, but it is on my radar.
Reservoirdawgs
Starter
Posts: 2,013
And1: 965
Joined: Dec 21, 2004
Location: Stuck in the middle with you.
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#48 » by Reservoirdawgs » Fri Dec 11, 2015 3:44 pm

Doctor MJ wrote:
I really lost faith in Engelmann's new numbers so I don't really expect to update my spreadsheet with new things from him. What I'd actually like to do is run my own numbers. So yeah, nothing immediate, but it is on my radar.


That's too bad. What about his new numbers do you not like? The fact that it is so box score-heavy?
dontcalltimeout
Senior
Posts: 508
And1: 547
Joined: Nov 21, 2013
Location: city of the big shoulders
 

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#49 » by dontcalltimeout » Fri Dec 11, 2015 6:34 pm

Reservoirdawgs wrote:
Doctor MJ wrote:
I really lost faith in Engelmann's new numbers so I don't really expect to update my spreadsheet with new things from him. What I'd actually like to do is run my own numbers. So yeah, nothing immediate, but it is on my radar.


That's too bad. What about his new numbers do you not like? The fact that it is so box score-heavy?


Well JE did release some vanilla RAPM numbers from 2002 - 2015, but he's no longer doing daisy-chain prior informed. Rather, it's "Multiyear RAPM with lower weight to older seasons".

Data is here: https://www.dropbox.com/sh/teutg7zvxudqnlw/AAAUkNkDUG0KWeewPZbnwS2ja?dl=0
Thread is here: http://www.apbr.org/metrics/viewtopic.php?f=2&t=8964&start=30

But I'm not sure why he prefers this method over the old PI. If anyone can illuminate, it'd be appreciated.
Nitro1118
Senior
Posts: 560
And1: 202
Joined: Dec 12, 2010

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#50 » by Nitro1118 » Fri Dec 11, 2015 11:53 pm

Doctor MJ, if you would find time to answer a couple of questions about RAPM, it would be greatly appreciated...

1) You seem to be a huge believer in weighing RAPM data when it comes to player analysis. However, since it essentially gauges a player's impact on your average team, isn't it very problematic that it only strips data from playing on a 15-man roster for any given season, when there are thousands of different player combinations in the entire league? Sure, it can make predictions, but how reliable can such a prediction be when it uses probably less than 1% of data from the entire theoretical pie? And that is not accounting for coaching variability, and even stupid things like how a player reacts to a certain city, how good the training staff is, etc...

2) What degree of importance do you place on impact stats? For instance, look at a player like 2013-current LeBron. In those seasons, he's had a tendency of coasting on both sides of the ball until either his team falls into a hole, or coasting an entire game if his entire team is involved in a blowout situation. So, this can shave a few points off his impact stats compared to other stars. So, let's say he and Chris Paul are both a +5 for the season, but Paul plays at consistent high effort while LeBron alternates between coasting and high...just looking at the RAPM, they are equals. However, LeBron's peak impact is higher and he has a tendency to "flip that switch" when needed, so do you factor in that context when reviewing these stats? In that example, LeBron should clearly be the victor, since his peak impact is higher.

On a broader note, the point differential thing is why I feel those Miami teams get underrated in all-time arguments. Their coasting mentality often killed their point differential, but when they needed to flip the switch, they played much better than their +/- differential indicates.

3) How big a +/- differential between 2 players is enough of a gap to sway your opinion in comparing Player A to Player B? For instance, let's say Player A appears to clearly have a lead in major box score stats and a good chunk of advanced stats, and the eye test also tells you he is likely the superior player...but his RAPM is 2.5 worse than Player B's. Is that enough of a gap to sway your opinion? Keep in mind, a middle of the road playoff team like Toronto has a +4 point differential, so 2-3pts doesn't necessarily dictate a W or L.


Basically, I want to be more of a believer in +/- stats, but I just have huge issues with stats that try to account for variables that are simply impossible to account for with the concrete data we currently have. I look at it as just another tool in the toolbox, but right now don't even really value it any more than box score stats, PER, etc... Because at least those stats are not predictive in nature, and I can simply use the eye test/situational stats if possible to place those stats within context. I have much greater issue in doing so with RAPM. However, with a guy as smart as you putting so much faith in RAPM, I really want to "get it."

Thanks in advance!
blabla
Sophomore
Posts: 156
And1: 76
Joined: May 23, 2012

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#51 » by blabla » Sat Dec 12, 2015 11:03 am

dontcalltimeout wrote:But I'm not sure why he prefers this method over the old PI. If anyone can illuminate, it'd be appreciated.

Doing multiyear with lower weight to older season is more accurate than daisy-chaining.
Imagine a player X playing every minute of his first season together with LeBron, then never again. Daisy-chained, he'd get the same value as LeBron (up to that point), even if the team plays great in year 2, after X left the team (or stopped playing). LeBron's value might rise once X leaves, but X's value stays high because that's just what we "knew" up until he stopped playing.

With multiple years in one regression, the regression can actually recognize that the team played a lot better after he left, and can go ahead and lower his value.
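
Here is a toy Python sketch of that scenario. It is not J.E.'s actual code: the two-player setup, the stint margins (+8 with both on the floor in season 1, +10 with LeBron alone in season 2), the ridge penalty `lam = 1.0`, and the 0.5 weight on the older season are all made-up assumptions. Both approaches are written as the same small ridge regression that shrinks toward a prior.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def ridge_fit(rows, lam, prior):
    """Weighted ridge regression for two players (LeBron, X).

    rows: list of (weight, (lebron_on, x_on), margin).
    Minimizes sum(w * (margin - fit)^2) + lam * |coefs - prior|^2.
    """
    a = [[lam, 0.0], [0.0, lam]]
    b = [lam * prior[0], lam * prior[1]]
    for w, on, margin in rows:
        for i in range(2):
            b[i] += w * on[i] * margin
            for j in range(2):
                a[i][j] += w * on[i] * on[j]
    return solve_2x2(a, b)


lam = 1.0
season1 = [(1.0, (1, 1), 8.0)] * 10   # LeBron and X always together
season2 = [(1.0, (1, 0), 10.0)] * 10  # LeBron alone, team even better

# Daisy chain: season 1 result becomes the prior for a season 2-only fit.
after_s1 = ridge_fit(season1, lam, [0.0, 0.0])
daisy = ridge_fit(season2, lam, after_s1)

# Multiyear: one regression over both seasons, older season down-weighted.
older = [(0.5, on, margin) for _, on, margin in season1]
multi = ridge_fit(older + season2, lam, [0.0, 0.0])

print("daisy-chained X:", daisy[1])
print("multiyear X:    ", multi[1])
```

In the daisy-chained fit X has no season 2 rows, so his estimate just sits at the prior he carried in. In the multiyear fit the season 2 rows tell the regression LeBron can carry the margin alone, which pulls X's number down.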
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#52 » by Doctor MJ » Sat Dec 12, 2015 10:21 pm

Reservoirdawgs wrote:
Doctor MJ wrote:
I really lost faith in Engelmann's new numbers so I don't really expect to update my spreadsheet with new things from him. What I'd actually like to do is run my own numbers. So yeah, nothing immediate, but it is on my radar.


That's too bad. What about his new numbers do you not like? The fact that it is so box score-heavy?


It's a few things:

1) My issue with adding box score into a regression stat in general is that I already have access to all the box score info I want and I already make use of it. Same principle behind why many prefer to get salsa, or hot sauce, or salad dressing on the side rather than put into the dish. If it's on the side, I can mix it how I see fit rather than someone else's tastes.

2) While the analogy to food sheds light, it's more problematic in this case because I cannot immediately "taste" the algorithm and tell if it's right. So every time the stats available change, it takes quite a bit of time before I can really use them as confidently as the old stuff. That's an issue even when the old stuff is still available but...

3) Engelmann effectively killed off the old tool I was using, which meant that his new stat made my analysis worse. While I understand if that seems bitchy of me because I only had the old stat because of him, it was one of many things he did at the time that made clear he didn't really understand how to do more than code & run his algorithms. Examples:

-He didn't seem to understand the complaint.
-He didn't understand at all that the entire purpose of a stat is its use for analysis.
-His issues understanding communication of his data were worse than I'd have believed possible. He didn't simply come up with a new +/-/box score stat and label it RAPM, he also made a box score-based stat merely estimating +/- data for the '90s and put it on his site side by side with the +/- based data and called that RAPM too.

After all this, now even when he says "Oh yeah, this new thing is the old thing you liked.", I don't really have any faith in him. He isn't detail-oriented with things he doesn't understand the value of, and unfortunately it's clear that that includes his original RAPM stat.

I don't like being so scathing about the guy. To the extent he's aware of me at all - we've interacted some, but I feel like he wouldn't remember - he's probably annoyed that I'm whining about something he's providing for free, and he'd have every right to feel that way. The thing is though that when he started making his stats, others stopped making theirs. They didn't do it purely because of him, but there definitely seemed a sense from other stat-makers now consulting for NBA teams that someone else had that essentially covered for the public. As a result, when Engelmann switched up his thing, I was actually in worse shape than I'd been before he arrived on the scene.

Alright, enough of me ranting.
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#53 » by Doctor MJ » Sun Dec 13, 2015 12:55 am

Nitro1118 wrote:Doctor MJ, if you would find time to answer a couple of questions about RAPM, it would be greatly appreciated...

1) You seem to be a huge believer in weighing RAPM data when it comes to player analysis. However, since it essentially gauges a player's impact on your average team, isn't it very problematic that it only strips data from playing on a 15-man roster for any given season, when there are thousands of different player combinations in the entire league? Sure, it can make predictions, but how reliable can such a prediction be when it uses probably less than 1% of data from the entire theoretical pie? And that is not accounting for coaching variability, and even stupid things like how a player reacts to a certain city, how good the training staff is, etc...

2) What degree of importance do you place on impact stats? For instance, look at a player like 2013-current LeBron. In those seasons, he's had a tendency of coasting on both sides of the ball until either his team falls into a hole, or coasting an entire game if his entire team is involved in a blowout situation. So, this can shave a few points off his impact stats compared to other stars. So, let's say he and Chris Paul are both a +5 for the season, but Paul plays at consistent high effort while LeBron alternates between coasting and high...just looking at the RAPM, they are equals. However, LeBron's peak impact is higher and he has a tendency to "flip that switch" when needed, so do you factor in that context when reviewing these stats? In that example, LeBron should clearly be the victor, since his peak impact is higher.

On a broader note, the point differential thing is why I feel those Miami teams get underrated in all-time arguments. Their coasting mentality often killed their point differential, but when they needed to flip the switch, they played much better than their +/- differential indicates.

3) How big a +/- differential between 2 players is enough of a gap to sway your opinion in comparing Player A to Player B? For instance, let's say Player A appears to clearly have a lead in major box score stats and a good chunk of advanced stats, and the eye test also tells you he is likely the superior player...but his RAPM is 2.5 worse than Player B's. Is that enough of a gap to sway your opinion? Keep in mind, a middle of the road playoff team like Toronto has a +4 point differential, so 2-3pts doesn't necessarily dictate a W or L.


Basically, I want to be more of a believer in +/- stats, but I just have huge issues with stats that try to account for variables that are simply impossible to account for with the concrete data we currently have. I look at it as just another tool in the toolbox, but right now don't even really value it any more than box score stats, PER, etc... Because at least those stats are not predictive in nature, and I can simply use the eye test/situational stats if possible to place those stats within context. I have much greater issue in doing so with RAPM. However, with a guy as smart as you putting so much faith in RAPM, I really want to "get it."

Thanks in advance!


Hi Nitro, always happy to help:

1) Sounds like what you're bothered by is what some of us call the goodness problem, or the distinction between value & goodness. The reality is that +/- stats only ever tell us about what a guy can do in the context he's in, and that's far from the only context in the basketball world. They thus shed much more light on value rather than goodness.

What I tend to point out to people is that the same is true for all stats. A player's ability to score in one context will be different from another context, and hence the question of how truly good a player is in scoring is a deeper problem than that. This isn't to say I don't use box score stats of course, but rather that when you really think about it, +/- doesn't have new problems that box score stats don't have. It has basically the same issues, but has some worse than the box score and some better. In the end then, it just makes the most sense to use everything you've got to help you understand what's going on.

2) How important are impact stats to me? It's a great question, but not one that can be answered as well as I'd like. People tend to see me as someone who advocates that impact stats are the holy grail, and then call me a hypocrite when I respond to criticisms of an impact stat by talking about context beyond the stat, when the reality is that all that stuff is going on in my head whenever I'm doing my actual analysis - and that's my priority: RealGM and other places are fun, motivational, and educational, but it all means very little if things don't make sense to me when I'm looking at what's going on by myself. I don't know if that makes sense, but my personality is that when I say things don't "feel right", I don't just mean it as a synonym for not understanding something, I'm saying they actually make me uncomfortable until I can make sense of them. Perhaps everyone feels like that, but I tend to keep looking at a thing until I no longer feel intellectual vertigo at the sight.

Okay so that probably sounds pretentious as hell. Let me try again: I made this spreadsheet, because being able to see all this data to me tells me more about the value a guy actually contributed over time than any other spreadsheet I could think to make. In this sense, impact stats are the paramount stat in my arsenal. They remain however a first-pass thing. If someone asks me what I think about who had a better career between two players of the era, I'll absolutely look at my spreadsheet to compare. But I also already know who those players are as players if they spent any significant amount of time in the league so I'm reviewing year by year their history as I'm reviewing year by year the +/- data. To the extent I cannot really remember their history, then I'll pop over to b-r and refresh my memory, but that typically only happens when we really get into the nitty gritty.

Last note on this: If it's not clear, part of what I'm talking about when talking about a player's history is knowing what they were doing on the court, how they did it, how often they did it, and who they did it with. Nick Collison putting up some huge RAPM numbers absolutely made me think highly of him, and made me believe he was underrated by most. It also made me wonder if maybe he should have had a bigger role on the Thunder. But it never made me think "Collison is the true star of the team." A player like that can have the advantage of being played only when he supplies a lineup advantage, and when he's really working to that effect, he's often something of a stealth impactor succeeding in part because the opposing team doesn't realize how much he's burning them. This is a far cry from being able to provide big impact for your team for as many minutes as possible despite the opponent doing all they can to stop it.

And so in this sense, the box score trumps the +/-. While it's quite possible for a Collison to have more impact than some in a starring role who aren't really working cohesively with their team, he's not seriously in the same ballpark as players with heavy primacy putting +/- numbers that appear to be in the same ballpark as his.

Re: flipping the switch. That's a concern definitely, but I think you planted the seeds to the solution: You have to have a sense already for when a player might be in something less than a killer mode in order to account for it, but that's not so hard. Not saying I catch it every time, just saying, if we're seriously having a discussion about a guy's +/- history, we can figure such things out typically.

With that said, as with all other applications of "flipping the switch", there's a serious danger of giving too much power to the idea...and I really think I actually did that when analyzing LeBron's Heat. When they were on their huge winning streak in '12-13, I started having discussions with people about whether this was truly an all-time great team despite the fact the stats in general didn't say so. In the end as sports fans, what's tough is that we'll never truly know. I can point to the fact that those Heatles really never looked ultra-dominant in any playoffs to explain why in retrospect I don't really think the team was vastly superior to its stats, and why I now see them as not really any kind of GOAT candidate, but then others can talk about injuries etc too.

To end with this, I'd just point out that I assumed I'd see a pretty major "flipping the switch" issue with Shaq during his lazy years, because to me he is THE flip-switch guy in my basketball watching life. Turns out, not so much. His piss poor attitude undoubtedly had a major effect on morale, but on the court, the dude had major impact basically always, and the same frankly has been true of LeBron.

3) How big of a +/- gap is big enough to sway the comparison? Again a great question, but I'd be lying if I had a specific number in my head. It's just too fluid for that. Let me go through Real Plus Minus from last year though just to shed some light. Of course I was paying attention all throughout the season, so it's not like I just looked at these numbers at the end and went from there - my qualitative sense for these things comes first.

So, the RPM leaderboard last year as sorted by the "WINS" column that factors in how much guys actually played:

1. Harden
2. Curry
3. LeBron
4. Paul
5. Davis
6. Green
7. Westbrook
8. Kawhi
9. Middleton
10. DeAndre

My top 5 and 5 honorable mentions in alpha order determined after the Finals:

1. Curry
2. LeBron
3. Harden
4. Paul
5. Davis
HM: Marc Gasol
HM: Green
HM: Kawhi
HM: Wall
HM: Westbrook

So you can see a big similarity indicating the importance of the stat. Know that "+/- stats", and "Real Plus Minus" definitely weren't synonyms there. I looked at other +/- stats too.

Focusing on differences:

1. Curry beat Harden pretty easily in my book. Partly it was because he played less than Harden not because he couldn't play more, but because his team didn't need him to. The much bigger deal though to me was that Curry was helping a much better team, and I felt the style he was playing scaled far better with a great team than Harden's style did. I'm actually a big Harden fan, and I think he could re-pattern his game to be better suited at leading a contender, but that change is far from trivial, and even with its successful transition, I don't think it'd work as well as Curry's set of strengths.

2. LeBron beat Harden - not easily, but also not with any kind of great soul searching. With a guy like LeBron who has already proven he can have insane impact, the question is less about his specific numbers, and more about him giving his team what was needed. Taken over the entire year, Harden had more impact in the sense that he "lifted" his team more, but LeBron did what he needed to, and when he needed to do more than Harden, I felt like he did it.

3. Incidentally, Harden vs Paul was a very tough one for me and one I still wonder if I was too hard on Paul about. I don't like admitting that a little bit of luck may have swayed me, but I think I'd have had Paul at #3 had the Clippers won the Rockets series. I really have more faith in Paul playing his style than Harden playing the style he's played in Houston, but when Harden contributes more lift through the year, and his team wins out in the playoffs, to me that's when I need to start saying "Yeah, most times, Paul's team goes farther...but they failed to do so this time, and there are consequences for this."

4. As mentioned, I prefer not to rank 6-10, so while Westbrook looks like he dropped based on where I put him, he didn't necessarily...I was however very critical of him last year. To me he played a style that really doesn't lead to a good basketball team, and if you look at RAPM that's not shaped by box score like RPM is, I think you see that. (Apologies, I don't remember what the link was for '14-15 RAPM data; suffice it to say, it was much more critical of him in general and on defense specifically.)

With that said, being able to lead a bad team to be decent is still a major accomplishment by most standards, and I didn't feel like there were 10 guys more impressive than Westbrook last year.

5. Middleton missed my top 10, but I have a ton of respect for him. He's of this new breed with Kawhi and Green that in the past would have been mistaken for being far more limited than they actually are. But still, playing that role while helping lead elite teams is considerably more impressive than doing it on a team that isn't really accomplishing that much. Yes the Bucks had a great defense last year, and Middleton was a big part of that, but some of that was stealth success. Teams weren't going into Milwaukee desperately trying to figure out how they could win. There were many ways to skin that particular cat, and hence while Middleton was impressive, he wasn't quite as impressive as the +/- stats would indicate.

6. DeAndre even more so than Westbrook is someone I was skeptical of for box score issue reasons. Defensive rebounds & blocks gained without smart play can actually be a really bad sign. Real Plus Minus factors in those stats directly, and thus DeAndre looks pretty solid on defense, but when you actually look at regression data specifically on the Four Factors of defense, you see it's a problem.

7. As for Gasol & Wall, I tend to find it pretty hard to single guys out toward the bottom of the Top 10, so I could have gone with other guys here, but I think those guys are quite good.
Gregoire
Analyst
Posts: 3,320
And1: 547
Joined: Jul 29, 2012

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#54 » by Gregoire » Sat Oct 24, 2020 5:30 pm

Doctor MJ wrote:


Doc, do you have RAPM numbers 2012-2020?
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#55 » by Doctor MJ » Sun Oct 25, 2020 12:55 am

Gregoire wrote:
Doctor MJ wrote:


Doc, do you have RAPM numbers 2012-2020?


Nope. Haven't been keeping up.
yeisthegoat
Ballboy
Posts: 3
And1: 2
Joined: Dec 27, 2020
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#56 » by yeisthegoat » Sun Dec 27, 2020 7:06 am

Doctor MJ wrote:-"Normalized" means applied standard deviation to it to adjust for yearly difference for every time the algorithm was ran.
Doc


Hello! This is my first post here and created this account just to reply to this post, so maybe I haven't read a prior post about it, but instead of "[applying] standard deviation" why don't you just do a Z-Score Test?
Doctor MJ
Senior Mod
Senior Mod
Posts: 50,756
And1: 19,459
Joined: Mar 10, 2005
Location: Cali
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#57 » by Doctor MJ » Sun Dec 27, 2020 7:22 am

yeisthegoat wrote:
Doctor MJ wrote:-"Normalized" means applied standard deviation to it to adjust for yearly difference for every time the algorithm was ran.
Doc


Hello! This is my first post here and created this account just to reply to this post, so maybe I haven't read a prior post about it, but instead of "[applying] standard deviation" why don't you just do a Z-Score Test?


Hello yeisthegoat and welcome!

It's been a long time since I made the spreadsheet so I don't remember the details, but I think I was doing a Z-score as part of this process, I just wasn't calling it that. Do you see something in what I was doing that specifically deviates from that?

In terms of why I phrased it as I did, the motivation for the spreadsheet was to put things on more apples-to-apples footing. I did what I did because there were specific issues I wanted to address. I made the assumption that all years should have the same variance, so normalized with standard deviation. Beyond that, I believe I tied the scaling to Ilardi's 6-year APM study because APM scaling doesn't distort the scoreboard. A point in APM is a point on the scoreboard. For RAPM this ceases to be the case.

Make sense?
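
For comparison, here is a rough Python sketch of a plain Z-score next to the "normalize, then scale" idea described above. It is a reconstruction from this thread, not the spreadsheet's actual formulas: the function names and the sample values are assumptions, and `center=0.0` reflects the earlier question in this thread about using "0" instead of the real sample mean.

```python
import math


def z_scores(values, center=None):
    """Divide deviations from `center` by their root-mean-square size.

    center=None uses the sample mean (a textbook Z-score);
    center=0.0 treats zero as the league average instead.
    """
    if center is None:
        center = sum(values) / len(values)
    sd = math.sqrt(sum((v - center) ** 2 for v in values) / len(values))
    return [(v - center) / sd for v in values]


def scale(zs, target_sd):
    """Stretch normalized values so their spread matches a reference SD."""
    return [z * target_sd for z in zs]


# Hypothetical one-year RAPM values, only to show the calls.
raw = [4.0, 1.0, 0.0, -1.0, -2.0]
print(z_scores(raw))                          # centered on the sample mean
print(scale(z_scores(raw, center=0.0), 3.46)) # centered on 0, APM-sized units
```

If the study's mean is already close to zero, the two centerings give nearly the same numbers, and dividing by SD is a Z-score in everything but name.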
yeisthegoat
Ballboy
Posts: 3
And1: 2
Joined: Dec 27, 2020
     

Re: Normalized & Scaled RAPM Chronology Spreadsheet 

Post#58 » by yeisthegoat » Sun Dec 27, 2020 8:40 am

Doctor MJ wrote:
yeisthegoat wrote:
Doctor MJ wrote:-"Normalized" means applied standard deviation to it to adjust for yearly difference for every time the algorithm was ran.
Doc


Hello! This is my first post here and created this account just to reply to this post, so maybe I haven't read a prior post about it, but instead of "[applying] standard deviation" why don't you just do a Z-Score Test?


Hello yeisthegoat and welcome!

It's been a long time since I made the spreadsheet so I don't remember the details, but I think I was doing a Z-score as part of this process, I just wasn't calling it that. Do you see something in what I was doing that specifically deviates from that?

In terms of why I phrased it as I did, the motivation for the spreadsheet was to put things on more apples-to-apples footing. I did what I did because there were specific issues I wanted to address. I made the assumption that all years should have the same variance, so normalized with standard deviation. Beyond that, I believe I tied the scaling to Ilardi's 6-year APM study because APM scaling doesn't distort the scoreboard. A point in APM is a point on the scoreboard. For RAPM this ceases to be the case.

Make sense?


Thank you for clearing it up, and great work! If you would like, I could link you an APM spreadsheet that Goldstein used for PIPM post-'97.
