Penalized Regression of WOWY data

Moderators: Clyde Frazier, Doctor MJ, trex_8063, penbeast0, PaulieWal

lessthanjake
Veteran
Posts: 2,975
And1: 2,690
Joined: Apr 13, 2013

Re: Penalized Regression of WOWY data 

Post#121 » by lessthanjake » Fri Aug 4, 2023 3:14 am

One thing I just want to flag for people is that we should be careful to look at the years for people on these charts and think about what it’s encompassing, rather than just eyeballing them and drawing an immediate conclusion, because a lot of these data points are for time periods someone didn’t actually play much in, and that affects how this stuff looks at first glance.

For instance, these charts look a bit artificially good for people who retire before they decline. If you retire, then there’s several five-year windows after your retirement that only include prime years and therefore where your score will still look really good. If you keep playing as you decline, then those subsequent five-year time periods that include post-prime years will look worse than they’d look if you’d retired, since it’s including declined years in there where your impact has declined. This ends up making people who retired early (Magic Johnson being a great example) look at first glance like they were on top of the league for way longer than they were even actually playing. If you just eyeball the charts, it looks like Magic dominated some time periods he hardly played in! So we should keep that sort of thing in mind.

Similarly, a first glance at the charts looks really hard on people who started out slow in terms of impact, even if it’s just for a year or two. Even just one season that’s weak impact-wise can prevent a given five-year time period from looking elite. And the charts go all the way back starting 4 years before someone started in the NBA (i.e. the first time period that has someone’s rookie season right at the end). So if you have a slow (or even just not stellar) rookie season impact-wise, it’ll end up being part of 5 different data points, and for all but one of them it’ll be of outsized importance (since there won’t be 5 actual full seasons played in those time periods). So a slow first season or two can tank a lot of data points, even if someone quickly got better. We see that with LeBron and others. On the other hand, people who started out great from the beginning (again, Magic Johnson being an example, and Larry Bird is another) end up looking great in those timeframes that go all the way back to years before they started playing, because their first year or two that have outsized importance in those early data points are good. Of course, starting out great is a good thing. But my point is that it ends up having an outsized effect at first glance on the graphs, since the early data points go back before someone started playing, so a really good rookie season ends up looking like a really good half-decade.

Magic Johnson is a bit of a perfect storm of both these factors. He was really good even as a rookie and he retired in his prime. And he even had that super brief comeback as a mostly bench player in 1996 that perpetuates things even further (though, to his credit, this is only because he did actually have good impact in that brief comeback). So, a first glance at the charts makes a guy who really only was a meaningful player from 1979-80 to 1990-1991 look like he was dominating the league from 1976-2000.

I think the best way to look at these charts is probably to *mostly* just zero in on the timeframes in which these players were NBA players the entire time.
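
To make the window arithmetic above concrete, here is a minimal sketch in Python. It assumes five-season windows labelled by the year each season ended (so 1980 means 1979-80) and ignores the 1996 comeback. The function name `windows_touching` is made up for illustration; this is not the code behind the charts.

```python
def windows_touching(first_season, last_season, window=5):
    """Return (start, end, seasons_played) for every window that overlaps a career."""
    out = []
    for end in range(first_season, last_season + window):
        start = end - window + 1
        # Seasons actually played inside this window.
        played = min(end, last_season) - max(start, first_season) + 1
        out.append((start, end, played))
    return out

# Magic Johnson's main run, 1979-80 through 1990-91 (assumed labels 1980..1991).
rows = windows_touching(1980, 1991)
full = [r for r in rows if r[2] == 5]      # windows entirely inside the career
partial = [r for r in rows if r[2] < 5]    # windows padded with non-playing years
rookie_windows = [r for r in rows if r[0] <= 1980 <= r[1]]

for start, end, played in rows:
    print(f"{start}-{end}: {played} season(s) played")
```

Under those assumptions, 16 windows touch the 1980-1991 run but only 8 sit fully inside it, and the rookie season lands in 5 windows, 4 of which are partial.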
OhayoKD wrote:Lebron contributes more to all the phases of play than Messi does. And he is of course a defensive anchor unlike messi.
homecourtloss
RealGM
Posts: 11,282
And1: 18,690
Joined: Dec 29, 2012

Re: Penalized Regression of WOWY data 

Post#122 » by homecourtloss » Fri Aug 4, 2023 4:49 am

Moonbeam wrote:
homecourtloss wrote:I feel like we’re getting an exclusive service not available anywhere else for free. Is it possible to run one for Drexler and Terry Porter?

And then one for Ewing, Pippen, Barkley, Reggie Miller, and Payton?


Here you go! Surprised Drexler outpaced Porter that much.

Image

Image


I am, too.

Pippen looking like a monster.
lessthanjake wrote:Kyrie was extremely impactful without LeBron, and basically had zero impact whatsoever if LeBron was on the court.

lessthanjake wrote: By playing in a way that prevents Kyrie from getting much impact, LeBron ensures that controlling for Kyrie has limited effect…
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#123 » by Moonbeam » Fri Aug 4, 2023 6:09 am

Doctor MJ wrote:.


At present, I only have data back to 1952 because of the minutes requirement I'm using. I've thought about modelling minutes based on available box score stats for earlier periods so we can maybe get something for Groza, Feerick, etc.

- Rochester Royals if we can get good numbers at least back to their joining of the BAA. (NBL back to '45-46 would be amazing, but the data is super sparse)
Key players: Bob Davies, Arnie Risen, Bobby Wanzer, Jack Coleman, Arnie Johnson

Image

- Minneapolis Lakers ideally back to their joining of the BAA.
Key players: George Mikan, Jim Pollard, Herm Schaefer, Slater Martin, Vern Mikkelsen, Clyde Lovellette

Note: No Schaefer due to no MP.

Image

- Syracuse Nationals
Key players: Dolph Schayes, Paul Seymour, Red Rocha, Earl Lloyd, George King, Red Kerr

Image

- Philadelphia Warriors
Key players: Paul Arizin, Neil Johnston, Jack George, Tom Gola, Wilt Chamberlain

Image

- Boston Celtics
Key players: Bob Cousy, Ed Macauley, Bill Sharman, Bill Russell, Tom Heinsohn, Frank Ramsey

Image

- Boston Celtics
Key players: Bill Russell, Sam Jones, John Havlicek, KC Jones, Tom Sanders, Bailey Howell

Image

- Boston Celtics
Key players: John Havlicek, Dave Cowens, Jo Jo White, Paul Silas, Don Chaney, Don Nelson

Image

- St. Louis Hawks
Key players: Bob Pettit, Cliff Hagan, Lenny Wilkens, Clyde Lovellette, Zelmo Beaty, Lou Hudson

Image

- Philadelphia 76ers
Key players: Wilt Chamberlain, Hal Greer, Chet Walker, Billy Cunningham, Luke Jackson, Wali Jones

Image

- Los Angeles Lakers
Key players: Elgin Baylor, Jerry West, Dick Barnett, Rudy LaRusso, Wilt Chamberlain, Gail Goodrich

Image

- New York Knicks
Key players: Walt Frazier, Willis Reed, Dave DeBusschere, Dick Barnett, Earl Monroe, Bill Bradley

Image

- Milwaukee Bucks
Key players: Kareem Abdul-Jabbar, Oscar Robertson, Bob Dandridge, Jon McGlocklin, Greg Smith

Image
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#124 » by Moonbeam » Fri Aug 4, 2023 10:46 am

lessthanjake wrote:One thing I just want to flag for people is that we should be careful to look at the years for people on these charts and think about what it’s encompassing, rather than just eyeballing them and drawing an immediate conclusion, because a lot of these data points are for time periods someone didn’t actually play much in, and that affects how this stuff looks at first glance.

For instance, these charts look a bit artificially good for people who retire before they decline. If you retire, then there’s several five-year windows after your retirement that only include prime years and therefore where your score will still look really good. If you keep playing as you decline, then those subsequent five-year time periods that include post-prime years will look worse than they’d look if you’d retired, since it’s including declined years in there where your impact has declined. This ends up making people who retired early (Magic Johnson being a great example) look at first glance like they were on top of the league for way longer than they were even actually playing. If you just eyeball the charts, it looks like Magic dominated some time periods he hardly played in! So we should keep that sort of thing in mind.

Similarly, a first glance at the charts looks really hard on people who started out slow in terms of impact, even if it’s just for a year or two. Even just one season that’s weak impact-wise can prevent a given five-year time period from looking elite. And the charts go all the way back starting 4 years before someone started in the NBA (i.e. the first time period that has someone’s rookie season right at the end). So if you have a slow (or even just not stellar) rookie season impact-wise, it’ll end up being part of 5 different data points, and for all but one of them it’ll be of outsized importance (since there won’t be 5 actual full seasons played in those time periods). So a slow first season or two can tank a lot of data points, even if someone quickly got better. We see that with LeBron and others. On the other hand, people who started out great from the beginning (again, Magic Johnson being an example, and Larry Bird is another) end up looking great in those timeframes that go all the way back to years before they started playing, because their first year or two that have outsized importance in those early data points are good. Of course, starting out great is a good thing. But my point is that it ends up having an outsized effect at first glance on the graphs, since the early data points go back before someone started playing, so a really good rookie season ends up looking like a really good half-decade.

Magic Johnson is a bit of a perfect storm of both these factors. He was really good even as a rookie and he retired in his prime. And he even had that super brief comeback as a mostly bench player in 1996 that perpetuates things even further (though, to his credit, this is only because he did actually have good impact in that brief comeback). So, a first glance at the charts makes a guy who really only was a meaningful player from 1979-80 to 1990-1991 look like he was dominating the league from 1976-2000.

I think the best way to look at these charts is probably to *mostly* just zero in on the timeframes in which these players were NBA players the entire time.


These are fair points. It's a tough balance between getting enough interconnectedness between players to have a reasonable amount of information to inform the models with a larger window size and the lingering effects of big changes (rookie years, final years, switching teams) lasting some time. It's tough to know what the best strategy is. For some 3-year windows I've looked at, some players actually have an NA as a Ridge score due to perfect multicollinearity. Using Minutes Played in a game would get around that a little bit, but there are other drawbacks I'll mention in a separate post.
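
For anyone wondering what perfect multicollinearity looks like in a with/without matrix, here is a small sketch, assuming NumPy and a made-up six-game sample with 0/1 indicators. Players A and B appear in exactly the same games, so their columns are identical. I don't know which software produced the NA mentioned above, so this only shows the underlying rank problem, using a plain closed-form ridge with no intercept.

```python
import numpy as np

# Rows are games, columns are players A, B, C (1 = played, 0 = did not).
# A and B appear in exactly the same games, so their columns are identical.
X = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
], dtype=float)
y = np.array([8.0, 5.0, -2.0, 10.0, -6.0, 4.0])  # made-up margins

# Rank 2 instead of 3: the data cannot tell A and B apart.
rank = np.linalg.matrix_rank(X)

# Closed-form ridge: (X'X + lam*I)^-1 X'y. The penalty makes the system
# solvable, but it can only split the shared credit evenly between A and B.
lam = 1.0
beta = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(rank, beta)
```

The even split is a consequence of the penalty, not information in the data, which is one reason a larger window (more games where only one of the two plays) helps.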
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#125 » by Moonbeam » Fri Aug 4, 2023 11:02 am

I'm looking at potential modifications to see if they may offer improvements. A couple main ones I've thought about for the WOWY matrix:

* Instead of a season-long MPG threshold (e.g. 18 MPG to be included), apply it game by game, so a player who plays >= 18 minutes in a game is counted, but those who play fewer are not
* Instead of 1 or -1, include minutes played
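
The two modifications above could be sketched roughly like this. It is only an illustration: the sign convention (+1 for the team whose margin is the response, -1 for the opponent), the `game` dictionary layout, the minutes shown and the name `wowy_row` are all assumptions, not the actual implementation.

```python
def wowy_row(game, players, min_minutes=18, use_minutes=False):
    """Build one design-matrix row for a single game.

    game: {"home": {name: minutes}, "away": {name: minutes}} (hypothetical layout)
    Entry is 0 if the player missed the per-game minutes threshold,
    otherwise +/-1 (or +/-minutes when use_minutes is True).
    """
    row = []
    for name in players:
        value = 0.0
        for side, sign in (("home", 1.0), ("away", -1.0)):
            minutes = game[side].get(name, 0)
            if minutes >= min_minutes:
                value = sign * (minutes if use_minutes else 1.0)
        row.append(value)
    return row

players = ["Stockton", "Eisley", "Opponent X"]
game = {"home": {"Stockton": 30, "Eisley": 12}, "away": {"Opponent X": 38}}

indicator_row = wowy_row(game, players)                   # game-by-game threshold
minutes_row = wowy_row(game, players, use_minutes=True)   # minutes instead of 1/-1
print(indicator_row, minutes_row)
```

In this made-up game Eisley's 12 minutes fall under the threshold, so he gets a 0 in both versions.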

These are going to present some big challenges though. I've looked into the 1996-97 Utah Jazz as an example. John Stockton, Karl Malone, Antoine Carr, and Howard Eisley played all 102 games for Utah that season (82 regular season and 20 playoff games). Here's how their minutes played in games plots against Utah's margin of victory:

John Stockton: +8.7 on, +7.6 on-off

Image

Correlation between MP and Utah's margin: -0.502

Karl Malone: +11.7 on, +21.9 on-off

Image

Correlation between MP and Utah's margin: -0.558

Antoine Carr: -2.7 on, -14.5 on-off

Image

Correlation between MP and Utah's margin: 0.179

Howard Eisley: +1.0 on, -7.8 on-off

Image

Correlation between MP and Utah's margin: 0.531

So what's happening is that Utah being really good means they have more blowout wins than blowout losses, so Stockton and Malone see their fewest minutes when Utah has its biggest wins. Carr and especially Eisley benefit from this by tending to play more minutes in these blowouts.

A regression model using minutes played would likely think Malone and Stockton are negative impact players because of this, and Eisley is a positive impact player. Their on-off scores tell the opposite story. Setting a per-game minimum of, say, 18 minutes would drop from Eisley's sample the more competitive games in which he played fewer minutes, potentially making this worse.

I could try to detect blowouts through the minute profile of the game and adjust thresholds and minutes accordingly, but it's going to take me awhile to think about.
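
Here is a toy simulation of the blowout effect described above, in plain Python. Everything in it is invented: margins are drawn from a normal distribution centred on +8, and minutes react to the margin (starters sit once the game is out of hand) but have no effect on the margin at all. It is not the 1996-97 Jazz data.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
margins, star_min, bench_min = [], [], []
for _ in range(1000):
    m = rng.gauss(8, 12)                        # good team: more blowout wins than losses
    garbage_time = 0.4 * max(0.0, abs(m) - 10)  # minutes handed to the bench in blowouts
    margins.append(m)
    star_min.append(38 - garbage_time + rng.gauss(0, 2))
    bench_min.append(14 + garbage_time + rng.gauss(0, 2))

star_r = pearson(star_min, margins)    # negative, although the star causes nothing here
bench_r = pearson(bench_min, margins)  # positive, for the same reason
print(round(star_r, 3), round(bench_r, 3))
```

The sign pattern matches the plots: the star's minutes correlate negatively with margin and the bench player's positively, purely because of how minutes respond to the score.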
WestGOAT
Veteran
Posts: 2,594
And1: 3,518
Joined: Dec 20, 2015

Re: Penalized Regression of WOWY data 

Post#126 » by WestGOAT » Fri Aug 4, 2023 12:46 pm

Moonbeam wrote:
Spoiler:
I'm looking at potential modifications to see if they may offer improvements. A couple main ones I've thought about for the WOWY matrix:

* Instead of a MPG for the season threshold (e.g. 18 MPG to be included), have it be a game-by-game thing, so a player who plays >= 18 minutes in a game is counted, but those who play fewer are not
* Instead of 1 or -1, include minutes played

These are going to present some big challenges though. I've looked into the 1996-97 Utah Jazz as an example. John Stockton, Karl Malone, Antoine Carr, and Howard Eisley played all 102 games for Utah that season (82 regular season and 20 playoff games). Here's how their minutes played in games plots against Utah's margin of victory:

John Stockton: +8.7 on, +7.6 on-off

Image

Correlation between MP and Utah's margin: -0.502

Karl Malone: +11.7 on, +21.9 on-off

Image

Correlation between MP and Utah's margin: -0.558

Antoine Carr: -2.7 on, -14.5 on-off

Image

Correlation between MP and Utah's margin: 0.179

Howard Eisley: +1.0 on, -7.8 on-off

Image

Correlation between MP and Utah's margin: 0.531

So what's happening is that Utah being really good means they have more blowout wins than blowout losses, so Stockton and Malone see their fewest minutes when Utah has its biggest wins. Carr and especially Eisley benefit from this by tending to play more minutes in these blowouts.

A regression model using minutes played would likely think Malone and Stockton are negative impact players because of this, and Eisley is a positive impact player. Their on-off scores tell the opposite story. Setting a minimum minute threshold of, say, 18 minutes will carve out the more competitive games where he played fewer minutes from Eisley's sample, potentially making this worse.

I could try to detect blowouts through the minute profile of the game and adjust thresholds and minutes accordingly, but it's going to take me awhile to think about.


Not sure if this will work as you intend, unless you also use the actual point margins that overlapped with the specific minutes played, and then you'd basically be doing something similar to RAPM (but instead of possessions it would be minutes?) right?

The rationale for taking MP into account is to better separate limited-minute role players from big-minute players, right? Why not stick to the original WOWY matrix, but then scale the value you obtain by minutes played/48?

For example, Magic I believe had + points-margin/game (is this the right unit?) of 6? If he played 40 mpg then do 6*(40/48). In the case of Ed Nealy was it 4? If he averaged 15 mpg then 4*(15/48). If that makes sense.

edit:
okay maybe that doesn't make sense :lol: now that I think more about it, if you want points-margin per minute in this case you have to do 6/40 for Magic and for Ed it would be 4/15. oops :lol:
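
Spelling out both versions of the arithmetic, using the guessed figures from this post (6 at 40 mpg, 4 at 15 mpg), which are not actual model outputs. The function names are made up.

```python
def scaled_by_share(margin_per_game, mpg, game_minutes=48):
    """First idea: scale the per-game value by the share of the game played."""
    return margin_per_game * mpg / game_minutes

def per_minute(margin_per_game, mpg):
    """Second idea (the edit): points of margin per minute played."""
    return margin_per_game / mpg

magic_scaled = scaled_by_share(6, 40)   # 6 * 40/48 = 5.0
nealy_scaled = scaled_by_share(4, 15)   # 4 * 15/48 = 1.25
magic_rate = per_minute(6, 40)          # 0.15
nealy_rate = per_minute(4, 15)          # about 0.267
print(magic_scaled, nealy_scaled, magic_rate, nealy_rate)
```

The two versions pull in opposite directions: scaling by share of the game widens the gap in favour of the 40 mpg player, while the per-minute rate puts the 15 mpg player ahead.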
Image
spotted in Bologna
eminence
RealGM
Posts: 16,745
And1: 11,580
Joined: Mar 07, 2015

Re: Penalized Regression of WOWY data 

Post#127 » by eminence » Fri Aug 4, 2023 1:19 pm

JE had similar issues with his 90s 'RAPM' and did his simulated box-score thing, but I really think that's getting too far into the weeds, I like it more as it is currently vs going the estimation within a simulation route.

How severe does the collinearity problem look at 3/4/5 year splits? I imagine below that it's extreme, and above that you're getting into career range.
I bought a boat.
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#128 » by Moonbeam » Fri Aug 4, 2023 1:50 pm

WestGOAT wrote:
Moonbeam wrote:
Spoiler:
I'm looking at potential modifications to see if they may offer improvements. A couple main ones I've thought about for the WOWY matrix:

* Instead of a MPG for the season threshold (e.g. 18 MPG to be included), have it be a game-by-game thing, so a player who plays >= 18 minutes in a game is counted, but those who play fewer are not
* Instead of 1 or -1, include minutes played

These are going to present some big challenges though. I've looked into the 1996-97 Utah Jazz as an example. John Stockton, Karl Malone, Antoine Carr, and Howard Eisley played all 102 games for Utah that season (82 regular season and 20 playoff games). Here's how their minutes played in games plots against Utah's margin of victory:

John Stockton: +8.7 on, +7.6 on-off

Image

Correlation between MP and Utah's margin: -0.502

Karl Malone: +11.7 on, +21.9 on-off

Image

Correlation between MP and Utah's margin: -0.558

Antoine Carr: -2.7 on, -14.5 on-off

Image

Correlation between MP and Utah's margin: 0.179

Howard Eisley: +1.0 on, -7.8 on-off

Image

Correlation between MP and Utah's margin: 0.531

So what's happening is that Utah being really good means they have more blowout wins than blowout losses, so Stockton and Malone see their fewest minutes when Utah has its biggest wins. Carr and especially Eisley benefit from this by tending to play more minutes in these blowouts.

A regression model using minutes played would likely think Malone and Stockton are negative impact players because of this, and Eisley is a positive impact player. Their on-off scores tell the opposite story. Setting a minimum minute threshold of, say, 18 minutes will carve out the more competitive games where he played fewer minutes from Eisley's sample, potentially making this worse.

I could try to detect blowouts through the minute profile of the game and adjust thresholds and minutes accordingly, but it's going to take me awhile to think about.


Not sure if this will work as you intend, unless you also use the actual point margins that overlapped with the specific minutes played, and then you'd basically be doing something similar to RAPM (but instead of possessions it would be minutes?) right?

The rationale for taking MP into account is to better separate limited-minute role players from big-minute players, right? Why not stick to the original WOWY matrix, but then scale the value you obtain by minutes played/48?

For example, Magic I believe had + points-margin/game (is this the right unit?) of 6? If he played 40 mpg then do 6*(40/48). In the case of Ed Nealy was it 4? If he averaged 15 mpg then 4*(15/48). If that makes sense.

edit:
okay maybe that doesn't make sense :lol: now that I think more about it, if you want points-margin per minute in this case you have to do 6/40 for Magic and for Ed it would be 4/15. oops :lol:


Ed Nealy GOAT arc confirmed!

Yeah, it’s going to be tricky coming up with a sensible approach to this. One relatively simple thing to do is to impose a penalty factor that is weighted by MPG (or total minutes) across the sample. That way the 18-20 MPG guys would have harsher penalties than the 40 MPG guys. They would cluster more around 0, which would likely make the higher minute guys spread toward the extremes.
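
A rough sketch of what a minutes-weighted penalty could look like, assuming NumPy, a closed-form ridge with one penalty per player, and penalties inversely proportional to MPG. The scaling rule, the toy data and the name `weighted_ridge` are illustrative choices only.

```python
import numpy as np

def weighted_ridge(X, y, penalties):
    """Ridge with a separate penalty per column:
    minimise |y - Xb|^2 + sum_j penalties[j] * b[j]^2."""
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)

# Toy sample: column 0 is a 40 MPG player, column 1 an 18 MPG player.
X = np.array([[1, 0], [1, 1], [0, 1], [1, 1], [1, 0], [0, 1]], dtype=float)
y = np.array([6.0, 9.0, 2.0, 8.0, 5.0, 3.0])  # made-up margins
mpg = np.array([40.0, 18.0])

base_lambda = 2.0
penalties = base_lambda * (mpg.max() / mpg)   # 2.0 for 40 MPG, about 4.44 for 18 MPG

uniform = weighted_ridge(X, y, np.full(2, base_lambda))
scaled = weighted_ridge(X, y, penalties)
print(uniform, scaled)
```

In this toy case the low-minute player's coefficient moves from 2.375 toward 0 while the high-minute player's moves outward from 3.875, which is the clustering and spreading pattern described above.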
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#129 » by Moonbeam » Fri Aug 4, 2023 1:52 pm

eminence wrote:JE had similar issues with his 90s 'RAPM' and did his simulated box-score thing, but I really think that's getting too far into the weeds, I like it more as it is currently vs going the estimation within a simulation route.

How severe does the collinearity problem look at 3/4/5 year splits? I imagine below that it's extreme, and above that you're getting into career range.


Yeah, there is a sort of beauty in the simplicity of this as it currently stands. I’m still running the 3-year windows and have to see how they compare.
OhayoKD
Lead Assistant
Posts: 5,920
And1: 3,864
Joined: Jun 22, 2022
 

Re: Penalized Regression of WOWY data 

Post#130 » by OhayoKD » Fri Aug 4, 2023 11:32 pm

Doctor MJ wrote:
eminence wrote:
Doctor MJ wrote:So, just wanted to have a post specifically for the 100th percentile surfers. Basically guys who regularly hit that 100th percentile in sustained runs in the 90s and above.

George Mikan
Bill Russell
Wilt Chamberlain
Oscar Robertson
Jerry West
Bill Walton
Larry Bird
Magic Johnson
Michael Jordan
Shaquille O'Neal

Honestly, seems about right. Curious who else is like that when we see more graphs.


My guesses would be Duncan/KG/Dirk/LeBron/CP3/Steph based on the more granular stuff, but who knows.

I would enjoy having some of this stuff in a spreadsheet/table to browse for sure.

Regardless, I do think the onus is finding arguments for the non-100th-percentile guys over the 100th-percentile guys.

Eh...not sure I agree with this.

First off arguments have been made that involve much larger samples and which do not rely on data tied largely to when players happen to miss games:
Spoiler:
Image

(Will circle back to this later)

More importantly, this (just like real RAPM) is not designed to distinguish between 71 or 72 Kareem or 96-98 Jordan. So sorting players into whether they hit the 99th or 100th percentile is kind of missing the point. If you want to compare the highest highs, rapm and rapm approximations are not designed for that as they are curving those highs down.

What matters here is frequency
Image
Kareem is at or above the 90th percentile 12 times, scoring at the top level for nearly a decade. You might also note that he goes down when 72, 77, and 1980 are introduced: "peaks" which replace down-years in terms of on-court results, but where Kareem doesn't miss any time. There is also SRS suppression from 74 onward (you might notice that Jordan, by comparison, is suddenly skyrocketing when SRS for all the top teams goes up, after being well behind the pace for what is conventionally considered his prime).

And yet with all the above, Kareem still is constantly hovering around the top and then adds a bunch of value later.

Yet, applying a very arbitrary filter for one-offs, you've found a way to get him tiered below a shitton of players he looks as good or better than when we do year-by year analysis or focus on concentrated samples of off, and he is very clearly, "by impact" a much more clear cut era #1.

I think the onus is on you to explain why --this-- matters more than all the other arguments/evidence people have made/offered, especially when we're sneaking in MJ, Bird, and Shaq alongside actual (empirical) impact kings like Magic and Russell and more consistent contenders (at least by this metric) like Wilt (who I do not think "seems right" according to your priors).

Also FWIW, I'm not sure putting all your stock in this does all that for Bird because even by the seasonal inputs of a guy who had him higher than Magic, he still fell down to 14th.

We literally have sourced sets for the metric this r-wowy is trying to emulate for Shaq (and Jordan during the years this metric says he peaked) and both fall considerably short of players you've excluded here like Duncan and KG. Shouldn't the onus be on you?
its my last message in this thread, but I just admit, that all the people, casual and analytical minds, more or less have consencus who has the weight of a rubberized duck. And its not JaivLLLL
eminence
RealGM
Posts: 16,745
And1: 11,580
Joined: Mar 07, 2015

Re: Penalized Regression of WOWY data 

Post#131 » by eminence » Sat Aug 5, 2023 12:15 am

^Duncan/KG weren't in the doc yet when Doc made his post.
AEnigma
Assistant Coach
Posts: 4,055
And1: 5,860
Joined: Jul 24, 2022
 

Re: Penalized Regression of WOWY data 

Post#132 » by AEnigma » Sat Aug 5, 2023 12:21 am

Garnett also does not fare especially well here relative to those raw Minnesota WOWY numbers, although again I wonder if that might be a consequence of how on-court results are being weighed (not saying that as a criticism; I imagine you have it set for whatever best correlates to RAPM).
MyUniBroDavis wrote:Some people are clearly far too overreliant on data without context and look at good all in one or impact numbers and get wowed by that rather than looking at how a roster is actually built around a player
OhayoKD
Lead Assistant
Posts: 5,920
And1: 3,864
Joined: Jun 22, 2022
 

Re: Penalized Regression of WOWY data 

Post#133 » by OhayoKD » Sat Aug 5, 2023 12:32 am

AEnigma wrote:Garnett also does not fare especially well here relative to those raw Minnesota WOWY numbers, although again I wonder if that might be a consequence of how on-court results are being weighed (not saying that as a criticism; I imagine you have it set for whatever best correlates to RAPM).

isn't this supposed to be an rapm approximation?

though i guess i'm curious why rapm disagrees so much with moonbeam's method on shaq and kg
OhayoKD
Lead Assistant
Posts: 5,920
And1: 3,864
Joined: Jun 22, 2022
 

Re: Penalized Regression of WOWY data 

Post#134 » by OhayoKD » Sat Aug 5, 2023 12:33 am

eminence wrote:^Duncan/KG weren't in the doc yet when Doc made his post.

fair enough
OhayoKD
Lead Assistant
Posts: 5,920
And1: 3,864
Joined: Jun 22, 2022
 

Re: Penalized Regression of WOWY data 

Post#135 » by OhayoKD » Sat Aug 5, 2023 12:35 am

Moonbeam wrote:

Would it be possible to see a raw data chart like you did with the 80's for the other decades you have done?
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#136 » by Moonbeam » Sat Aug 5, 2023 12:55 am

OhayoKD wrote:
AEnigma wrote:Garnett also does not fare especially well here relative to those raw Minnesota WOWY numbers, although again I wonder if that might be a consequence of how on-court results are being weighed (not saying that as a criticism; I imagine you have it set for whatever best correlates to RAPM).

isn't this supposed to be an rapm approximation?

though i guess i'm curious why rapm disagrees so much with moonbeam's method on shaq and kg


I think what might be happening here is some sort of winning bias. Garnett's teams generally capping out at good but not great might limit his ceiling. This could be offset in some cases if players miss a decent number of games to inform a "without" sample, but KG was an iron man in Minnesota for the most part.

Shaq, on the other hand, had better team success so his "on" baseline would generally be higher as a result. I think this might be part of what we are seeing with those dynasty teams Doc asked for --- there often is a bit of a high cluster for those players when those teams had lots of success.

I haven't done anything particularly special to weight the "on" vs. "off" sample. That it correlates moderately well to Cheema's stuff is pretty good, IMO, as Cheema's stuff has a few key differences from what I've done so far:

1. Cheema's RAPM is prior-informed, so it is not shrinking the coefficients toward 0 like I've done, but some other value that is informed by minutes per game adjusted for team quality. Doing this will sort of put the thumb on the scale in favor of players who play a good number of minutes on good teams.

2. Cheema's RAPM also assigns twice the weight to playoff games in comparison to regular season games, but I have weighted everything the same.

What I've put together at this stage is about as pure as it gets in terms of WOWY regression. I imagine if I similarly apply some sort of prior based on MPG adjusted for team quality and incorporated extra weight for playoff games, the results would correlate more strongly. Cheema has said that introducing the prior improves predictive performance, so it's certainly something worth considering, but I'd have to re-code everything as I've done mine using a frequentist approach instead of a Bayesian one.
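
To illustrate the two differences listed above, here is a sketch of a ridge that shrinks toward a prior instead of toward 0 and gives playoff games double weight. It assumes NumPy and a closed-form solution; the prior values, the toy data and the name `ridge_with_prior` are placeholders, and nothing here is taken from Cheema's actual model or from the frequentist code used for the charts.

```python
import numpy as np

def ridge_with_prior(X, y, lam, prior, weights):
    """Minimise sum_i w_i * (y_i - x_i.b)^2 + lam * |b - prior|^2.

    Closed form: b = (X'WX + lam*I)^-1 (X'Wy + lam*prior).
    With prior = 0 and all weights = 1 this is ordinary ridge.
    """
    W = np.diag(weights)
    A = X.T @ W @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ W @ y + lam * prior)

X = np.array([[1, 0], [1, 1], [0, 1], [1, 1], [1, 0], [0, 1]], dtype=float)
y = np.array([6.0, 9.0, 2.0, 8.0, 5.0, 3.0])   # made-up margins
is_playoff = np.array([0, 0, 0, 0, 1, 1])
weights = 1.0 + is_playoff                     # playoff games count twice
prior = np.array([2.0, 0.5])                   # hypothetical prior values

plain = ridge_with_prior(X, y, 2.0, np.zeros(2), np.ones(6))   # shrinks toward 0
informed = ridge_with_prior(X, y, 2.0, prior, weights)         # shrinks toward prior
heavy = ridge_with_prior(X, y, 1e9, prior, weights)            # huge penalty
print(plain, informed, heavy)
```

With a very large penalty the estimates collapse onto the prior rather than onto 0, which is the "thumb on the scale" effect: whatever goes into the prior is what low-information players default to.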

The next thing I'm looking to do is compare predictive performance using withheld game data for these pure versions vs. some modifications I'm thinking up.
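
A comparison on withheld games could be set up along these lines. This is a sketch on synthetic data, assuming NumPy, a plain closed-form ridge, a simple every-fifth-game holdout and RMSE as the score; the real comparison may well use a different split or metric.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge without an intercept."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def holdout_rmse(X, y, lam, test_idx):
    """Fit on the games not in test_idx, score RMSE on the withheld games."""
    mask = np.zeros(len(y), dtype=bool)
    mask[test_idx] = True
    beta = ridge_fit(X[~mask], y[~mask], lam)
    resid = y[mask] - X[mask] @ beta
    return float(np.sqrt(np.mean(resid ** 2)))

# Synthetic stand-in for a with/without matrix: 200 games, 8 players.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 8)).astype(float)
true_beta = rng.normal(0, 3, size=8)
y = X @ true_beta + rng.normal(0, 10, size=200)

test_idx = np.arange(0, 200, 5)   # withhold every fifth game
scores = {lam: holdout_rmse(X, y, lam, test_idx) for lam in (0.1, 1.0, 10.0, 100.0)}
for lam, rmse in scores.items():
    print(lam, round(rmse, 2))
```

The same `holdout_rmse` call could be pointed at a modified model (different penalties, a prior, playoff weights) to compare it against the pure version on identical withheld games.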
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#137 » by Moonbeam » Sat Aug 5, 2023 12:56 am

OhayoKD wrote:
Moonbeam wrote:

Would it be possible to see a raw data chart like you did with the 80's for the other decades you have done?


Sure! Maybe what I can do in the short term is put together a spreadsheet with the top 30 or so players for the different 5-year windows. I've got 3-year windows now, too, if that might be of interest.
OhayoKD
Lead Assistant
Posts: 5,920
And1: 3,864
Joined: Jun 22, 2022
 

Re: Penalized Regression of WOWY data 

Post#138 » by OhayoKD » Sat Aug 5, 2023 1:02 am

Moonbeam wrote:
OhayoKD wrote:
Moonbeam wrote:

Would it be possible to see a raw data chart like you did with the 80's for the other decades you have done?


Sure! Maybe what I can do in the short term is put together a spreadsheet with the top 30 or so players for the different 5-year windows. I've got 3-year windows now, too, if that might be of interest.

I believe it would be :D
Moonbeam
Forum Mod - Blazers
Posts: 10,213
And1: 5,061
Joined: Feb 21, 2009
Location: Sydney, Australia
     

Re: Penalized Regression of WOWY data 

Post#139 » by Moonbeam » Sat Aug 5, 2023 2:36 am

Here is a spreadsheet with up to 100 positive coefficients for each 5-year window for Ridge, Lasso, and ENet. I'll see if a spreadsheet with the full data is navigable and post separately if so.
OhayoKD
Lead Assistant
Posts: 5,920
And1: 3,864
Joined: Jun 22, 2022
 

Re: Penalized Regression of WOWY data 

Post#140 » by OhayoKD » Sat Aug 5, 2023 2:56 am

Moonbeam wrote:Here is a spreadsheet with up to 100 positive coefficients for each 5-year window for Ridge, Lasso, and ENet. I'll see if a spreadsheet with the full data is navigable and post separately if so.

So basically Russell and Magic look awesome and everyone else looks not so awesome :lol:

Also, uh:
Image


Yikes!
