Moderators: tsherkin, Doctor MJ, CellarDoor
#61 Re: Why I'm not a WP fan Wed Feb 9, 2011 12:07 pm by sp6r=underrated
Paydro70 wrote: The notion that 5 Tyson Chandlers (or his smallman equivalent) would win 80 games just reflects a complete lack of understanding of how basketball works.
PG: Kidd (8.4 wins)
SG: Fields (9.8 wins)
SF: Battier (5.1 wins)
PF: Humphries (9.5 wins)
C: Chandler (8.6 wins)
41.4 wins balling
Give me a R! Give me a A! Give me a P! Give me a M! RAPM! RAPM! RAPM! It is the only stat you need to rank players. viewtopic.php?f=64&t=1305193
 sp6r=underrated
 RealGM
 Posts: 10,182
 And1: 854
 Joined: Jan 20, 2007
#62 Re: Why I'm not a WP fan Wed Feb 9, 2011 12:07 pm by mysticbb
Sleepy51, even if he accounted for defense better, it wouldn't improve his model enough. As I pointed out, the team adjustment brings his correlation coefficient to 0.94; without it, it is 0.71. He is justifying his model with that high correlation to winning. The defensive adjustment is basically justified via a residual. Well, Berri is not directly using a residual, but overall it amounts to just that. It is also not accurate to say he got his whole model via regression, because he only did two regressions. The first regression was to determine A, B and C in the following formula:
Win% = A*ORTG+B*DRTG+C
A, B and C are presented on his page as the coefficients next to offensive efficiency, defensive efficiency and constant term.
What he is now proposing are those two formulas for PA and PE. At this point he is not using a regression to determine the marginal values, he is using league average values. The differentiation gives us:
Marginal value PTS = A/PE
Marginal value PE = -A*(PTS/PE^2)
Marginal value DPTS = B/PA
Marginal value PA = -B*(DPTS/PA^2)
The values for PTS, PE, DPTS and PA are the league average values in his data set. If you use a different data set you will get slightly different values.
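A minimal sketch of that differentiation step, assuming ORTG = PTS/PE and DRTG = DPTS/PA so that Win% = A*PTS/PE + B*DPTS/PA + C. The coefficients and league averages below are hypothetical placeholders, not Berri's values; the point is only to check the analytic marginal values against a numerical derivative.

```python
# Sketch: marginal values of the team-level model
#   Win% = A * PTS / PE + B * DPTS / PA + C
# All numeric inputs are hypothetical placeholders, not Berri's data.

def win_pct(pts, pe, dpts, pa, A, B, C):
    """Team win percentage from offensive and defensive efficiency."""
    return A * pts / pe + B * dpts / pa + C

def marginal_values(pts, pe, dpts, pa, A, B):
    """Analytic partial derivatives, evaluated at the given averages."""
    return {
        "PTS": A / pe,
        "PE": -A * pts / pe ** 2,
        "DPTS": B / pa,
        "PA": -B * dpts / pa ** 2,
    }

def numeric_derivative(f, args, index, h=1e-4):
    """Central finite difference of f with respect to args[index]."""
    up = list(args)
    up[index] += h
    down = list(args)
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

A, B, C = 3.4, -3.4, 0.5             # hypothetical coefficients
league = (100.0, 96.0, 100.0, 96.0)  # hypothetical PTS, PE, DPTS, PA per game

mv = marginal_values(*league, A, B)
f = lambda pts, pe, dpts, pa: win_pct(pts, pe, dpts, pa, A, B, C)
for i, name in enumerate(["PTS", "PE", "DPTS", "PA"]):
    print(name, round(mv[name], 5), round(numeric_derivative(f, league, i), 5))
```

Because the marginal values are evaluated at league averages, swapping in a different data set changes the averages and therefore the values, which is the effect described above.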
In the next step he takes the marginal values and assigns them to the boxscore stats related to PTS, PE, PA and DPTS. You can easily see that in the numbers for free throws: (1-0.47)*0.032 = 0.017 and 0.47*0.032 = 0.015. Well, by choosing 0.47 as the factor for free throws he set the break-even point for free throw shooting at 47 FT%. Nicely done. That just means a player can shoot 48 FT% on his free throws and he will help his team win more than a guy who is shooting 49 FG% on his two-point field goals. In fact, the latter will hurt his team while the former will help.
In the end he comes to the conclusion that a player has to shoot over 50% on his two-point shots to help his team win games. That is above the league average.
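The break-even arithmetic can be sketched as follows. This is an illustration under one stated assumption: league-average points per possession is roughly 1, so a possession employed costs about as much as one point is worth. The function names are made up for this example.

```python
# Sketch: net value of one shot attempt under the marginal values quoted above.
# Assumption: points per possession is about 1 at league average, so one
# possession employed costs roughly the value of one point.

MV_POINT = 0.032      # marginal value of one point, as quoted in the post
FT_POSSESSION = 0.47  # share of a possession charged per free throw attempt

def net_free_throw(ft_pct):
    """Expected net wins from one free throw attempt."""
    return ft_pct * MV_POINT - FT_POSSESSION * MV_POINT

def net_two_pointer(fg_pct):
    """Expected net wins from one two-point attempt (a full possession)."""
    return 2 * fg_pct * MV_POINT - 1.0 * MV_POINT

print(net_free_throw(0.48) > 0)   # True: above the 47% break-even
print(net_two_pointer(0.49) < 0)  # True: below the 50% break-even
```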
Where does the problem come from? The possession formula employed works for the team as a whole, but not for the individual player. Adding up FGA+0.47*FTA+TO-ORB will not give us a good estimate of the possessions a player used. To see that: if we apply this formula to a player like Love, we currently get a 152.8 ORtg for him. He scored 1111 points with 777 FGA, 351 FTA, 120 TO and 247 ORB. According to Oliver he has a 123 ORtg. Now, if we look at a guy like Dirk Nowitzki, who is scoring at a higher rate and with higher efficiency while turning the ball over less, we get a 114.4 ORtg, while according to Oliver he has 118. Or my new favorite upcoming franchise player Kris Humphries has a 141.7 ORtg if we apply Berri's formula for offensive efficiency.
As you can see, the problem is not only defense; the problem is that the formulas Berri is using can't be applied to players. They will give you absurd numbers for guys who are getting offensive rebounds. Well, if you add those numbers up for each player on each team, it will for sure give you exactly the team's overall number, but that doesn't mean that the distribution among players is correct.
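A small sketch of that last point, using two made-up season lines (hypothetical numbers, not real players). The player-level possession counts add up exactly to the team total, yet the per-player rating is inflated for whoever collects the offensive rebounds.

```python
# Sketch: the team possession formula applied to individual players.
# The season lines below are hypothetical and only illustrate the effect.

def possessions_employed(fga, fta, to, orb):
    """Possessions employed: FGA + 0.47*FTA + TO - ORB."""
    return fga + 0.47 * fta + to - orb

def rating(pts, fga, fta, to, orb):
    """Points per 100 possessions employed."""
    return 100 * pts / possessions_employed(fga, fta, to, orb)

# Hypothetical lines: (PTS, FGA, FTA, TO, ORB)
players = {
    "scorer": (1200, 950, 300, 150, 40),
    "rebounder": (500, 350, 150, 80, 250),
}
team = tuple(sum(col) for col in zip(*players.values()))

# Player possessions sum to the team figure because the formula is linear.
pe_sum = sum(possessions_employed(*line[1:]) for line in players.values())
print(abs(pe_sum - possessions_employed(*team[1:])) < 1e-9)

for name, line in players.items():
    print(name, round(rating(*line), 1))
print("team", round(rating(*team), 1))
```

The subtraction of ORB shrinks the rebounder's denominator, so his rating ends up far above the team's even though he scores much less.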
Btw: If we use the same methodology as in the last GSW blog entry, we get 23.0 wins via WS/48 and 22.7 wins via my PRA as the prediction. Both are way closer to reality than WP48. Not surprising at all. Both are also more consistent from year to year; the same goes for PER. I could calculate the average PER for the teams over time and run a regression to estimate the linear formula Win% = A*PER+B, but well, I have the feeling it is also closer to reality anyway.
 mysticbb
 General Manager
 Posts: 7,669
 And1: 409
 Joined: May 28, 2007
#63 Re: Why I'm not a WP fan Wed Feb 9, 2011 1:27 pm by Sleepy51
mysticbb wrote: At this point he is not using a regression to determine the marginal values, he is using league average values. The differentiation gives us:
Well then he is a chowderhead. Overall, I was focusing more on the point that ANY regression model for understanding or predicting individual player impact on team results would benefit from including differential defensive stats rather than boxscore team defensive stats. I would argue that there is a qualitative difference between how those two data sets can function in any model due to "how basketball works" issues.
I love basketball. I love watching it. I love playing it and I love talking about it. Unfortunately I really don't understand most of what I am seeing so when I start going on and on about it pay little or no attention.
 Sleepy51
 Forum Mod  Warriors
 Posts: 31,148
 And1: 157
 Joined: Jun 28, 2005
#64 Re: Why I'm not a WP fan Wed Feb 9, 2011 2:37 pm by mysticbb
Sleepy51 wrote:Well then he is a chowderhead.
Well, here is how you can calculate the marginal value for points scored (MVP) using this season's data:
MVP = A/PE
where A is 3.442 and
PE = FGA + 0.47*FTA + TO - ORB
The values for FGA, FTA, TO and ORB are per-game league averages, but we can use the season totals for an average team right now and multiply by the number of games. Currently an average team has 4160 FGA, 1267 FTA, 738 TO and 561 ORB. They played 51 games. That gives us:
MVP = A/PE*51
MVP = 3.442 / (4160 + 0.47*1267 + 738 - 561) * 51
MVP = 0.036
He gets 0.032, because he is using a different data set. That's all.
What Berri NEVER showed is that the formulas can be used to evaluate players. He showed that they do a good job of "predicting" wins in hindsight, but that's all. And they do not do a better job at that than scoring margin. Which isn't a surprise at all, because that's what he is trying to reproduce.
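For anyone who wants to redo the arithmetic, here is a short sketch using only the figures quoted in this post. The variable and function names are chosen for the example.

```python
# League-average team totals through 51 games, as quoted in the post.
A = 3.442
FGA, FTA, TO, ORB = 4160, 1267, 738, 561
GAMES = 51

def possessions_employed(fga, fta, to, orb):
    """Possessions employed: FGA + 0.47*FTA + TO - ORB."""
    return fga + 0.47 * fta + to - orb

pe_total = possessions_employed(FGA, FTA, TO, ORB)  # season total
pe_per_game = pe_total / GAMES
mvp = A / pe_per_game                               # same as A / pe_total * GAMES

print(round(pe_total, 2))  # 4932.49
print(round(mvp, 3))       # 0.036
```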
#65 Re: Why I'm not a WP fan Wed Feb 9, 2011 3:11 pm by floppymoose
Paydro70 wrote: The notion that 5 Tyson Chandlers (or his smallman equivalent) would win 80 games just reflects a complete lack of understanding of how basketball works.
In fairness to WP, this will be true of any stat whose metric is some form of win shares. These win shares are situational. If you look at the 5 best centers in the league they should be generating tons of wins. That doesn't mean that playing all 5 of them on a team at the same time is a winning strategy, and that fact doesn't in itself make the metric bad. The metric is about how much are these players helping you *in the context of their team, minutes, player combinations, etc*. If we could somehow fix WP to assign credit more accurately, it would still fail the test you apply above, because that's not really a legitimate test.
 floppymoose
 Forum Mod  Warriors
 Posts: 39,609
 And1: 139
 Joined: Jun 22, 2003
#66 Re: Why I'm not a WP fan Wed Feb 9, 2011 7:18 pm by floppymoose
Thought experiment for WP, part I:
take the box score and strip out all the info except minutes, points and FGA, and then do the same regressions and team adjustments that WP does to tune the results.
How well can this be made to correlate with wins in past seasons?
part II:
Now do the same thing again, but take out FGA as well.
Mostly I'm wondering if you can get an impressive correlation with wins this way. It would no longer amount to counting up possessions and then factoring in defensive and offensive efficiency, because we've stripped too much data out to achieve that. But it might correlate pretty well anyway, which would be interesting if true.
I'm thinking that counting up the points on the team, plus having the "team defensive rating" would basically be like having scoring differential, which is known to correlate well with wins.
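A rough sketch of why that should hold, on synthetic data. Everything here is an assumption for illustration: the league-average rating of 106.5, the spread of team ratings, and the made-up win model. If the defensive term is defined as league average minus the team defensive rating, then "points plus defensive adjustment" is just net rating shifted by a constant, so its correlation with wins is the same as that of scoring differential.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(0)
LEAGUE_AVG = 106.5  # assumed league-average rating
teams = []
for _ in range(300):  # synthetic team seasons, not real data
    ortg = random.gauss(LEAGUE_AVG, 3.5)
    drtg = random.gauss(LEAGUE_AVG, 3.5)
    win = 0.5 + 0.032 * (ortg - drtg) + random.gauss(0, 0.03)
    teams.append((ortg, drtg, win))

wins = [t[2] for t in teams]
net = [t[0] - t[1] for t in teams]                         # scoring differential
points_plus_def = [t[0] + (LEAGUE_AVG - t[1]) for t in teams]

r_net = pearson(net, wins)
r_fm = pearson(points_plus_def, wins)
print(round(r_net, 3), round(r_fm, 3))  # the two values agree
```

A high team-level correlation obtained this way says nothing about whether the credit is split sensibly among the players on the team.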
#67 Re: Why I'm not a WP fan Wed Feb 9, 2011 7:34 pm by mysticbb
Used data set from databasebasketball.com from 1979/80 to 2009/10.
Part I: Points and FGA, pace- and minutes-adjusted, + defensive adjustment:
Code:
Model Summary(b)
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      ,970a  ,941      ,941               ,0377461
a. Predictors: (Constant), DEF_A, FGA_P, PTS_P
b. Dependent Variable: Win%
Win% = -2.77 - 0.002*FGA_P + 0.155*PTS_P + 0.031*DEF_A
Part II: Only points, adjusted for pace and minutes, + defensive adjustment term:
Code:
Model Summary(b)
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      ,970a  ,941      ,941               ,0377361
a. Predictors: (Constant), DEF_A, PTS_P
b. Dependent Variable: Win%
Win% = -2.805 + 0.155*PTS_P + 0.031*DEF_A
Well, the DEF_A is a small adjustment that will not change the ranking of the players much .... Took me 10 minutes to build TWO models which have a higher correlation to winning than WP.
#68 Re: Why I'm not a WP fan Wed Feb 9, 2011 7:38 pm by floppymoose
nice. do you know how that compares to regular WP?
And the other part of this I wonder about is how the DEF_A is calculated. If it was tuned in some way to make WP the best it could be, perhaps it needs to be retuned in both of the above examples.
#69 Re: Why I'm not a WP fan Wed Feb 9, 2011 7:41 pm by mysticbb
floppymoose wrote: I'm thinking that counting up the points on the team, plus having the "team defensive rating" would basically be like having scoring differential, which is known to correlate well with wins.
Correct.
floppymoose wrote: nice. do you know how that compares to regular WP?
Uh, I had a spreadsheet for WP, but I deleted it, because it wasn't worth much anyway. I could probably make a new one and compare the results for players. But well, I doubt I'll get to it today. If anyone has the desire, the model is there.
floppymoose wrote: And the other part of this I wonder about is how the DEF_A is calculated. If it was tuned in some way to make WP the best it could be, perhaps it needs to be retuned in both of the above examples.
Well, I took 106.5 - DRTG = DEF_A, which means I just compared the team defensive rating to the average offensive/defensive rating from 1979 to 2010.
#70 Re: Why I'm not a WP fan Wed Feb 9, 2011 7:49 pm by floppymoose
mysticbb wrote:Took me 10 minutes to build TWO models which have a higher correlation to winning than WP.
So this part makes me think that while you have deleted your WP spreadsheet, you remember the results well enough to claim that this correlates better?
#71 Re: Why I'm not a WP fan Wed Feb 9, 2011 7:51 pm by floppymoose
And it also doesn't look like you used minutes? I ask because I'm wondering about an apples to apples comparison with WP48.
#72 Re: Why I'm not a WP fan Wed Feb 9, 2011 8:01 pm by mysticbb
floppymoose wrote:And it also doesn't look like you used minutes? I ask because I'm wondering about an apples to apples comparison with WP48.
Well, I didn't see the minutes part, but it will not change much anyway, because I control the model basically via ORtg and DRtg, thus the minutes and FGA drop out. And if you want to go by minutes: you calculate the win%, which is per game (or basically per 48 minutes anyway). But here we go with minutes:
Code:
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      ,970a  ,941      ,941               ,0377689
a. Predictors: (Constant), FGA_P, MIN, PTS_P, DEF_A
Code:
Coefficients(a)
            Unstandardized Coefficients  Standardized Coefficients
Model 1     B          Std. Error        Beta    t        Sig.
(Constant)  -2,775     ,394                      -7,042   ,000
MIN         2,634E-7   ,000              ,000    ,013     ,989
PTS_P       ,155       ,002              ,751    88,512   ,000
DEF_A       ,031       ,000              ,695    81,687   ,000
FGA_P       -,002      ,003              -,006   -,749    ,454
a. Dependent Variable: Win%
I have the correlation coefficients in my memory. Berri reported 0.95 (I got 0.93 with his model), and for the model without the team adjustment it was 0.72.
#73 Re: Why I'm not a WP fan Wed Feb 9, 2011 8:15 pm by floppymoose
thanks mystic
#74 Re: Why I'm not a WP fan Wed Feb 9, 2011 8:42 pm by mysticbb
Here is the current TOP of the FM guys:
Code:
Player             Age  Tm   Q  FM48   FM
Kobe Bryant        32   LAL  1  0.676  24.2
LeBron James       26   MIA  1  0.594  22.7
Kevin Durant       22   OKC  1  0.592  22.6
Dwyane Wade        29   MIA  1  0.593  20.9
Derrick Rose       22   CHI  1  0.525  19.9
Kevin Martin       27   HOU  1  0.591  19.3
Amare Stoudemire   28   NYK  1  0.501  19.0
Dwight Howard      25   ORL  1  0.453  16.8
Dirk Nowitzki      32   DAL  1  0.567  16.7
Russell Westbrook  22   OKC  1  0.427  16.0
Monta Ellis        25   GSW  1  0.375  15.8
Carmelo Anthony    26   DEN  1  0.484  15.6
Blake Griffin      21   LAC  1  0.404  15.5
Eric Gordon        22   LAC  1  0.450  14.5
LaMarcus Aldridge  25   POR  1  0.343  14.3
Deron Williams     26   UTA  1  0.375  14.1
David West         30   NOH  1  0.368  13.6
I called it FM48 and FM in honor of floppymoose. :)
#76 Re: Why I'm not a WP fan Wed Feb 9, 2011 8:54 pm by mysticbb
floppymoose wrote:Kevin Martin means FM sucks as a player ranker. :D
Well, it is the ranking of the biggest volume scorers in the NBA. It is doing pretty much a perfect job at this. :) But overall it just shows that you can control the correlation via a defensive adjustment and can even get a "good" ranking out of this without using more information than minutes, pace, points and DRtg. 0.97 correlation coefficient to winning, why should we question "our" model?
#77 Re: Why I'm not a WP fan Wed Feb 9, 2011 8:58 pm by ElGee
mysticbb wrote: Here is the current TOP of the FM guys: [...] I called it FM48 and FM in honor of floppymoose.
LOL. FM48 destroys WP48. Kobe, LBJ, Durant, Wade and Rose vs. Love, Paul, Howard, LBJ, Randolph. Kevin Martin is Floppy's biggest outlier. WP is, um, Kris Humphries.
 ElGee
 Analyst
 Posts: 3,603
 And1: 164
 Joined: Mar 7, 2010
#78 Re: Why I'm not a WP fan Wed Feb 9, 2011 8:58 pm by floppymoose
mysticbb wrote: But overall it just shows that you can control the correlation via a defensive adjustment and can even get a "good" ranking out of this without using more information than minutes, pace, points and DRtg. 0.97 correlation coefficient to winning, why should we question "our" model?
Indeed. That was what I suspected, and why I proposed the "thought experiment". Which you turned into an actual test, and verified my suspicions. You get the awesome award!
#80 Re: Why I'm not a WP fan Thu Feb 10, 2011 1:11 am by Idunkon1stdates
Did anyone notice that dberri replied? In his response, he took a shot at floppymoose's username, and then said correlation to wins isn't everything Wins Produced is about; in fact, other factors went into producing it (which he doesn't mention, but he is referring to the regressions).
He has since deleted his reply, probably because he realized how passive-aggressive it sounded and also because most of his drones follow his model because, as he likes to repeat every other day, it explains 90-95% of wins. Clearly, FM48 explaining 97% of wins threatens his cred. Better to "take the high road" and ignore the riffraff so the Wages of Wins myth can continue to be perpetuated.
 Idunkon1stdates
 Senior
 Posts: 538
 And1: 1
 Joined: Feb 20, 2008

