Ripp wrote: You don't understand statistics, is the problem... One of the major themes of statistics is that with enough noisy samples, the underlying behavior of the generative process can be understood to exceedingly high accuracy by the appropriate algorithm/estimator/technique.
Let me illustrate with a small example. Suppose someone is flipping a coin. One side of the coin is marked 0, the other is marked 1. The coin has some probability "p" of coming up 1. Suppose he flips the coin N times. Assuming independence between coin flips, then you can estimate the number "p" to within accuracy on the order of 1/sqrt{N} (using the obvious estimator that just sets the estimate to the average number of 1s that were in the sequence of flips.)
So in other words, if he flips the coin 1000 times, your estimate of "p" is good to within a tolerance on the order of roughly .0316. If he flips it 1 million times, your estimate of "p" is good to within something on the order of 10^{-3}.
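To make the 1/sqrt{N} claim concrete, here is a minimal Python sketch of the coin-flip example. The value of "p" (0.55), the fixed seed, and the function names are assumptions made up for illustration; none of it comes from real data. For independent flips the standard error of the average is sqrt(p(1-p)/N), which is never larger than 0.5/sqrt{N}, so "on the order of 1/sqrt{N}" is a slightly loose upper bound.

```python
import math
import random


def simulate_flips(p, n, seed=0):
    """Return n independent 0/1 flips, each 1 with probability p.

    The seed is fixed only so that the run is reproducible.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def estimate_p(flips):
    """The obvious estimator: the fraction of flips that came up 1."""
    return sum(flips) / len(flips)


def standard_error(p, n):
    """Standard deviation of the estimator for n independent flips."""
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    true_p = 0.55  # hypothetical value, chosen only for illustration
    for n in (1_000, 1_000_000):
        est = estimate_p(simulate_flips(true_p, n))
        print(
            f"N={n}: estimate={est:.4f}, "
            f"error={abs(est - true_p):.4f}, "
            f"standard error={standard_error(true_p, n):.4f}, "
            f"1/sqrt(N)={1 / math.sqrt(n):.4f}"
        )
```

The error of the estimate should typically land within a couple of standard errors of the true "p", and the standard error itself shrinks by a factor of about 31.6 going from 1000 flips to 1 million.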
Look guys...when you use your cell phone and talk to your buddy, your voice gets converted to bits, and those are sent through the air. Well, the air is NOISY and random, and corrupts those bits sent! So if people (in this case, those good electrical engineers, statisticians and mathematicians who developed communication theory and information theory) did not know how to extract information from noisy information, lots of cool things you and I take for granted would not work.
So the noise argument is not going to fly, given that SS15 has 9000+ minutes worth of data. A bias argument might (bias is a bit more insidious, and I can illustrate this with a coin-flipping example too), but this certainly will not.
EDIT: And the independence assumption I made above can be dramatically weakened, so long as the correlation between coin flips isn't huge. So don't think it requires an assumption that doesn't show up in the real world.
Look, when it comes right down to it I can be as much of a pedant as the next guy, but you are taking pedantry to new and unheard-of heights.
Now, explain to me this: why is it that, when Andrea Bargnani is facing a possession defensively, 55.2% of the coin flips (~1030 samples) come up "heads" (stops) as opposed to "tails" (scores)? Andrea's doing something to cause that. Maybe you, like supersub, have to pay $30/year to me to be able to trust my data: if that's the case, I'll certainly give you my mailing address and I'll be happy to bask in my new-found "trusted" status.
At this point, it might behoove you to remind yourself that basketball is not an engineering application: it's a basketball operation solely, and please forgive the tautology. You are attempting to divorce some stats entirely from the actual game of basketball. And at that point, you're missing a lot. I also notice you're throwing out a whole lot of accusations about my ignorance of higher maths (information which I volunteered), yet I think pigs will fly the day you'd actually try to take on Dr. Oliver himself about this method because you know that the outcome wouldn't be good for you. (Call that an appeal to authority if you will, but it's an authority within the relevant field.)
I understand what the stats that have been quoted tell me: that the team has done better with Bargnani than without him. I have no doubt that that is the case: that's not the "noise" I'm concerned with. The "noise" comes in from making an indirect observation about performance ("the team with Bargnani vs. the team without Bargnani") and trying to take that to an unwarranted place: that it speaks to Bargnani's individual performance in any way that we can identify. The reason PDSS is here is to develop that ERA in basketball, so we know exactly what they, as individuals, were / were not responsible for and how that fit in with the team.
At this point, you have stepped over that line and become one of those people who try to do basketball analysis solely by stats, and I'm sorry - that's not how basketball analysis is done and you don't have much to say that means much. And what you have said, in the main, about PDSS - including your original post - has been entirely baseless as well.