win-loss records analyzed by statistical category
Posted: Sat Feb 2, 2008 12:25 pm
in a bit of bored curiosity (and the desire to avoid studying), i decided to do a linear regression on 2007 nfl win-loss records based on passing defense, rushing defense, passing offense, rushing offense, and turnover differential. the results may surprise you.
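predicted wins = 4.8243 - 0.01669(PYA/g) - 0.06095(RYA/g) + 0.04268(PYF/g) + 0.03937(RYF/g) + 0.1276(TO diff)
(PYA/g = passing yards allowed per game, RYA/g = rushing yards allowed per game, PYF/g = passing yards for per game, RYF/g = rushing yards for per game, TO diff = turnover differential)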
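for anyone who wants to plug numbers in, here's a minimal python sketch of the equation above. the function and argument names are my own invention, and the example stat line is made up purely for illustration (it is not a real 2007 team).

```python
def predicted_wins(pya_g, rya_g, pyf_g, ryf_g, to_diff):
    """Predicted season wins from the regression equation in the post.

    pya_g / rya_g: passing / rushing yards allowed per game
    pyf_g / ryf_g: passing / rushing yards gained per game
    to_diff: season turnover differential
    """
    return (
        4.8243
        - 0.01669 * pya_g
        - 0.06095 * rya_g
        + 0.04268 * pyf_g
        + 0.03937 * ryf_g
        + 0.1276 * to_diff
    )


# hypothetical stat line, for illustration only (not a real team)
example = predicted_wins(pya_g=200.0, rya_g=100.0, pyf_g=200.0, ryf_g=100.0, to_diff=0)
print(round(example, 2))
```

with everything else held fixed, each extra point of turnover differential adds about 0.13 predicted wins, so a +10 season is worth roughly 1.3 wins in this model.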
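p-values (i.e., the probability of seeing a coefficient at least this far from zero if that stat really had no effect on wins; the smaller the value, the less likely the coefficient is just statistical "noise"):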
pass def: 0.244
rush def: 0.0041
pass off: 0.00024
rush off: 0.058
to diff: 0.000096
the saying goes that defense wins championships, so i would have expected turnover differential to have the lowest p-value (meaning the strongest statistical evidence of a real effect, which isn't quite the same thing as the biggest effect), with the two defensive measures next. turnover differential indeed has the lowest p-value, but next is passing offense, which surprised me. meanwhile, passing offense shows stronger evidence of an effect than rushing offense, but passing defense shows weaker evidence than rushing defense. in fact passing defense, at 0.244, is the only variable nowhere near the usual 0.05 cutoff. go figure.
anyway, taking the results at face value (which i know isn't accurate, since i'm omitting several important variables), here are the 2007 season's biggest surprises and disappointments:
surprises:
1. san francisco (+2.9 wins)
2. cleveland (+2.6)
3. new york giants (+2.0)
disappointments:
1. miami (-2.4 wins)
2. atlanta (-2.3)
3. st. louis (-2.1)
new england had 14.9 wins predicted by the model
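in case the +/- numbers above aren't clear: each one is just actual wins minus the model's predicted wins. a tiny sketch, where the function name is my own and the only input not taken from the lists above is new england's 16 regular-season wins in 2007:

```python
def surprise(actual_wins, predicted):
    """Wins above (+) or below (-) what the model predicted."""
    return actual_wins - predicted


# new england: 16 actual regular-season wins vs. 14.9 predicted
print(round(surprise(16, 14.9), 1))  # 1.1
```

so even the 16-0 patriots only beat the model by about a win, which is why they don't crack the surprises list.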
by now many people have probably stopped reading/caring, but to anyone who's actually made it this far, i hope you found the study entertaining!