
538 on the Celtics odds and their turnaround


bisme37
Forum Mod - Celtics
Posts: 19,387
And1: 56,586
Joined: May 24, 2014
 

Re: 538 on the Celtics odds and their turnaround 

Post#21 » by bisme37 » Sun Jun 5, 2022 6:10 pm

Someone on reddit made a post giving a good breakdown on how FiveThirtyEight calculates their RAPTOR metric, comes up with their NBA predictions, and why they love the C's so much.

Spoiler:
Past couple of weeks there’s been tons of talk online and in the media about different models saying the Celtics are the heavy favorites to win the finals, and it seems like no one in the media has any idea what they're talking about when referencing them. So, since I’m one of the "nerds", I thought I’d spend some time on the off day trying to explain as simply as I can what these models are actually doing for anyone who is interested. I’m only a graduate student so I’m far from an expert, but hopefully I can explain it well enough to make you more knowledgeable on them than dumbasses like Felger and Mazz talking about “the nerds running simulations”.

The driving force behind 538's predictions is called RAPTOR, which is a model they developed for individual player ratings. These ratings estimate how many points a player adds on offense and saves on defense per 100 possessions, relative to an average player. The way they calculate these ratings is through something called regression. Regression in its simplest form can be thought of as the line of best fit added to an (x,y) scatter plot. These ratings are essentially calculated the same way, with y = offensive/defensive rating and x = all of the variables they include in their model.
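To make the "line of best fit" idea concrete, here is a minimal sketch of ordinary least squares with a single x variable. It is only a toy: RAPTOR uses many variables and 538's actual fitting procedure is not public, so the function name `fit_line` and the sample numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

With more than one variable the same idea applies, except the fit finds one coefficient per variable instead of a single slope.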

The offensive rating uses 15 variables. These include simple box score stats, but mainly consist of more advanced ones such as enhanced assists (assists adjusted for quality of shot attempt), and how close the nearest defender was on their shot attempts. The defensive rating uses 11 variables such as distance traveled (for perimeter players), and enhanced rebounds (adjusted for how contested they were). Finally once these two ratings are calculated, they are adjusted for how the team does when they are on vs off the court to account for how important they are to the team as a whole, and summed into 1 rating. These ratings are adjusted after every game to account for new data.

Calculating the individual player ratings seems to be the only real machine learning aspect of the model. The rest, for the most part, is just basic math. To calculate a team's rating, they simply take the weighted sum of its players' ratings based on how much they play.
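A rough sketch of that weighted sum, assuming each player's weight is their share of a 48-minute game (that weighting, the function name `team_rating`, and the roster numbers below are all assumptions for illustration, not 538's published method):

```python
def team_rating(players, game_minutes=48):
    """Sum each player's rating weighted by the share of the game they play.

    players: list of (rating_per_100_possessions, expected_minutes) pairs.
    """
    return sum(rating * minutes / game_minutes for rating, minutes in players)


# Hypothetical roster: ratings and minutes are invented, not real RAPTOR values.
roster = [(6.0, 36), (3.5, 34), (2.0, 32), (1.0, 30), (0.5, 28),
          (-0.5, 24), (-1.0, 20), (-2.0, 18), (-1.5, 18)]
print(round(team_rating(roster), 2))
```

Because five players are on the floor at once, the weights add up to 5 rather than 1 when the minutes total 240.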

A team's win probability against a given opponent is calculated by taking the difference between the two teams' ratings (plus or minus adjustments for things like home court and how much rest they have) and plugging it into the equation 1/(10^(-rating differential/400) + 1) (no clue how they derived this). These per-game probabilities are multiplied together to get the probability that a team wins in 4/5/6/7 games. Finally, summing all the possibilities gives the probability of each team winning the series.
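For what it's worth, that equation has the same shape as the standard Elo expected-score formula from chess ratings. A minimal sketch of it, plus its inverse, is below. The function names are made up, and the rating differential is assumed to already include any home-court and rest adjustments.

```python
import math


def win_probability(rating_diff):
    """Probability of winning one game given (own rating - opponent rating)."""
    return 1.0 / (10 ** (-rating_diff / 400) + 1)


def rating_diff_for(probability):
    """Inverse of win_probability: the differential implied by a win probability."""
    return 400 * math.log10(probability / (1 - probability))


print(win_probability(0))     # evenly matched teams -> 0.5
print(rating_diff_for(0.57))  # differential implied by a 57% single-game chance
```

A differential of 0 gives a coin flip, and every extra 400 points multiplies the odds of winning by 10.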

Right now the team rating differentials give the Celtics a 57% chance to win in SF and a 79% chance to win in Boston. Therefore, the probabilities work out as follows:

https://preview.redd.it/ochybfb6oo391.png?width=512&format=png&auto=webp&s=cd27683364ff761c369e91ae4059dceddd1926e4
I oversimplified and skipped over a lot of minor adjustments they make, but that’s basically it. Regression to calculate player ratings, combining them into a team rating, and comparing the two teams' ratings. Though it is done slightly differently for predictions with a longer time frame.
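The series arithmetic (multiply the per-game probabilities, then add up the ways to win in 4/5/6/7) can be sketched like this. It assumes the usual 2-2-1-1-1 Finals schedule with games 1, 2, 5 and 7 in SF, a series starting at 0-0, and the 57%/79% single-game numbers quoted above. The function name is made up, and this is not 538's actual code.

```python
def series_length_probs(game_probs, wins_needed=4):
    """game_probs[i] is team A's chance of winning game i+1.

    Returns two dicts mapping number of games -> probability that
    team A (first dict) or team B (second dict) wins in exactly that many.
    """
    a_wins, b_wins = {}, {}
    states = {(0, 0): 1.0}  # (A wins, B wins) -> probability, series undecided
    for game_no, p in enumerate(game_probs, start=1):
        next_states = {}
        for (a, b), prob in states.items():
            for na, nb, q in ((a + 1, b, p), (a, b + 1, 1 - p)):
                if na == wins_needed:
                    a_wins[game_no] = a_wins.get(game_no, 0.0) + prob * q
                elif nb == wins_needed:
                    b_wins[game_no] = b_wins.get(game_no, 0.0) + prob * q
                else:
                    key = (na, nb)
                    next_states[key] = next_states.get(key, 0.0) + prob * q
        states = next_states
    return a_wins, b_wins


# Celtics' per-game chances, assuming games 1, 2, 5, 7 are in SF.
celtics_probs = [0.57, 0.57, 0.79, 0.79, 0.57, 0.79, 0.57]
celtics_in, warriors_in = series_length_probs(celtics_probs)
for n in sorted(celtics_in):
    print(f"Celtics in {n}: {celtics_in[n]:.3f}")
print(f"Celtics win series: {sum(celtics_in.values()):.3f}")
```

To model a series already in progress, pass only the remaining games and lower `wins_needed`, though that shortcut only works cleanly when both teams need the same number of wins.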

So why does the model love the Celtics? Many reasons, but mainly because statistically we have a lot of really good two-way players. The model has a bias towards perimeter players over bigs (as does the modern-day NBA), and we have a lot of really good perimeter defenders. For bigs it favors those who also have solid offense, especially the ability to shoot 3s, which gives Horford a big boost. Ultimately, we’re favored because a lot of our players have the stats the model is looking for, which gives them great ratings. We have 6 players in their top 40 (Tatum (2), Horford (10), White (11), Smart (17), Brown (22), and R Williams (40)), while the Warriors only have 2 (Curry (6), Wiggins (26)).

So should you care about what these models say? Do the Celtics really have a 92% chance to win the finals? Ultimately, the answer is maybe. In my opinion sports forecasting is somewhat of a pseudoscience because it's based on significantly smaller amounts of data than things like targeted advertising, and it's essentially impossible to predict how an individual player will perform in one game (tbf I haven't done much reading on the actual research so I'm only guessing). Even if it is accurate, 8% is still a pretty large probability and far from impossible.

With that said, in my (biased) opinion these models are 1000x better and more informative than ex-players on TV telling us media cliches and their personal, statistically baseless, gut opinions. They tell us which team is objectively better (according to the variables used). Again, I’m only a 24-year-old data science grad student so far from an expert, but hopefully if anyone is still reading they now know a bit more about these “analytic models” the media loves to make fun of.

tldr: 538 does regression with fancy variables to calculate players' offensive and defensive ratings, adds them up to get a team rating, and compares the ratings. Celtics have many statistically better players, but we have no idea if the model is accurate. It simply means we have a statistically better team based on their variables, and if we played the series 1000s of times, we'd very likely win the majority of them.
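The "play the series 1000s of times" idea is what a simulation does. A minimal Monte Carlo sketch is below, assuming fixed per-game probabilities (57% in SF, 79% in Boston, games 1, 2, 5, 7 in SF). The function name, trial count and seed are arbitrary choices for illustration, and a real forecast would also update ratings between games.

```python
import random


def simulate_series(game_probs, trials=10_000, seed=0):
    """Estimate team A's chance of winning a best-of-7 by random simulation.

    game_probs must have 7 entries: team A's win chance in each game.
    """
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    series_wins = 0
    for _ in range(trials):
        a = b = 0
        for p in game_probs:
            if rng.random() < p:
                a += 1
            else:
                b += 1
            if a == 4 or b == 4:
                break  # series is over once either team has 4 wins
        if a == 4:
            series_wins += 1
    return series_wins / trials


celtics_probs = [0.57, 0.57, 0.79, 0.79, 0.57, 0.79, 0.57]
print(simulate_series(celtics_probs))
```

The result is an estimate that wobbles a little from run to run (or seed to seed); more trials tighten it.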

Edit: For those wondering what their regression coefficients are or how exactly they train it, the answer is unknown. It's their IP so they wouldn't want to publicize super detailed info on how to replicate it.

Sources:
https://fivethirtyeight.com/features/how-our-raptor-metric-works/


https://old.reddit.com/r/bostonceltics/comments/v50lxm/how_538s_model_works_and_why_it_loves_the_celtics/
Gant
General Manager
Posts: 9,535
And1: 11,322
Joined: Mar 16, 2006

Re: 538 on the Celtics odds and their turnaround 

Post#22 » by Gant » Sun Jun 5, 2022 7:46 pm

Follow-up 538 article on the Celtics and Warriors, written just before Game 1.

Ahead of today’s Game 1 of the NBA Finals, let’s get one thing out of the way: Our forecast model loves, loves, loves the Boston Celtics. (Or maybe it just hates the Golden State Warriors.) Either way, the model gives Boston an 80 percent chance of winning the championship over Golden State, in very stark contrast to the betting markets — which immediately installed the Warriors as pre-series favorites last week. Based on the odds from Caesars Sportsbook, which list Golden State as -160 (and Boston as +140), we can infer that the bookmakers consider the Warriors a 60 percent favorite to win the title.
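For anyone curious how the 60 percent figure falls out of those odds, here is a rough sketch of the usual moneyline conversion. Dividing by the total to strip out the bookmaker's margin is one common approach and an assumption here, since the article doesn't say exactly how it did the inference. The function names are made up.

```python
def implied_probability(moneyline):
    """Convert American odds to the raw implied win probability."""
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100.
        return -moneyline / (-moneyline + 100)
    # Underdog: risk 100 to win moneyline.
    return 100 / (moneyline + 100)


def no_vig_probabilities(line_a, line_b):
    """Rescale the two raw probabilities so they sum to 1."""
    raw_a = implied_probability(line_a)
    raw_b = implied_probability(line_b)
    total = raw_a + raw_b  # slightly above 1 because of the book's margin
    return raw_a / total, raw_b / total


warriors, celtics = no_vig_probabilities(-160, 140)
print(round(warriors, 3), round(celtics, 3))  # 0.596 0.404
```

So -160 on its own implies about 61.5 percent, and after removing the margin it lands right around the 60 percent quoted.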

So something has to give between the two predictions. And if you ask for my opinion, yes, I think our forecast is too bullish on the Celtics. However, it also seems like the markets are too bearish on them — or, again, too bullish on Golden State. Maybe the conventional wisdom is just stuck in the mid-to-late-2010s Warriors dynasty era. Who knows? But in 2022, most indicators from throughout the season suggest that the Celtics are genuinely a better team than the Warriors.


https://fivethirtyeight.com/features/we-might-be-overrating-the-celtics-but-youre-probably-underrating-them/
