Simplifying Shot Chart Data

Moderator: Doctor MJ

Would the base concept be useful/of interest to you?

Poll ended at Tue Sep 17, 2019 2:08 am

No votes
Total votes: 1


Post#1 » by GeorgeMarcus » Tue Sep 3, 2019 1:45 am

I came up with a system of classification that has been useful in my research, and I'm wondering if other basketball enthusiasts would find it useful as well. I encourage any constructive criticism you might have, even if that means scrapping the concept altogether. Here goes:

Shot charts are a great way to analyze FGA distribution, but they can be overwhelming when applied to large datasets. For this reason they are generally used to compare only a handful of players at a time. Because shot distribution is essential to understanding a player's scoring utility and how he affects opposing defenses, I wanted to come up with a system that would simplify the data and make it more portable in discussion.

I discovered that FGAs were split very evenly across the following three areas:
- within 5ft
- 5+ ft (inside the 3pt line)
- 3pt

For my purposes, I coded the sections as I (inside), M (midrange) and O (outside). The order of the letters ranks the three areas by their share of a player's FGA, from highest to lowest. Using data from the 18-19 season, here are some examples for each classification:
OMI - Klay Thompson, Lauri Markkanen, JJ Redick
OIM - James Harden, Damian Lillard, Blake Griffin
MOI - Kevin Durant, CJ McCollum, Marc Gasol
MIO - DeMar DeRozan, Nikola Vucevic, LeMarcus Aldridge
IOM - LeBron James, DeMarcus Cousins, Eric Bledsoe
IMO - Anthony Davis, Ben Simmons, Montrezl Harrell
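To make the ordering rule concrete, here is a minimal sketch of how the base code could be derived from a player's FGA counts in each area. The function name and the sample counts are made up for illustration (they are not real player data), and the tie-breaking order is my own assumption since the method above doesn't specify one.

```python
def base_code(inside_fga, mid_fga, outside_fga):
    """Rank the three areas by FGA volume, highest first.

    Assumption: ties keep the I, M, O order (sorted() is stable).
    """
    volume = {"I": inside_fga, "M": mid_fga, "O": outside_fga}
    ranked = sorted(volume, key=volume.get, reverse=True)
    return "".join(ranked)


# Hypothetical player: 300 FGA within 5ft, 250 midrange, 450 from three.
print(base_code(300, 250, 450))
```

With those made-up counts the outside share is largest, then inside, then midrange, so the result is OIM.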

After applying this method to 118 players who averaged 10+ FGA over 30+ games, I got the following distribution (number of players per classification):
OMI: 24
OIM: 21
MOI: 16
MIO: 21
IOM: 15
IMO: 21

Or, another way to look at it (number of players with each letter in the 1st/2nd/3rd position):
O: 45/31/42
M: 37/45/36
I: 36/42/40
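The position tallies follow directly from the six class counts. Here is a short sketch that recomputes them; the variable and function names are just for illustration.

```python
# Players per classification, as listed above.
class_counts = {"OMI": 24, "OIM": 21, "MOI": 16, "MIO": 21, "IOM": 15, "IMO": 21}


def position_tallies(counts):
    """Count how many players have each letter in 1st, 2nd and 3rd position."""
    tallies = {letter: [0, 0, 0] for letter in "OMI"}
    for code, players in counts.items():
        for position, letter in enumerate(code):
            tallies[letter][position] += players
    return tallies


tallies = position_tallies(class_counts)
for letter in "OMI":
    print(letter + ": " + "/".join(str(n) for n in tallies[letter]))
print("Total players:", sum(class_counts.values()))
```

This prints the same three rows as above, and the class counts sum to 118.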

I was happy with this first step, but I wanted to go one step further to qualify the distribution. I decided to use lower case letters for the base model and, for players who are heavily reliant on one area of the court (55.6% of FGA or greater), to signify that with a capital letter. If we see that Steph Curry is an "Omi", that tells us his distribution is even more outside-oriented than a standard omi.

As a final step, I wanted to signify when one area of a player's arsenal is either negligible or non-existent (11.1% of FGA or less). I decided the best way to do this would be to remove the corresponding letter from the player's classification. This helps to separate a guy like Drummond, an "Im", from imo's like Embiid who incorporate the 3pt shot into their game.
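Putting the three rules together (ordering, capital letter, letter removal), the full classification could be sketched like this. The function name, the constant names and the sample counts are hypothetical, and the handling of ties and of a player with zero FGA is my own assumption.

```python
HEAVY_SHARE = 0.556       # 55.6% or greater: capital letter
NEGLIGIBLE_SHARE = 0.111  # 11.1% or less: letter removed


def classify(inside_fga, mid_fga, outside_fga):
    """Return a code such as 'Omi', 'Im' or 'imo' from FGA counts per area."""
    total = inside_fga + mid_fga + outside_fga
    if total <= 0:
        raise ValueError("player has no field goal attempts")

    share = {
        "i": inside_fga / total,
        "m": mid_fga / total,
        "o": outside_fga / total,
    }

    code = ""
    # Highest share first; ties keep the i, m, o order (assumption).
    for letter in sorted(share, key=share.get, reverse=True):
        if share[letter] <= NEGLIGIBLE_SHARE:
            continue  # negligible area: drop the letter
        if share[letter] >= HEAVY_SHARE:
            code += letter.upper()  # heavy reliance: capitalize
        else:
            code += letter
    return code


# Made-up FGA counts (inside, midrange, outside), not real player data.
print(classify(600, 350, 50))   # Im
print(classify(150, 250, 600))  # Omi
print(classify(400, 350, 250))  # imo
```

The first line has a 60% inside share and only 5% from three, so the "o" is dropped and the "I" is capitalized; the second is 60% from three with both other areas above the cutoff.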

Other ideas that I entertained and didn't follow through with:
- somehow incorporating eFG% from each area (thought this would overcomplicate things)
- using a "." between letters to indicate a gap that is 20+% (thought this would be redundant with my capital letter/letter removal ideas)

Now that you've read my methodology, I have 2 questions:
- Would classification like this be useful/of interest to you?
- If so, do you have any recommendations on how to improve it?
