Shot charts are a great way to analyze FGA distribution, but they can be overwhelming when applied to large datasets. For this reason, they are generally used to compare only a handful of players at a time. Because shot distribution is essential to understanding scoring utility and how a player affects opposing defenses, I wanted to come up with a system that would simplify the data and make it more portable in discussion.
I discovered that FGAs were very evenly distributed across the following three areas:
- within 5ft
- 5+ ft (inside the 3pt line)
- 3pt
For my purposes, I coded the sections as I (inside), M (midrange) and O (outside). The letters are ordered by each area's share of FGA volume, from highest to lowest, so an "imo" takes the most shots inside, then from midrange, then from outside. Using data from the 18-19 season, here are some examples for each classification:
Spoiler:
After applying this method to 118 players who averaged 10+ FGA over 30+ games, I came up with the following distributions:
Spoiler:
Or, another way to look at it:
Spoiler:
I was happy with this first step, but I wanted to go one step further to qualify the distribution. I decided to use lowercase letters for the base model and a capital letter for players who are heavily reliant on one area of the court (55.6% or more of their FGA). If we see that Steph Curry is an "Omi", that tells us his distribution is even more outside-oriented than a standard omi.
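To make the ordering and capital-letter rules concrete, here is a rough Python sketch. The function name `base_code`, the constant `DOMINANT`, and the attempt counts are hypothetical and made up for illustration; they are not real player data. It also assumes ties are broken in i, m, o order, which the method above doesn't specify.

```python
DOMINANT = 0.556  # share at or above which a zone's letter is capitalized


def base_code(inside, mid, outside):
    """Return a code such as 'imo' or 'Omi' from FGA counts per zone."""
    total = inside + mid + outside
    if total == 0:
        raise ValueError("no field goal attempts")
    shares = {"i": inside / total, "m": mid / total, "o": outside / total}
    # Order the letters from highest to lowest share of attempts.
    # sorted() is stable, so ties keep the i, m, o order (an assumption).
    ordered = sorted(shares, key=shares.get, reverse=True)
    # Capitalize any zone the player relies on heavily.
    return "".join(z.upper() if shares[z] >= DOMINANT else z for z in ordered)


# Made-up attempt counts (inside, midrange, outside):
print(base_code(150, 200, 650))  # Omi
print(base_code(450, 350, 200))  # imo
```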
As a final step, I wanted to signify when one area of a player's arsenal is either negligible or non-existent (11.1% or less of their FGA). I decided the best way to do this would be to remove the corresponding letter from the player's classification. This helps to separate a guy like Drummond, an "Im", from imos like Embiid who incorporate the 3pt shot into their game.
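The letter-removal rule could be sketched as a small post-processing step in Python. The name `drop_negligible`, the constant `NEGLIGIBLE`, and the shares below are hypothetical, chosen only to illustrate the rule; they are not real player numbers.

```python
NEGLIGIBLE = 0.111  # share at or below which a zone's letter is dropped


def drop_negligible(code, shares):
    """Remove letters from `code` whose zone share is negligible.

    `code` is a string such as 'Imo'; `shares` maps 'i', 'm', 'o'
    to each zone's fraction of total FGA.
    """
    return "".join(ch for ch in code if shares[ch.lower()] > NEGLIGIBLE)


# Made-up shares: almost no 3pt attempts, so the "o" disappears.
print(drop_negligible("Imo", {"i": 0.70, "m": 0.28, "o": 0.02}))  # Im
# Every zone is above the cutoff, so the code is unchanged.
print(drop_negligible("imo", {"i": 0.45, "m": 0.35, "o": 0.20}))  # imo
```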
Other ideas that I entertained and didn't follow through with:
- somehow incorporating eFG% from each area (thought this would overcomplicate things)
- using a "." between letters to indicate a gap of 20% or more (thought this would be redundant with my capital letter/letter removal ideas)
Now that you've read my methodology, I have 2 questions:
- Would a classification like this be useful/of interest to you?
- If so, do you have any recommendations on how to improve it?