PostgreSQL database for NBA stats

Moderator: Doctor MJ

Saints14
Assistant Coach
Posts: 3,771
And1: 5,456
Joined: Jul 19, 2013
 

PostgreSQL database for NBA stats 

Post#1 » by Saints14 » Thu Jan 23, 2020 12:38 pm

Hi all, this is a bit of self-promotion, so please feel free to remove it if it isn’t appropriate here, but I thought some users would be interested.

I've been doing basketball analytics work in various capacities for the past 5 years or so and, like anyone else in the field, understand how frustrating and time-consuming data prep can be and how few technical resources are available to the public. To address this, a few years ago I started building a database of NBA and NCAA data as a personal project to assist with my analyses, and I recently realized that my work could support other technical folks in their endeavors as well.

The database contains box scores, game summaries, play-by-play data, shot charts and aggregated stats from ESPN beginning in 2009-10, and is updated nightly. If this is something you'd be interested in getting access to, I set up a Patreon page (https://www.patreon.com/SportsDataWarehouse) or you can send me a PM on here (I'm willing to do "free trials").

Definitely open to any feedback as well!
dho4ever
Rookie
Posts: 1,072
And1: 760
Joined: Apr 20, 2011

Re: PostgreSQL database for NBA stats 

Post#2 » by dho4ever » Sat Feb 8, 2020 12:08 am

What does your Postgres DB do that I can't get from this? https://www.basketball-reference.com/play-index/
Saints14
Assistant Coach
Posts: 3,771
And1: 5,456
Joined: Jul 19, 2013
 

Re: PostgreSQL database for NBA stats 

Post#3 » by Saints14 » Mon Feb 17, 2020 3:47 pm

It allows you to query the database directly instead of going through a web interface. No downloading .csv files, and direct Python/R integration for technical folks looking to work on basketball analytics projects.
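
As a rough illustration of what querying directly looks like from Python, here is a minimal sketch. The table name `player_box_scores`, its columns, and the sample rows are all made up for the example, since the real schema isn't described in this thread. An in-memory SQLite database stands in for the PostgreSQL connection so the snippet runs anywhere; with PostgreSQL you would open the connection with a driver such as psycopg2 instead and keep the same cursor pattern.

```python
import sqlite3

# Stand-in for the real PostgreSQL connection. The table and rows below
# are hypothetical; the actual schema is not shown in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player_box_scores ("
    " game_id INTEGER, player TEXT, points INTEGER, rebounds INTEGER)"
)
conn.executemany(
    "INSERT INTO player_box_scores VALUES (?, ?, ?, ?)",
    [
        (1, "Player A", 30, 8),
        (1, "Player B", 12, 11),
        (2, "Player A", 22, 6),
        (2, "Player B", 18, 9),
    ],
)


def scoring_averages(connection):
    """Return (player, games, points per game) rows, highest scorer first."""
    query = (
        "SELECT player, COUNT(*) AS games, AVG(points) AS ppg"
        " FROM player_box_scores"
        " GROUP BY player"
        " ORDER BY ppg DESC"
    )
    cursor = connection.cursor()
    cursor.execute(query)
    return cursor.fetchall()


averages = scoring_averages(conn)
for player, games, ppg in averages:
    print(f"{player}: {ppg:.1f} ppg over {games} games")
```

The point is that the aggregation happens in SQL and the result lands straight in a Python list, with no export or download step in between.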
fianchetto
Starter
Posts: 2,176
And1: 2,997
Joined: Apr 17, 2016
 

Re: PostgreSQL database for NBA stats 

Post#4 » by fianchetto » Wed Jul 15, 2020 1:56 am

Do you have a public HTTP API, or plan to? In case you don't know what that is (no offence intended, I just don't know your background), it's a web address that can be sent database requests programmatically and is language-agnostic.

I'd be happy to build one for you if you want any collaborators. I would do this because a public API with this data would be useful for me.
“If I told you that a flower bloomed in a dark room, would you trust it?”
Saints14
Assistant Coach
Posts: 3,771
And1: 5,456
Joined: Jul 19, 2013
 

Re: PostgreSQL database for NBA stats 

Post#5 » by Saints14 » Wed Jul 15, 2020 11:12 am

I don't have an API, but there seem to be a number of resources out there for pulling data straight into Python or R using an API:

https://medium.com/clarktech-sports/python-sports-analytics-made-simple-part-2-40e591a7f3db
https://pypi.org/project/nba-api/
https://www.playingnumbers.com/2019/12/how-to-get-nba-data-using-the-nba_api-python-module-beginner/

Not sure if these get you what you need. If they don't, I'm open to collaboration :)
fianchetto
Starter
Posts: 2,176
And1: 2,997
Joined: Apr 17, 2016
 

Re: PostgreSQL database for NBA stats 

Post#6 » by fianchetto » Sun Aug 9, 2020 3:00 am

Where do you get your data? I ask because a program can be written (relatively) easily to gather data from sites like basketball-reference.com and nba.com. It would have to be actively maintained, but it can be done accurately.

An HTTP API would be language-agnostic, meaning you could programmatically (with code) get data from it with whatever language or tool you're using.

It would be a game changer for anyone who needs the data for any real analysis, because it would give them structured data that is readily available programmatically.

If you still maintain the project, let me know. It wouldn't take too much effort on my end.
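
To make the idea concrete, here is a minimal sketch of such an API using only the Python standard library. It is not the project's API (none exists yet): the `/games/<game_id>` path, the `BOX_SCORES` data, and the function names are assumptions for illustration, and a real service would read from the database instead of a dictionary.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical data; a real service would query the database here.
BOX_SCORES = {
    "1": [
        {"player": "Player A", "points": 30},
        {"player": "Player B", "points": 12},
    ],
}


class StatsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Path shape assumed for this sketch: /games/<game_id>
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "games" and parts[1] in BOX_SCORES:
            status, payload = 200, BOX_SCORES[parts[1]]
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this example


def fetch_game(base_url, game_id):
    """Client side: any language with an HTTP library could do the same."""
    # An empty ProxyHandler bypasses any system proxy for the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/games/{game_id}") as response:
        return json.loads(response.read().decode("utf-8"))


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), StatsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = f"http://127.0.0.1:{server.server_address[1]}"

rows = fetch_game(base_url, "1")
print(rows)

server.shutdown()
server.server_close()
```

The client only needs a URL and a JSON parser, which is what makes the approach language-agnostic: the same request works from R, JavaScript, or a command-line tool.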
“If I told you that a flower bloomed in a dark room, would you trust it?”
Saints14
Assistant Coach
Posts: 3,771
And1: 5,456
Joined: Jul 19, 2013
 

Re: PostgreSQL database for NBA stats 

Post#7 » by Saints14 » Mon Aug 10, 2020 1:31 pm

My data is scraped nightly from ESPN, so it parses pretty much everything you'd find in a game recap (player/team box scores, play-by-play, shot charts) and puts it into the database. Basketball Reference is great for ad-hoc studies but a pain in the butt for real projects, especially anything ongoing. Querying from a database or pulling from an API is much easier and more flexible.

I'm still maintaining the project - will shoot you a DM
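
For anyone curious how a nightly load like this can be re-run safely, here is a minimal sketch of one common approach, an upsert keyed on game and player. This is not necessarily how the project does it: the `player_box_scores` table, its columns, and the sample rows are hypothetical, and SQLite (3.24 or newer) stands in for PostgreSQL so the snippet runs anywhere. PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` form, though psycopg2 uses `%s` placeholders instead of `?`.

```python
import sqlite3

# Hypothetical table; the project's real schema is not shown in this thread.
SCHEMA = """
CREATE TABLE IF NOT EXISTS player_box_scores (
    game_id INTEGER NOT NULL,
    player  TEXT    NOT NULL,
    points  INTEGER NOT NULL,
    PRIMARY KEY (game_id, player)
)
"""

UPSERT = """
INSERT INTO player_box_scores (game_id, player, points)
VALUES (?, ?, ?)
ON CONFLICT (game_id, player) DO UPDATE SET points = excluded.points
"""


def load_box_scores(connection, rows):
    """Insert parsed rows, overwriting any earlier copy of the same row."""
    with connection:  # commits on success, rolls back on error
        connection.executemany(UPSERT, rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# First nightly run stores the rows as parsed.
load_box_scores(conn, [(1, "Player A", 28), (1, "Player B", 12)])

# A re-run after a stat correction updates the row instead of duplicating it.
load_box_scores(conn, [(1, "Player A", 30), (1, "Player B", 12)])

stored = conn.execute(
    "SELECT game_id, player, points FROM player_box_scores ORDER BY player"
).fetchall()
print(stored)
```

Keying on (game_id, player) means the scraper can be re-run for the same night without creating duplicate rows.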
fianchetto
Starter
Posts: 2,176
And1: 2,997
Joined: Apr 17, 2016
 

Re: PostgreSQL database for NBA stats 

Post#8 » by fianchetto » Sun Aug 23, 2020 3:20 am

I tried replying to you -- it says "Some users couldn’t be added as they have disabled private message receipt."

Seems like I can't reply to your DMs! Either way, I'm swamped at work and will let you know when I have a couple days to implement the API. I like the project.

If you don't want to enable DMs I can reply here.
“If I told you that a flower bloomed in a dark room, would you trust it?”
Saints14
Assistant Coach
Posts: 3,771
And1: 5,456
Joined: Jul 19, 2013
 

Re: PostgreSQL database for NBA stats 

Post#9 » by Saints14 » Mon Aug 24, 2020 2:28 pm

Huh, weird. I guess I had that setting turned off somehow. It should work now if you want to give it another shot; otherwise I'm happy to correspond here.
