
Building a Flask App to Scrape NBA Data
As a basketball fan and Python enthusiast, I recently embarked on a project to build a Flask web application that scrapes NBA data using the RapidAPI NBA API. The goal was to create a simple yet functional app that displays NBA team information and head-to-head matchup data. Along the way, I was reminded of a few things about APIs in Python and the ease of building projects with Flask.
The Project: A Flask App for NBA Data
The Flask app is built as an API interface to display NBA data fetched from the RapidAPI NBA API. The main page features a Matrix-themed design with Neo’s ASCII art and links to various API endpoints, such as /api/teams, /api/seasons, /api/leagues, /api/games, and /api/standings. The /api/test endpoint returns a static list of NBA teams with their IDs and names, while the other endpoints dynamically fetch data from the RapidAPI service. The project was a great way to combine my love for basketball, web development, and Python’s simplicity.
In this blog, I’ll share my experience and insights on these key aspects. My intent is to make this easy to digest for people who aren’t in the tech field, so forgive me for not going too deep into the weeds of this project.
Using a .env File for Secure Configuration
One of the first challenges I had was how to securely store my RapidAPI key and host information. Hardcoding API keys directly in the source code is a security risk, especially if the code is shared publicly on GitHub or any public version control system. I decided to use a .env file to store my API credentials and loaded them into the Flask app using the python-dotenv library. I typically stay away from adding complexity to code, but a small number of libraries is fine in this case.
I created a .env file in the project root to look like this:
RAPIDAPI_KEY=your_rapidapi_key_here
RAPIDAPI_HOST=api-nba-v1.p.rapidapi.com
In the Flask app.py, I imported load_dotenv from python-dotenv and os to access these variables:
from dotenv import load_dotenv
import os
load_dotenv()
headers = {
    'X-RapidAPI-Key': os.getenv('RAPIDAPI_KEY'),
    'X-RapidAPI-Host': os.getenv('RAPIDAPI_HOST')
}
This approach allows me to keep sensitive data out of the source code. The load_dotenv() function reads the .env file and makes the variables available via os.getenv(), so the app can access the API key and host without either value ever appearing in app.py.
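One thing worth knowing is that os.getenv() returns None when a variable is not set, so a missing or misnamed .env entry only shows up later as a failed API call. Here is a minimal sketch of a fail-fast check, not the code from my app: the build_headers name and the env parameter are my own inventions for illustration, and in the real app you would call it right after load_dotenv().

```python
import os


def build_headers(env=os.environ):
    """Build the RapidAPI headers, failing fast if a variable is missing.

    build_headers and the env parameter are hypothetical names used for
    this sketch; they are not part of python-dotenv or Flask.
    """
    required = ("RAPIDAPI_KEY", "RAPIDAPI_HOST")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Raise at startup with a clear message instead of letting a
        # None value reach the first API request.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "X-RapidAPI-Key": env["RAPIDAPI_KEY"],
        "X-RapidAPI-Host": env["RAPIDAPI_HOST"],
    }
```

Passing the environment in as a parameter is only there to make the function easy to try out with a plain dictionary.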
Hiding the .env File with .gitignore
Since the .env file contains sensitive information, it’s important to prevent it from being committed to version control. I added the following line to my .gitignore file:
.env
This ensures that Git ignores the .env file, keeping my API key safe from accidental exposure on platforms like GitHub. Without this step, anyone with access to the repo could see the API key and use it to make hundreds of requests on my account, which means unauthorized access and blown API usage limits (and a big bill for me to pay at the end of the month).
Using a .env file with .gitignore is a best practice for any project involving sensitive data. It’s a simple yet effective way to maintain security while keeping the codebase clean and shareable.
Using APIs and Scraping NBA Data
APIs are like gateways to a treasure trove of data, and the NBA API provided a wealth of information about teams, games, standings, and seasons. When I try to explain that to my wife, it sounds to her like I’m in the Matrix, but the ability to fetch real-time or historical NBA data with a simple HTTP request felt like unlocking a superpower.
API endpoints
The API endpoints I used included:
- /teams: Fetches a list of NBA teams with details like team IDs, names, and more.
- /seasons: Grabs the available NBA seasons.
- /leagues: Lists available leagues (e.g., standard NBA, G-League).
- /games?date=2024-12-25: Gets game data for a specific date (e.g., Christmas Day games).
- /standings?league=standard&season=2024: Returns the standings for the 2024 NBA season.
Each endpoint was accessed using Python’s http.client module, with the response parsed as JSON and returned via Flask’s jsonify function. For example, the /api/teams endpoint was implemented as follows:
@app.route("/api/teams", methods=["GET"])
def get_teams():
    conn = http.client.HTTPSConnection(os.getenv('RAPIDAPI_HOST'))
    conn.request("GET", "/teams", headers=headers)
    res = conn.getresponse()
    data = res.read()
    conn.close()
    return jsonify(json.loads(data.decode("utf-8")))
The thrill of seeing real NBA data flow into my app was addictive. I could fetch team rosters, check game schedules, or analyze standings with just a few lines of code. The API’s structured data made it easy to integrate into the Flask app, and the Matrix-themed front end, inspired by my wife’s Matrix comparison, added a fun, cinematic flair to the project.
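Every endpoint repeats the same connect, request, read, and close steps, so one option is to pull them into a small helper. The sketch below is one way that could look and is not what the app currently does: fetch_json and its connection_cls parameter are hypothetical names I made up, and connection_cls exists only so the helper can be tried with a stand-in connection instead of a live API call.

```python
import http.client
import json


def fetch_json(host, path, headers, connection_cls=http.client.HTTPSConnection):
    """Send a GET request and return (status_code, parsed_json).

    fetch_json and connection_cls are hypothetical names for this sketch.
    """
    conn = connection_cls(host)
    try:
        conn.request("GET", path, headers=headers)
        res = conn.getresponse()
        body = res.read()
        return res.status, json.loads(body.decode("utf-8"))
    finally:
        # Close the connection even if the request or JSON parsing fails.
        conn.close()
```

With a helper like this, a route body would shrink to calling fetch_json with the host from os.getenv('RAPIDAPI_HOST'), the endpoint path, and the headers dictionary, then passing the parsed result to jsonify.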
Why APIs Are Exciting
APIs make it possible to access vast amounts of data without needing to scrape websites manually or build complex data pipelines. The RapidAPI platform simplified the process further by providing an interface for the NBA API, complete with documentation and usage limits. The ability to query specific endpoints (e.g., games on a particular date) allowed me to tweak the app to my interests, like focusing on head-to-head matchups between two teams.
For basketball fans, working with NBA data feels like stepping into the game itself. Whether it’s checking the latest standings or revisiting past seasons, APIs bring the data to life in a way that’s both fun and practical. This is the modern way the internet works and now I see why some people find this so fun.
Using Python
Python’s simplicity and versatility were key to making this project quick and painless. From setting up the Flask web server to handling API requests, Python’s ecosystem made every step intuitive and efficient. This is probably part of why Python is #1 on the TIOBE Index.
Flask: A Lightweight Web Framework
Flask is a lightweight and flexible web framework that allowed me to set up routes and serve content with minimal boilerplate. The main page, with its Neo ASCII art and links to API endpoints, was created using Flask’s render_template_string function:
@app.route('/')
def index():
    return render_template_string(template, neo_art=neo_art)
This simplicity let me focus on the core functionality (fetching and displaying NBA data) without getting bogged down in complex configuration. The Matrix-like ASCII art and the overall homepage are just nice-to-haves and not necessary for the actual endpoints.
Python’s HTTP and JSON Handling
Python’s built-in http.client module made it easy to send HTTP requests to the RapidAPI NBA API. Combined with the json module, parsing the API responses was straightforward:
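import http.client
import json
# Route for the /api/standings endpoint (the function name is illustrative)
@app.route("/api/standings", methods=["GET"])
def get_standings():
    conn = http.client.HTTPSConnection(os.getenv('RAPIDAPI_HOST'))
    conn.request("GET", "/standings?league=standard&season=2024", headers=headers)
    res = conn.getresponse()
    data = res.read()
    conn.close()
    # Decode the raw bytes, parse the JSON, and return it to the client
    return jsonify(json.loads(data.decode("utf-8")))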
This code snippet shows how Python handles HTTP requests and JSON parsing with minimal effort, making it ideal for rapid prototyping and development.
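The query string in that request is typed out by hand, which is fine for a fixed URL but gets error-prone once values come from a user. As a small sketch, the standard library’s urllib.parse.urlencode can build it instead. The build_path function name is my own and is not part of the app.

```python
from urllib.parse import urlencode


def build_path(endpoint, **params):
    """Return an endpoint path with a URL-encoded query string.

    build_path is a hypothetical helper name used for this sketch.
    """
    if not params:
        return endpoint
    # urlencode escapes special characters and joins the pairs with "&".
    return endpoint + "?" + urlencode(params)


print(build_path("/standings", league="standard", season=2024))
# prints /standings?league=standard&season=2024
```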
The Power of Python’s Ecosystem
The python-dotenv library for managing .env files is a great example of Python’s functionality and flexibility. With just a few lines of code, I could securely load environment variables, keeping my API credentials safe from web crawlers and potential hackers. Python’s package manager, pip, also made it a little easier to manage dependencies and document them in requirements.txt:
Flask==3.0.3
python-dotenv==1.0.1
This file ensures that anyone else working on the project can install the exact dependencies with a single command: pip install -r requirements.txt.
Python’s readability and extensive libraries allow you to build the app quickly while keeping the code clean and maintainable. I learned that back in 2018, when I was first learning how all these apps connected to each other using a variable that didn’t seem to be defined anywhere. With the experience I have gained over the years I find that to be commonplace now, but it blew my mind back in the day.
Lessons Learned and Next Steps
This project was a fun refresher course that combined my Python skills with my passion for basketball.
As for next steps, I plan to enhance the app by:
- Adding dynamic query parameters for the /api/games endpoint to fetch games for any date.
- Working with a friend to build out a front end with a dropdown list for choosing two teams and comparing their head-to-head stats.
- Improving the front end with interactive elements, like tables to display standings or team logos.
- Implementing error handling for API requests to gracefully handle rate limits or invalid responses.
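To give a feel for the first and last items on that list, here is a rough sketch of the two building blocks they would need: validating a user-supplied date before it goes into the /games query, and turning an upstream response into something safe to return. None of this exists in the app yet. The function names are hypothetical, and I am assuming the upstream API signals rate limiting with the standard HTTP 429 status.

```python
import json
from datetime import datetime


def parse_game_date(raw):
    """Return the date as YYYY-MM-DD, or None if it is not a valid date."""
    try:
        return datetime.strptime(raw, "%Y-%m-%d").date().isoformat()
    except (TypeError, ValueError):
        # TypeError covers a missing value (None); ValueError covers bad text.
        return None


def interpret_response(status, body):
    """Map an upstream status and raw body to (payload, status_for_client)."""
    if status == 429:
        # Assumption: the upstream API uses 429 for rate limiting.
        return {"error": "Rate limit reached, try again later."}, 429
    if status != 200:
        return {"error": "Upstream API returned status %d." % status}, 502
    try:
        return json.loads(body.decode("utf-8")), 200
    except (UnicodeDecodeError, json.JSONDecodeError):
        return {"error": "Upstream API returned invalid JSON."}, 502
```

In a Flask route, the date could come from request.args.get("date"), and the pair returned by interpret_response maps neatly onto Flask’s ability to return a response body together with a status code.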
Conclusion
There were a few pieces of inspiration behind building the app, but a big part of it was building out a project with a friend and solving the issue I was having with finding head-to-head data. Currently, to find out whether the Miami Heat have beaten the Minnesota Timberwolves over the last 5 games they have played, you have to go to nba.com or espn.com and navigate through pages upon pages of data and information. It can get a little tricky. Finishing this app will allow us to pull that data with an API and get our answers immediately.
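To make the head-to-head idea concrete, here is a small sketch of the kind of calculation we have in mind. It is not the finished feature. The game records use a simplified, made-up shape (date, home, visitor, home_points, visitor_points) rather than the real API response format, and the sample teams and scores are invented placeholders, not real results.

```python
def head_to_head(games, team_a, team_b, last_n=5):
    """Count wins for each team over their most recent meetings."""
    meetings = [g for g in games if {g["home"], g["visitor"]} == {team_a, team_b}]
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    recent = sorted(meetings, key=lambda g: g["date"], reverse=True)[:last_n]
    wins = {team_a: 0, team_b: 0}
    for g in recent:
        if g["home_points"] > g["visitor_points"]:
            winner = g["home"]
        else:
            winner = g["visitor"]
        wins[winner] += 1
    return wins


# Made-up records for illustration only, not real results.
sample_games = [
    {"date": "2000-01-01", "home": "A", "visitor": "B", "home_points": 100, "visitor_points": 90},
    {"date": "2000-02-01", "home": "B", "visitor": "A", "home_points": 95, "visitor_points": 99},
    {"date": "2000-03-01", "home": "A", "visitor": "C", "home_points": 80, "visitor_points": 85},
]
print(head_to_head(sample_games, "A", "B"))  # prints {'A': 2, 'B': 0}
```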
To check out the Matchups App we are building, you can check here. However, if you want to grab just the backend portion and play around with APIs and Python, you can check out my git repo here.