How to get excellent football data for free with API Football

Jordi Lucas
Published in Geek Culture
5 min read · Jul 30, 2022

Photo by Mitch Rosen on Unsplash

Do you like to play around with real, reliable football stats about your favourite football players, clubs or leagues? Do you want to get more statistics about football players in your Football Manager Championship and impress your colleagues? Or would you like to create your own football dataset through an API and get real data from a reliable source?

Maybe you are a football fan who likes to compare the accurate passes of players in the Eredivisie (Netherlands) with those of Turkish SuperLeague players. Or you might be a passionate scout who needs to find young midfielders with the best shooting percentage in the Danish SuperLeague.

Whatever your reason for collecting football data, in this article you’ll learn how to quickly collect excellent football data, for free.

Where can I find reliable football data for free?

Photo by Mika Baumeister on Unsplash

FBref, FiveThirtyEight, WhoScored, or Club ELO are some of the websites where you can get reliable football data for free.

You can scrape data from these websites or use an API that offers that data. The API Football project is an excellent tool that helps you quickly create a solid football dataset. API Football is based on a freemium business model. This means you can use it for free with a basic plan, and if you need more features later, you can always upgrade to a paid plan.

What technology do I use to consume the API?

Photo by Hitesh Choudhary on Unsplash

Python is definitely my favourite programming language right now. In recent years I have used Python for almost all my projects because of its power and versatility. With the pandas, requests and json packages you’ll have most of the tools you need to develop a simple script that collects data from the API.

Initial set up

First of all, an API key is mandatory to authenticate you as the application developer. To get one, you first have to sign up on the RapidAPI platform. After that, subscribe to the Basic Plan of the API Football data application. At the time of writing, the Basic Plan gives you 30 requests per minute and a maximum of 100 requests a day. That is enough to build datasets for at least two leagues every day, full of valuable information, for free.

Once you have the API key, you can connect to the platform. The players endpoint responds with all the player data for the selected league. I recommend checking the documentation to explore the other options.

Players endpoint (see the API Football documentation)
Getting an API Football response
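
As a rough illustration of the authenticated call described above, here is a minimal sketch. It uses urllib from the standard library so it runs without extra packages; with requests, the same headers and query parameters would go into requests.get(url, headers=..., params=...). The host name, the /v3/players path, the header names and the league, season and page parameters are assumptions based on how RapidAPI services are usually exposed, so check them against the API Football documentation. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed RapidAPI host and path for API Football; verify both in the documentation.
API_HOST = "api-football-v1.p.rapidapi.com"
PLAYERS_URL = f"https://{API_HOST}/v3/players"


def build_players_request(api_key, league_id, season, page=1):
    """Build (without sending) a GET request for one page of players."""
    query = urlencode({"league": league_id, "season": season, "page": page})
    headers = {
        "X-RapidAPI-Key": api_key,    # the key from your RapidAPI account
        "X-RapidAPI-Host": API_HOST,  # assumed header pair used by RapidAPI
    }
    return Request(f"{PLAYERS_URL}?{query}", headers=headers, method="GET")


def fetch_players_page(api_key, league_id, season, page=1, opener=urlopen):
    """Send the request and return the decoded JSON body as a dict."""
    request = build_players_request(api_key, league_id, season, page)
    with opener(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_players_page("YOUR_API_KEY", league_id, season) would spend one of the daily requests, so it is worth building and inspecting the request first with build_players_request.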

Getting data from your favourite league

Photo by Tobias on Unsplash

Once you are authenticated, you’ll need to set a league identifier as a query parameter to get the players from that championship. You’ll find some example identifiers in my GitHub code.

You need to apply some feature engineering to transform a couple of columns: weight and height. Their values come as text with a unit suffix (kg and cm), so I used regular expressions to extract the numeric part as a float and make it possible to do operations with them.
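
The extraction can be sketched for single values with the standard re module. The sample strings are illustrative, not real API output, and extract_number is a hypothetical helper. On a pandas column, the same pattern can be passed to Series.str.extract, for example df["height"].str.extract(r"(\d+(?:\.\d+)?)", expand=False).astype(float), assuming the column is called height.

```python
import re

# One capture group: the first integer or decimal number found in the text.
NUMBER_PATTERN = re.compile(r"(\d+(?:\.\d+)?)")


def extract_number(text):
    """Return the first number in strings such as '180 cm', or None if absent."""
    if text is None:
        return None
    match = NUMBER_PATTERN.search(text)
    return float(match.group(1)) if match else None


heights = ["180 cm", "175 cm", None]   # illustrative values only
weights = ["75 kg", "68.5 kg", ""]

print([extract_number(h) for h in heights])   # [180.0, 175.0, None]
print([extract_number(w) for w in weights])   # [75.0, 68.5, None]
```

Missing or empty values come back as None here; with pandas they would become NaN, which keeps the column numeric.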

Using regex with the str.extract() function

On the Basic plan it is important to keep track of the requests per minute and make the script sleep between calls.
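
One simple way to stay under a per-minute limit is to space the calls evenly. The sketch below assumes the 30 requests per minute quoted for the Basic plan, which gives at least 2 seconds between calls. The Throttle class is a hypothetical helper, not part of any library, and it does not track the daily quota.

```python
import time

REQUESTS_PER_MINUTE = 30                    # Basic plan limit quoted in this article
MIN_INTERVAL = 60.0 / REQUESTS_PER_MINUTE   # 2 seconds between calls


class Throttle:
    """Keep consecutive calls at least min_interval seconds apart."""

    def __init__(self, min_interval=MIN_INTERVAL, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock          # injectable so the logic can be tested
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep only for the time still missing since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Create one Throttle and call its wait() method right before every request. A separate counter is still needed to stop before the 100 requests per day are used up.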

Creating the dataset

Photo by Mike Petrucci on Unsplash

When you get data from API Football (and from almost every REST API), you have to parse it from JSON format. Something like this:

JSON format response example

Our dataset has 38 columns. The goal is to get all this data from the JSON response of the API Football players endpoint, so you have to parse the JSON document with Python’s built-in json package.

In this case, I use a for loop over the n results returned in the response object, where each one is a player record. That piece of code shows some of the players’ data.

Getting data from JSON response.
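
To make the parsing loop concrete, here is a small sketch that reads a trimmed, hypothetical payload. The field names (response, player, statistics and so on) are assumptions about the players endpoint and every value is a placeholder, so compare them with a real response before relying on them. Only a handful of the 38 columns are shown, and parse_players is a hypothetical name.

```python
import json

# Hypothetical, heavily trimmed payload: assumed field names, placeholder values.
raw_body = """
{
  "results": 2,
  "response": [
    {"player": {"id": 1, "name": "Player A", "age": 24,
                "height": "180 cm", "weight": "75 kg"},
     "statistics": [{"team": {"name": "Team X"},
                     "games": {"position": "Midfielder"}}]},
    {"player": {"id": 2, "name": "Player B", "age": 31,
                "height": null, "weight": null},
     "statistics": []}
  ]
}
"""


def parse_players(body):
    """Turn the JSON text into a list of flat dicts, one per player record."""
    payload = json.loads(body)
    rows = []
    for item in payload.get("response", []):
        player = item.get("player", {})
        # Use the first statistics entry if there is one, else an empty dict.
        stats = (item.get("statistics") or [{}])[0]
        rows.append({
            "id": player.get("id"),
            "name": player.get("name"),
            "age": player.get("age"),
            "height": player.get("height"),
            "weight": player.get("weight"),
            "team": (stats.get("team") or {}).get("name"),
            "position": (stats.get("games") or {}).get("position"),
        })
    return rows


players = parse_players(raw_body)
print(len(players))          # 2
print(players[0]["team"])    # Team X
```

Using .get() with defaults keeps the loop from failing when a player has no statistics or a field is missing.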

Once all the data is in memory, we need to save it in CSV format to prepare the analysis. You could create a relational database and store it there, but in this case CSV files make it easy to import the datasets and do EDA later in Google Colab or Jupyter Notebooks.

You’ll find all three steps in the get_api_data function.
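
As a minimal sketch of the saving step, the standard csv module can write a list of player dictionaries to disk. The function name and the idea of taking the column names from the first row are assumptions made for illustration. With pandas, pd.DataFrame(rows).to_csv(path, index=False) does the same job.

```python
import csv


def save_players_csv(rows, path):
    """Write a list of player dicts to a CSV file with a header row."""
    if not rows:
        raise ValueError("no player rows to save")
    # Assumes every row has the same keys as the first one.
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

The resulting file can be loaded later in a notebook with pd.read_csv(path).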

Get an API Key, get the data and save it.

Conclusions

Now you know how to get real, up-to-date football data from the API Football platform, transform some of its attributes and create a dataset with all the players of your favourite league.

Bundesliga dataset

From now on, you can do a lot with all this data: Exploratory Data Analysis, supervised and unsupervised machine learning models (linear regression, classification, etc.) or another API with more in-depth stats.

To view and clone the updated code, go to the api-football-data project in my GitHub repository. I’d really appreciate a clap if you found the article interesting.

*Thanks to Bianca Hofman for editing my story before publishing.
