This project is an ETL (Extract, Transform, Load) pipeline that collects and processes data on top European football matches from leagues such as the English Premier League (EPL), Serie A, and La Liga, among others. It culminates in a web application that lets users search for and rank these matches.
The dataset is collected from Kaggle.
The primary objective of this project is to build an ETL pipeline that extracts data from various sources, transforms it into a structured format, and loads it into a database. This data is then used to power a web application that provides users with search capabilities and rankings of top football matches in Europe.
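The extract, transform, load, and search/rank flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: it uses an in-memory SQLite database as a stand-in for the project's database, the column names (`league`, `home_team`, `total_goals`, and so on) and the helper `top_matches` are hypothetical, ranking by total goals is just one possible criterion, and the sample rows are placeholders rather than data from the Kaggle dataset.

```python
import csv
import io
import sqlite3

# Placeholder rows for illustration only; not taken from the Kaggle dataset.
SAMPLE_CSV = """league,home_team,away_team,home_goals,away_goals
League X,Team A,Team B,3,2
League X,Team C,Team D,0,0
League Y,Team E,Team F,1,2
"""


def extract(source):
    """Extract: read raw rows from a CSV file-like object."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: cast goal counts to int and derive total goals per match."""
    records = []
    for row in rows:
        home = int(row["home_goals"])
        away = int(row["away_goals"])
        records.append(
            (row["league"], row["home_team"], row["away_team"], home, away, home + away)
        )
    return records


def load(conn, records):
    """Load: create the (hypothetical) matches table and insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS matches ("
        "league TEXT, home_team TEXT, away_team TEXT, "
        "home_goals INTEGER, away_goals INTEGER, total_goals INTEGER)"
    )
    conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?, ?, ?)", records)
    conn.commit()


def top_matches(conn, league=None, limit=10):
    """Search by league (optional) and rank matches by total goals, highest first."""
    sql = "SELECT home_team, away_team, total_goals FROM matches"
    params = []
    if league is not None:
        sql += " WHERE league = ?"
        params.append(league)
    sql += " ORDER BY total_goals DESC LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


# In-memory SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(io.StringIO(SAMPLE_CSV))))
```

Calling `top_matches(conn, league="League X")` would return the matches in that league ordered by total goals. In the real pipeline the extract step would read the Kaggle CSV files and the load step would target the database set up in the steps below.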
- Docker Desktop
- DBeaver or any other DB client
- If using Windows, install Linux on Windows with WSL
Clone the repo: git clone https://github.com/trungbac11/football-etl-pipeline.git
#build and start the Docker containers
make build
make up
#copy data from local to docker
docker cp football/ de_mysql:/tmp/
#open a MySQL root shell and enable LOCAL_INFILE so local CSV files can be loaded
make to_mysql_root
SHOW GLOBAL VARIABLES LIKE 'LOCAL_INFILE';
SET GLOBAL LOCAL_INFILE=TRUE;
exit
#create tables with schema
make mysql_create
#load csv into created tables
make mysql_load
#create tables with schema in PostgreSQL
make psql_create