This project is an end-to-end streaming data pipeline for e-commerce data.
The tools used in this project are:
- FastAPI
- Apache Kafka
- Apache Spark
- MongoDB
- Streamlit
- Docker

To run the pipeline:
- Download the data from https://www.kaggle.com/datasets/carrie1/ecommerce-data
- Start the Docker services: docker-compose up
- Open JupyterLab at http://localhost:8888/ and run the notebook
- Run the client: python3 client.py
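
For orientation, here is a minimal sketch of what a client in a pipeline like this might do: read the downloaded CSV row by row and POST each row as JSON to the FastAPI service. This is not the project's actual client.py. The endpoint URL, the file name data.csv, the ISO-8859-1 encoding, and the function names are all assumptions made for illustration.

```python
import csv
import json
import urllib.request

# Hypothetical endpoint: the real host, port, and path are set by the
# project's FastAPI service and docker-compose file.
API_URL = "http://localhost:8000/invoiceitem"


def rows_to_json(lines):
    """Yield one JSON string per CSV row, keyed by the header names."""
    for row in csv.DictReader(lines):
        yield json.dumps(row)


def post_json(url, payload):
    """POST a JSON string and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def send_file(path, url=API_URL):
    """Stream every row of the CSV at `path` to the API, one request per row."""
    # File encoding is an assumption; adjust it to match the downloaded file.
    with open(path, newline="", encoding="ISO-8859-1") as f:
        for payload in rows_to_json(f):
            print(post_json(url, payload))
```

With the Docker services running, calling send_file("data.csv") would send the rows one at a time. All values arrive as strings, so any type conversion would have to happen in the API or in the Spark job.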