This project shows how to deploy a distributed web scraper that collects financial data efficiently, uses a relational database for storage, and includes comprehensive monitoring.
- Distributed Systems: Develop systems using RabbitMQ and Celery for scalable web scraping.
- Docker Deployment: Use Docker for streamlined setup and deployment, monitored with Portainer.
- Database: Efficiently store and manage data using MySQL.
- Monitoring: Implement Prometheus and Grafana for large-scale data monitoring.
- Dashboard: Build Grafana dashboards for data status monitoring and anomaly detection.
Follow these steps to set up and run the distributed web scraper:
Clone the repo:
```bash
git clone https://github.com/whchien/financial-data-engine.git
```
Install the necessary dependencies:
```bash
make install-package
```
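The Makefile defines what this target actually runs; if you would rather install by hand, it presumably reduces to a standard dependency install. A minimal sketch, assuming a `requirements.txt` at the repo root:

```bash
# Assumption: the make target wraps a plain pip install of pinned dependencies.
cd financial-data-engine
pip install -r requirements.txt
```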
Initialize Docker Swarm:
```bash
make init-swarm
```
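The target name suggests it wraps Docker's built-in Swarm bootstrap, which you can also run directly:

```bash
# Puts the current Docker engine into Swarm mode as a manager node.
docker swarm init
```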
Create the Docker network for service communication:
```bash
make create-network
```
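Swarm services talk to each other over an overlay network. A hedged equivalent of this step (the network name `scraper-net` is an assumption, not necessarily what the Makefile uses):

```bash
# Assumption: network name. The overlay driver lets services on
# different Swarm nodes reach one another by service name.
docker network create --driver overlay scraper-net
```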
Deploy RabbitMQ to handle message queuing:
```bash
make deploy-rabbitmq
```
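For reference, a RabbitMQ Swarm service of this kind typically looks like the sketch below, assuming the official `rabbitmq:3-management` image and its default ports (5672 for AMQP, 15672 for the management UI); the service and network names are illustrative:

```bash
# Assumption: service name, image tag, and network are illustrative.
docker service create --name rabbitmq \
  --network scraper-net \
  -p 5672:5672 -p 15672:15672 \
  rabbitmq:3-management
```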
Deploy the MySQL service for data storage:
```bash
make deploy-mysql
```
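Likewise, the MySQL service presumably resembles the following; the credentials, database name, and image tag are placeholders to replace with your own:

```bash
# Assumption: all names and credentials below are illustrative placeholders.
docker service create --name mysql \
  --network scraper-net \
  -e MYSQL_ROOT_PASSWORD=changeme \
  -e MYSQL_DATABASE=financial_data \
  -p 3306:3306 \
  mysql:8
```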
Set up the MySQL volume for data persistence:
```bash
make create-mysql-volume
```
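Since the volume is created after the MySQL service here, this step presumably creates a named volume and attaches it to the running service; a sketch with an assumed volume name:

```bash
# Assumption: volume name "mysql-data".
docker volume create mysql-data
# Attach it at MySQL's data directory so data survives container restarts.
docker service update \
  --mount-add type=volume,source=mysql-data,target=/var/lib/mysql \
  mysql
```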
Deploy a Celery worker, for example for TWSE tasks:
```bash
make run-worker-twse
```
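Under the hood this starts a Celery worker consuming from a TWSE-specific queue. A hedged equivalent via Celery's CLI, where the application module path (`src.worker`) and queue name (`twse`) are assumptions to verify against the Makefile:

```bash
# Assumption: app module and queue name. Starts a worker that consumes
# TWSE scraping tasks from RabbitMQ.
celery -A src.worker worker --queues twse --loglevel info --concurrency 4
```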
Send a task to fetch Taiwan futures daily data:
```bash
make send-taiwan-futures-daily-task
```
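You can also enqueue a task without make by calling it by name, which publishes a message to RabbitMQ for a worker to pick up; the task name below is hypothetical:

```bash
# Assumption: registered task name. "celery call" only enqueues the task;
# a running worker (previous step) actually executes it.
celery -A src.worker call tasks.fetch_taiwan_futures_daily
```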
By following these steps, you will have a distributed scraping system that collects financial data efficiently, with RabbitMQ handling task queuing, Celery executing the tasks, and MySQL storing the results.
This project is inspired by this repo.