This repository contains my materials and projects from the ZoomCamp Data Engineering course, where I learned essential data engineering skills and worked hands-on with the tools used to build and manage data pipelines.
The course was divided into 9 weeks, covering a wide range of topics and tools related to data engineering:
- Introduction & Prerequisites: Setup of GCP, Docker, and Terraform
- Workflow Orchestration: Data Lake, Prefect, and ETL with GCP
- Data Warehouse: BigQuery, Partitioning, Clustering, and BigQuery ML (see the sketch after this list)
- Analytics Engineering: dbt, BigQuery, Postgres, Data Studio, and Metabase
- Batch Processing: Apache Spark, DataFrames, and Spark SQL
- Streaming: Apache Kafka, Avro, Kafka Streams, Kafka Connect, and KSQL
- Project: Applying the concepts and tools from the course to a real-world use case
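
To give a flavor of the Data Warehouse week, here is a minimal sketch of creating a partitioned and clustered table with the official `google-cloud-bigquery` Python client. The project, dataset, and column names are hypothetical stand-ins, not the course's actual tables:

```python
# Minimal sketch of BigQuery partitioning + clustering.
# "my-project.real_estate.listings" and the columns below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

table = bigquery.Table("my-project.real_estate.listings")
table.schema = [
    bigquery.SchemaField("listing_date", "DATE"),
    bigquery.SchemaField("city", "STRING"),
    bigquery.SchemaField("price", "NUMERIC"),
]

# Partition by a date column: queries filtering on listing_date
# only scan the matching daily partitions.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="listing_date",
)

# Cluster by city: rows with the same city are stored together,
# so filters on that column scan fewer bytes.
table.clustering_fields = ["city"]

client.create_table(table, exists_ok=True)
```

Partitioning and clustering both reduce the bytes BigQuery scans per query, which directly lowers cost on large tables.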
I developed a real estate data pipeline that sources data from Idealista, a leading Spanish real estate portal. The pipeline extracts data from the portal, transforms it, and loads it into Google Cloud Storage (GCS) before finally storing it in BigQuery. The whole process runs daily as an automated workflow orchestrated with Prefect. To further refine and analyze the data, I used dbt for the analytics engineering layer, and the transformed data is visualized with Looker, surfacing insights into real estate market trends and dynamics. Check my repo here.
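
The flow below is a minimal, self-contained sketch of that pipeline's shape, assuming Prefect 2.x. The Idealista extraction, the GCS upload, and the BigQuery load are stubbed out with hypothetical placeholders; see the linked repo for the real implementation:

```python
# Simplified sketch of the daily ETL flow (Prefect 2.x assumed).
# The extraction and load steps are placeholders, not the real logic.
from pathlib import Path

import pandas as pd
from prefect import flow, task


@task(retries=3)
def extract() -> pd.DataFrame:
    # Placeholder: the real task pulls listings from the Idealista portal.
    return pd.DataFrame({"listing_id": [1], "price": [250_000]})


@task
def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Placeholder cleaning step, e.g. dropping rows without a price.
    return df.dropna(subset=["price"])


@task
def load(df: pd.DataFrame) -> Path:
    # In the real pipeline this writes Parquet to a GCS bucket and a
    # follow-up step loads it into BigQuery; here we write locally to
    # keep the sketch self-contained.
    path = Path("listings.parquet")
    df.to_parquet(path)
    return path


@flow(log_prints=True)
def idealista_etl() -> None:
    df = transform(extract())
    path = load(df)
    print(f"Wrote {path}")


if __name__ == "__main__":
    idealista_etl()
```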
Throughout the course, I gained hands-on experience with the following technologies and tools:
- Google Cloud Platform (GCP)
- Google Cloud Storage (GCS)
- BigQuery
- Terraform
- Docker
- SQL
- Prefect
- dbt
- Apache Spark
- Apache Kafka