An end to end data pipeline project to fetch Olympic data and to get insights from it.

rashmi0007/Olympic_Data_engineering_project

Olympic_Data_engineering_project

End_to_End_Olympic_data_Engineering_usingAzure

About Project:

The Tokyo Olympic Data Engineering Project is an end-to-end data engineering solution that collects, processes, and analyzes data from the Tokyo Olympic Games. It is built on three Azure services: Azure Data Factory, Azure Databricks, and Azure Synapse Analytics.

Data: https://github.com/rashmi0007/Olympic_Data_engineering_project/tree/main/Transformed_Olympic_DataSet

Data ingestion code: https://github.com/rashmi0007/Olympic_Data_engineering_project/blob/main/data_ingestion_pipelines_datafactory.JSON

The project uses Azure Data Factory to manage and automate data integration and workflow. Data Factory extracts, transforms, and loads (ETL) data from different sources and stores it in the Data Lake. Azure Databricks then handles data processing and transformation. Databricks provides scalable, distributed processing for data manipulation, cleaning, and aggregation, and it gives data engineers and data scientists a shared environment to work in.
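To make the cleaning and aggregation step more concrete, here is a minimal sketch in plain Python. It is not the project's Databricks notebook code: the column names (`Team`, `Gold`, `Silver`, `Bronze`), the helper names, and the sample rows are all assumptions made for illustration, and the values are placeholders rather than real Olympic results. In Databricks the same idea would normally be expressed with Spark DataFrames.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the shape of a medals table.
# Team names and counts are placeholders, not real results.
RAW_CSV = """Team,Gold,Silver,Bronze
 Team A ,2,1,0
Team B,1,,3
team a,1,0,1
"""

MEDALS = ("Gold", "Silver", "Bronze")


def clean_row(row):
    """Trim whitespace, normalise team-name casing, treat blank counts as 0."""
    team = row["Team"].strip().title()
    counts = {medal: int(row[medal].strip() or 0) for medal in MEDALS}
    return team, counts


def aggregate(csv_text):
    """Sum medal counts per cleaned team name."""
    totals = defaultdict(lambda: dict.fromkeys(MEDALS, 0))
    for row in csv.DictReader(io.StringIO(csv_text)):
        team, counts = clean_row(row)
        for medal, n in counts.items():
            totals[team][medal] += n
    return dict(totals)


if __name__ == "__main__":
    for team, counts in sorted(aggregate(RAW_CSV).items()):
        print(team, counts)
```

Cleaning happens before grouping so that variants such as ` Team A ` and `team a` collapse into one key; otherwise the aggregation would split one team across several rows.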

Azure Synapse Analytics is used for data warehousing and advanced analytics. It supports storing and analyzing large volumes of structured and unstructured data.
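The kind of analytical query that could run in the warehouse layer can be sketched as follows. This example uses Python's built-in `sqlite3` module as a local stand-in for Synapse, so it runs without an Azure account. The table name `medals`, its columns, the function name, and the sample rows are assumptions for illustration only. Note that Synapse SQL is T-SQL, which uses `TOP` rather than `LIMIT`, so the query text would need adjusting there.

```python
import sqlite3

# Placeholder rows (team, gold, silver, bronze); not real results.
SAMPLE_ROWS = [
    ("Team A", 3, 1, 1),
    ("Team B", 1, 0, 3),
    ("Team C", 0, 2, 0),
]


def top_teams_by_total(rows, limit=3):
    """Load rows into an in-memory table and rank teams by total medals."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE medals "
            "(team TEXT, gold INTEGER, silver INTEGER, bronze INTEGER)"
        )
        conn.executemany("INSERT INTO medals VALUES (?, ?, ?, ?)", rows)
        query = """
            SELECT team, gold + silver + bronze AS total
            FROM medals
            ORDER BY total DESC, team ASC
            LIMIT ?
        """
        return conn.execute(query, (limit,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for team, total in top_teams_by_total(SAMPLE_ROWS, limit=2):
        print(team, total)
```

Ordering by team name as a tie-breaker keeps the ranking deterministic when two teams have the same total.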

Once the data is transformed, it can be visualized and analyzed in Tableau or Power BI.
