
Amar-AIcloud/Airline_Data_Analysis

Airline_Data_Analysis

Airline Dataset Analysis using PySpark.

This project analyzes flight performance data and ranks airports using Rank. It uses departure delay data to answer the following questions:

- Determine the number of airports and trips
- Determine the longest delay in this dataset
- Determine the number of delayed vs. on-time / early flights
- Which flights departing SFO are most likely to have significant delays
- Which destinations tend to have delays
- Which destinations tend to have significant delays departing from SEA
- Airport ranking using Rank
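To make the questions above concrete, here is a minimal plain-Python sketch of three of them: counting delayed vs. on-time / early flights, finding the longest delay, and ranking airports. It is not the repository's PySpark code. The sample rows, the function names, and the rule that "delayed" means a departure delay greater than zero minutes are all assumptions made for illustration. The ranking follows SQL-style `RANK` semantics, which is what PySpark's `rank` window function provides: tied values share a rank and the next rank skips ahead.

```python
from collections import defaultdict

# Hypothetical sample rows: (origin, destination, departure delay in minutes).
# Negative values mean the flight left early. These numbers are made up
# and do not come from the Kaggle dataset.
FLIGHTS = [
    ("SEA", "SFO", 25),
    ("SEA", "DEN", -3),
    ("SEA", "SFO", 40),
    ("SFO", "JFK", 0),
    ("SFO", "ORD", 120),
    ("SFO", "JFK", 65),
    ("DEN", "SEA", 65),
    ("JFK", "SEA", 10),
]


def count_delayed_vs_on_time(flights):
    """Return (delayed, on_time_or_early). Assumption: delayed means delay > 0."""
    delayed = sum(1 for _, _, delay in flights if delay > 0)
    return delayed, len(flights) - delayed


def total_delay_by_origin(flights):
    """Sum the positive departure delays for each origin airport."""
    totals = defaultdict(int)
    for origin, _, delay in flights:
        totals[origin] += max(delay, 0)
    return dict(totals)


def rank_desc(totals):
    """Rank keys by descending value; ties share a rank, the next rank skips."""
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    ranked = []
    prev_value, prev_rank = None, 0
    for position, (key, value) in enumerate(ordered, start=1):
        rank = prev_rank if value == prev_value else position
        ranked.append((key, value, rank))
        prev_value, prev_rank = value, rank
    return ranked


if __name__ == "__main__":
    print(count_delayed_vs_on_time(FLIGHTS))
    print(max(delay for _, _, delay in FLIGHTS))
    for airport, total, rank in rank_desc(total_delay_by_origin(FLIGHTS)):
        print(rank, airport, total)
```

In this sample SEA and DEN tie on total delay, so both receive rank 2 and the next airport receives rank 4. A dense ranking would assign 3 instead.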

Data File: 2015 Flight Delays and Cancellations: https://www.kaggle.com/usdot/flight-delays

Dataset Description:

The dataset consists of 1,048,576 data points, including the following parameters:

- Flight_Number
- Destination_Delay
- Distance
- Arrival_Delay
