In this project, I'm going to explore a dataset with information from various space mission launches between
- Objectives
- Setup
- Download dataset from Kaggle
- Preparation
- EDA: exploratory data analysis
- Conclusions
The main objective is to answer the following questions:
- Who launched the most missions in any given year?
- How has the cost of a space mission varied over time?
- Which months are the most popular for launches?
- Have space missions gotten safer or has the chance of failure remained unchanged?
I use the following libraries:
- Kaggle for downloading the dataset from Kaggle
- zipfile for unzipping the dataset
- Pandas for managing the data
- Seaborn for visualizing data
- Matplotlib for additional plotting tools
- NumPy for mathematical operations
- SQLAlchemy as the database toolkit that Pandasql relies on
- Pandasql for querying DataFrames with SQL
I. Install libraries
!pip install kaggle
# zipfile is part of the Python standard library, so it needs no installation
!pip install pandas
!pip install seaborn
!pip install matplotlib
!pip install numpy
!pip install sqlalchemy
!pip install pandasql
I. To download the mission dataset, I am going to use Kaggle's API. You can also download it from this link.
!kaggle datasets download -d sefercanapaydn/mission-launches
II. Unzip dataset
import zipfile

# Extract every file in the archive into the current working directory
zipfile_name = 'mission-launches.zip'
with zipfile.ZipFile(zipfile_name, 'r') as file:
    file.extractall()
The unzipped file is named mission_launches.csv
III. Load data into a dataframe
import pandas as pd

# Read the CSV into a DataFrame and preview the first rows
mission = pd.read_csv('mission_launches.csv')
display(mission.head())
print("Rows x Columns: ", mission.shape)
- Analysis of the data types of each column and transformation to the appropriate type.
- Count columns with null values
- Drop columns that are not useful
- Statistics on numerical columns and string columns
- Count duplicated values and unique values
- Create a new column: Year
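The preparation steps above can be sketched roughly as follows. This is only an illustration on a tiny stand-in DataFrame: the column names (`Unnamed: 0`, `Organization`, `Date`, `Price`, `Mission_Status`) and the sample values are assumptions made for the example, not necessarily what mission_launches.csv contains.

```python
import pandas as pd

# Tiny stand-in for the real data; column names and values are assumptions.
mission = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "Organization": ["SpaceX", "RVSN USSR", "NASA", "NASA"],
    "Date": ["2020-08-07", "1976-03-10", "1969-07-16", "1969-07-16"],
    "Price": ["50.0", None, "1,160.0", "1,160.0"],
    "Mission_Status": ["Success", "Failure", "Success", "Success"],
})

# 1. Inspect the data types and convert columns to appropriate types.
print(mission.dtypes)
mission["Date"] = pd.to_datetime(mission["Date"])
mission["Price"] = pd.to_numeric(
    mission["Price"].str.replace(",", "", regex=False)
)

# 2. Count null values per column.
null_counts = mission.isnull().sum()

# 3. Drop columns that carry no useful information (here, a leftover index).
mission = mission.drop(columns=["Unnamed: 0"])

# 4. Summary statistics for a numerical column and for the string columns.
price_stats = mission["Price"].describe()
text_stats = mission[["Organization", "Mission_Status"]].describe()

# 5. Count duplicated rows and unique values per column.
duplicate_rows = mission.duplicated().sum()
unique_counts = mission.nunique()

# 6. Create the new Year column from the parsed date.
mission["Year"] = mission["Date"].dt.year
```

Converting `Date` before deriving `Year` keeps the year extraction to a single `.dt.year` call; if the real dates use a different text format, `pd.to_datetime` may need an explicit `format` argument.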
- Who launched the most missions in any given year?
Year | Organization | Launches |
---|---|---|
1976 | RVSN USSR | 93 |
1977 | RVSN USSR | 92 |
1971 | RVSN USSR | 90 |
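A table like the one above could be produced with a group-by along these lines. This is a minimal sketch on made-up rows; the `Year` and `Organization` column names follow the table header, and the counts here are illustrative, not the real results.

```python
import pandas as pd

# Made-up launches, one row per launch; the values are illustrative only.
launches = pd.DataFrame({
    "Year": [1976, 1976, 1976, 1976, 1977, 1977, 1977, 2020],
    "Organization": [
        "RVSN USSR", "RVSN USSR", "RVSN USSR", "NASA",
        "RVSN USSR", "RVSN USSR", "NASA",
        "SpaceX",
    ],
})

# Count launches for every (year, organization) pair.
counts = (
    launches.groupby(["Year", "Organization"])
    .size()
    .reset_index(name="Launches")
)

# For each year, keep the row of the organization with the most launches.
top_per_year = counts.loc[counts.groupby("Year")["Launches"].idxmax()]

# Show the busiest years first.
top_per_year = top_per_year.sort_values(
    "Launches", ascending=False
).reset_index(drop=True)
print(top_per_year)
```

Note that `idxmax` keeps only the first organization when two are tied within a year, so ties would need separate handling.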
- How has the cost of a space mission varied over time?
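One possible way to approach this question is to aggregate the price per year after discarding launches with no recorded price. The sketch below uses invented numbers and assumes hypothetical `Year` and `Price` columns, with the price already converted to a numeric type.

```python
import pandas as pd

# Invented prices; None marks a launch with no recorded cost.
costs = pd.DataFrame({
    "Year": [1969, 1969, 2019, 2020, 2020],
    "Price": [1160.0, None, 50.0, 50.0, 90.0],
})

# Launches without a price cannot contribute to a cost trend.
priced = costs.dropna(subset=["Price"])

# Mean and median cost per year, plus how many priced launches back each value.
cost_by_year = priced.groupby("Year")["Price"].agg(["mean", "median", "count"])
print(cost_by_year)

# With Matplotlib installed, cost_by_year["mean"].plot() would draw the trend.
```

Keeping the `count` column next to the averages helps to spot years where the figure rests on only one or two priced launches.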
- Have space missions gotten safer or has the chance of failure remained unchanged?
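A simple way to look at this is the share of non-successful missions per year. The following sketch assumes a hypothetical `Mission_Status` column whose successful launches are labeled `"Success"`; the rows and the other status labels are invented for the example.

```python
import pandas as pd

# Invented outcomes; the status labels are assumptions.
outcomes = pd.DataFrame({
    "Year": [1960, 1960, 1960, 1960, 2020, 2020, 2020, 2020],
    "Mission_Status": [
        "Success", "Failure", "Failure", "Success",
        "Success", "Success", "Success", "Partial Failure",
    ],
})

# Treat anything other than a full success as a failure.
outcomes["Failed"] = outcomes["Mission_Status"] != "Success"

# The mean of a boolean column is the fraction of True values,
# so this is the failure rate per year.
failure_rate = outcomes.groupby("Year")["Failed"].mean()
print(failure_rate)
```

A falling failure rate over the years would suggest that missions have become safer, although years with few launches can make the rate noisy.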
See attached document: Conclusions_space_missions