This assignment uses data from Kaggle's Titanic competition. The file titanic.csv is in the repo, so there is no need to download the data from the Kaggle website.
Tasks:
- Read titanic.csv into a DataFrame.
- Define Pclass and Parch as the features, and Survived as the response.
- Split the data into training and testing sets.
- Fit a logistic regression model and examine the coefficients to confirm that they make intuitive sense.
- Make predictions on the testing set and calculate the accuracy.
- Bonus: Compare your testing accuracy to the "null accuracy", a term we've seen once before.
- Bonus: Add Age as a feature, and calculate the testing accuracy. There will be a small issue you'll have to deal with.
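
One possible way to approach the tasks above is sketched below, as a starting point rather than a reference solution. It assumes pandas and scikit-learn are installed. The helper name `evaluate` and the `random_state` value are made up for illustration. The sketch also assumes the "small issue" with Age is its missing values, and handles them by dropping incomplete rows, which is only one of several reasonable choices.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate(df, feature_cols, random_state=1):
    """Hypothetical helper: fit a logistic regression on feature_cols and
    return (model, testing accuracy, null accuracy)."""
    cols = list(feature_cols) + ["Survived"]
    # Assumption: rows with missing values in the selected columns are dropped.
    # This matters for Age; filling with the median would be an alternative.
    data = df[cols].dropna()

    X = data[list(feature_cols)]  # features
    y = data["Survived"]          # response

    # Split into training and testing sets (default split is 75% / 25%).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=random_state
    )

    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Predict on the testing set and compute the accuracy.
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)

    # Null accuracy: the accuracy obtained by always predicting the most
    # frequent class in the testing set.
    null_accuracy = y_test.value_counts(normalize=True).max()

    return model, accuracy, null_accuracy
```

With the real data, a call such as `evaluate(pd.read_csv("titanic.csv"), ["Pclass", "Parch"])` would cover the core tasks, with the path adjusted to wherever the file sits in the repo. Pairing `feature_cols` with `model.coef_[0]` shows the sign of each coefficient, which is what the intuition check asks for. Passing `["Pclass", "Parch", "Age"]` covers the second bonus task.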