The repository presented steps for building a model that predicted whether a customer would switch telecommunication service providers (churn). The algorithms tested were Naive Bayes (NB), Decision Tree (DT), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Two data sets were provided: training data to fit the model and test data to evaluate it. The training data comprised 4,250 rows and 20 columns (19 features plus the target variable 'churn'). Of the 4,250 samples, 3,652 (85.93%) belonged to the churn=no class and 598 (14.07%) to the churn=yes class. The test data comprised 750 rows and 20 columns: the index of each sample ('id') and the same 19 features, without the target variable 'churn'.
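
The class percentages quoted above follow directly from the class counts. As a minimal sketch (the helper name `class_distribution` is hypothetical, and a stand-in frame replaces the real training file, whose path is not given here), the imbalance could be checked like this:

```python
import pandas as pd


def class_distribution(df: pd.DataFrame, target: str = "churn") -> pd.DataFrame:
    """Return the count and percentage share of each class in `target`."""
    counts = df[target].value_counts()
    percent = (counts / len(df) * 100).round(2)
    return pd.DataFrame({"count": counts, "percent": percent})


# Stand-in for the real training file: only the target column, built from
# the class counts stated in the text (3,652 "no" and 598 "yes").
train = pd.DataFrame({"churn": ["no"] * 3652 + ["yes"] * 598})
summary = class_distribution(train)
print(summary)
```

With roughly six churn=no samples for every churn=yes sample, plain accuracy can look high even for a model that rarely predicts churn, which is worth keeping in mind when comparing algorithms.
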
The training data was used to train the model.
Features | Data type | Description |
---|---|---|
state | object | Independent attribute |
account_length | int64 | Independent attribute |
area_code | object | Independent attribute |
international_plan | object | Independent attribute |
voice_mail_plan | object | Independent attribute |
number_vmail_messages | int64 | Independent attribute |
total_day_minutes | float64 | Independent attribute |
total_day_calls | int64 | Independent attribute |
total_day_charge | float64 | Independent attribute |
total_eve_minutes | float64 | Independent attribute |
total_eve_calls | int64 | Independent attribute |
total_eve_charge | float64 | Independent attribute |
total_night_minutes | float64 | Independent attribute |
total_night_calls | int64 | Independent attribute |
total_night_charge | float64 | Independent attribute |
total_intl_minutes | float64 | Independent attribute |
total_intl_calls | int64 | Independent attribute |
total_intl_charge | float64 | Independent attribute |
number_customer_service_calls | int64 | Independent attribute |
churn | object | Dependent attribute |
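
Several columns in the table above have the `object` data type, so they need a numeric encoding before most classifiers can use them. The following is one possible sketch, not necessarily the preprocessing used in this repository: it assumes one-hot encoding for every `object` feature and a 0/1 mapping for `churn`. The helper name `prepare_features` and the sample values are made up for illustration.

```python
import pandas as pd


def prepare_features(df: pd.DataFrame, target: str = "churn"):
    """Split a frame into a numeric feature matrix X and a 0/1 target y."""
    y = (df[target] == "yes").astype(int)  # churn=yes -> 1, churn=no -> 0
    X = df.drop(columns=[target])
    # One-hot encode every object column; numeric columns pass through unchanged.
    X = pd.get_dummies(X, dtype=int)
    return X, y


# Tiny made-up sample using a few of the column names from the table.
sample = pd.DataFrame({
    "state": ["OH", "NJ", "OH"],
    "account_length": [107, 137, 84],
    "international_plan": ["no", "no", "yes"],
    "total_day_minutes": [161.6, 243.4, 299.4],
    "churn": ["no", "no", "yes"],
})
X, y = prepare_features(sample)
print(X.columns.tolist())
```
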

The test data was used to make predictions.
Features | Data type | Description |
---|---|---|
id | int64 | Index of each sample |
state | object | Independent attribute |
account_length | int64 | Independent attribute |
area_code | object | Independent attribute |
international_plan | object | Independent attribute |
voice_mail_plan | object | Independent attribute |
number_vmail_messages | int64 | Independent attribute |
total_day_minutes | float64 | Independent attribute |
total_day_calls | int64 | Independent attribute |
total_day_charge | float64 | Independent attribute |
total_eve_minutes | float64 | Independent attribute |
total_eve_calls | int64 | Independent attribute |
total_eve_charge | float64 | Independent attribute |
total_night_minutes | float64 | Independent attribute |
total_night_calls | int64 | Independent attribute |
total_night_charge | float64 | Independent attribute |
total_intl_minutes | float64 | Independent attribute |
total_intl_calls | int64 | Independent attribute |
total_intl_charge | float64 | Independent attribute |
number_customer_service_calls | int64 | Independent attribute |
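
Because the test data carries an `id` column and no `churn` column, its features have to be brought into the same layout as the training features before a fitted model can score them. A minimal sketch, assuming one-hot encoding was used for training (the helper name `align_test_features` and the sample values are hypothetical):

```python
import pandas as pd


def align_test_features(test: pd.DataFrame, train_columns, id_column: str = "id"):
    """Encode the test frame and give it exactly the training feature columns."""
    ids = test[id_column]
    X_test = pd.get_dummies(test.drop(columns=[id_column]), dtype=int)
    # Categories seen only in training become all-zero columns;
    # categories seen only in the test data are dropped.
    X_test = X_test.reindex(columns=train_columns, fill_value=0)
    return ids, X_test


# Made-up example: training saw the states NJ and OH, the test data has OH and TX.
train_columns = ["account_length", "state_NJ", "state_OH"]
test = pd.DataFrame({
    "id": [1, 2],
    "state": ["OH", "TX"],
    "account_length": [128, 118],
})
ids, X_test = align_test_features(test, train_columns)
print(X_test)
```

Keeping `ids` separate makes it easy to attach each prediction to its sample afterwards.
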
- The best-performing of the four methods was selected.
- Predictions were made on the test data.
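
The two steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the repository's actual procedure: the specific estimators (`GaussianNB`, `SVC`, default hyperparameters), the 5-fold cross-validation, the `accuracy` scoring, and the helper names `candidate_models` and `select_best` are all choices made for this example, and synthetic data stands in for the real files.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def candidate_models(random_state: int = 0) -> dict:
    """One default-configured estimator per algorithm family (NB, DT, KNN, SVM)."""
    return {
        "NB": GaussianNB(),
        "DT": DecisionTreeClassifier(random_state=random_state),
        # KNN and SVM are distance-based, so features are standardized first.
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "SVM": make_pipeline(StandardScaler(), SVC()),
    }


def select_best(X, y, scoring: str = "accuracy", cv: int = 5):
    """Cross-validate each candidate; return (best_name, fitted_model, scores)."""
    models = candidate_models()
    scores = {
        name: cross_val_score(model, X, y, cv=cv, scoring=scoring).mean()
        for name, model in models.items()
    }
    best_name = max(scores, key=scores.get)
    best_model = models[best_name].fit(X, y)  # refit on all training rows
    return best_name, best_model, scores


# Synthetic stand-in with roughly the 86/14 class imbalance of the training data.
X, y = make_classification(n_samples=400, n_features=10, weights=[0.86], random_state=0)
X_train, y_train, X_new = X[:300], y[:300], X[300:]

best_name, best_model, scores = select_best(X_train, y_train)
predictions = best_model.predict(X_new)
print(best_name, scores)
```

Given the class imbalance, passing `scoring="f1"` or `scoring="balanced_accuracy"` to `select_best` may give a more informative ranking than accuracy alone.
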
- Kostas Diamantaras. (2020). Customer Churn Prediction 2020. Kaggle. https://kaggle.com/competitions/customer-churn-prediction-2020