Predict Customer Churn

Customer churn prediction is an important factor in business success and is the focus of this project. The library covers the steps needed for this task: data pre-processing, training an ML model, predicting labels for unseen data, evaluation, and model interpretability.
This project is part of the ML DevOps Engineer Nanodegree (Udacity).

Project Description

This work uses a credit card customer dataset from Kaggle (https://www.kaggle.com/datasets/sakshigoyal7/credit-card-customers). The dataset consists of about 10,000 customers described by 21 variables, such as age, income category, marital status, gender, and credit limit. The dataset is highly imbalanced: only 16% of the customers churned.

The following table describes the customer variables:

Feature	Description
CLIENTNUM	Unique identifier for the customer holding the account
Attrition_Flag	If the account is closed then Attrited Customer else Existing Customer
Customer_Age	Customer's Age in Years
Gender	M=Male, F=Female
Dependent_count	Number of dependents
Education_Level	Educational qualification of the account holder
Marital_Status	Married, Single, Divorced, Unknown
Income_Category	Annual income category of the account holder
Card_Category	Type of Card (Blue, Silver, Gold, Platinum)
Months_on_book	Period of relationship with bank
Total_Relationship_Count	Total number of products held by the customer
Months_Inactive_12_mon	Number of months inactive in the last 12 months
Contacts_Count_12_mon	Number of contacts in the last 12 months
Credit_Limit	Credit limit on the credit card
Total_Revolving_Bal	Total revolving balance on the credit card
Avg_Open_To_Buy	Open to buy credit line (Average of last 12 months)
Total_Amt_Chng_Q4_Q1	Change in transaction amount (Q4 over Q1)
Total_Trans_Amt	Total transaction amount (Last 12 months)
Total_Trans_Ct	Total transaction count (Last 12 months)
Total_Ct_Chng_Q4_Q1	Change in transaction count (Q4 over Q1)
Avg_Utilization_Ratio	Average card utilization ratio

Note: variable descriptions were taken from https://ceur-ws.org/Vol-3026/paper17.pdf.

The Attrition_Flag variable tells us whether a customer has churned. In other words, it is the response variable to predict.
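As a rough illustration of how the response variable can be derived, the sketch below maps Attrition_Flag to a binary column and computes the churn rate. The sample rows are made up for the example, and the column name `Churn` is an assumption, not necessarily the name used in churn_library.py.

```python
import pandas as pd

# Tiny hand-made sample; the real values live in data/bank_data.csv.
df = pd.DataFrame(
    {
        "CLIENTNUM": [1, 2, 3, 4, 5],
        "Attrition_Flag": [
            "Existing Customer",
            "Attrited Customer",
            "Existing Customer",
            "Existing Customer",
            "Existing Customer",
        ],
    }
)

# "Churn" is a hypothetical column name: 1 = churned, 0 = still a customer.
df["Churn"] = (df["Attrition_Flag"] == "Attrited Customer").astype(int)

# The mean of a 0/1 column is the fraction of churned customers.
churn_rate = df["Churn"].mean()
print(f"Churn rate in the sample: {churn_rate:.0%}")
```

Checking this rate on the full dataset is a quick way to confirm the class imbalance before training.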

The library includes the following steps:
a) Import data
b) Exploratory data analysis
c) Feature engineering
d) Training models (random forest and logistic regression)
e) Evaluation report
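Steps d) and e) can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data with a class balance similar to the 16% churn rate, not the code in churn_library.py; the hyperparameters and the dictionary keys are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered features, with roughly 16% positives.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.84, 0.16], random_state=42
)

# Stratify so that train and test keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Hypothetical hyperparameters; the project keeps its own in constant.py.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Per-class precision, recall, and F1 as a plain-text report.
    reports[name] = classification_report(y_test, y_pred)
    print(name)
    print(reports[name])
```

With an imbalanced target, the per-class recall in the report is usually more informative than overall accuracy.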

Files and data description

The project has the following tree structure:

root/
    - churn_library.py
    - churn_script_logging_and_tests.py
    - constant.py
    data/
        - bank_data.csv

File	Description
churn_library.py	Main file to run the ML pipeline
churn_script_logging_and_tests.py	Perform a test run and log process for inspection
constant.py	Configuration parameters and hyperparameters
bank_data.csv	Customer churn dataset

After running churn_library.py, new directories are created to save artifacts:

root/
    images/
        eda/
            - <ARTIFACT_NAME>.png
        results/
            - <ARTIFACT_NAME>.png
    models/
        - <MODEL_NAME>.pkl
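The artifact folders listed above could be prepared with a few lines of pathlib. This is only a sketch: the helper name `ensure_artifact_dirs` is hypothetical, it assumes eda/ and results/ both sit under images/, and churn_library.py may create the folders differently.

```python
import tempfile
from pathlib import Path


def ensure_artifact_dirs(root):
    """Create the artifact folders under `root` if they do not exist yet."""
    root = Path(root)
    created = []
    # Assumed layout: images/eda, images/results, and models.
    for relative in ("images/eda", "images/results", "models"):
        target = root / relative
        # parents=True builds images/ as needed; exist_ok=True makes reruns safe.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


# Demonstrate in a throwaway directory instead of the real project root.
with tempfile.TemporaryDirectory() as tmp:
    for directory in ensure_artifact_dirs(tmp):
        print(directory.relative_to(tmp))
```

Because of `exist_ok=True`, calling the helper on every run does not fail when the folders are already present.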

To inspect the process, you can optionally make a test run with churn_script_logging_and_tests.py, which creates a log file:

root/
    logs/
        - churn_library.log

Setup

Create a conda environment:

conda create --name <ENV_NAME> python=3.6

Activate the conda environment:

conda activate <ENV_NAME>

Move to the root folder of this project.
Install requirements:

pip install -r requirements.txt

Running Files

Move to the root folder.
Run the code from the terminal:

python churn_library.py

As explained in the previous section (Files and data description), new files are created after running churn_library.py.
Note: The configuration and training parameters can be modified directly in constant.py.

Optional: If you want to inspect and log the process, run:

python churn_script_logging_and_tests.py

The logs can be reviewed in the logs/churn_library.log file.
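For readers unfamiliar with file logging, the sketch below shows one way a script can write messages to logs/churn_library.log using the standard logging module. The function name `get_file_logger`, the logger name, and the message format are assumptions for illustration; the actual setup in churn_script_logging_and_tests.py may differ.

```python
import logging
from pathlib import Path


def get_file_logger(log_path="logs/churn_library.log"):
    """Return a logger that writes INFO and above to `log_path`."""
    log_file = Path(log_path)
    # Make sure the logs/ folder exists before opening the file.
    log_file.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("churn_library")  # hypothetical logger name
    logger.setLevel(logging.INFO)

    # Attach the file handler only once, even if this function is called again.
    if not logger.handlers:
        handler = logging.FileHandler(log_file, mode="w")
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A test could then call `get_file_logger().info("import_data: SUCCESS")` after a check passes and `.error(...)` when one fails, so the log file records the outcome of each step.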
