Customer-Segmenation

Business Problem

The credit card business of a large conglomerate wants to acquire new customers from its movie customer base. The goal is to identify and target profitable movie customers who are likely to purchase a credit card.

Data

The raw data contains transactional information on VOX customers over a one-year period across 26 cinema locations.

Data Cleaning:

Removed duplicate records
Dropped columns with >90% missing values
Imputed missing values using KNN imputer
Formatted data types
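
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `clean_transactions`, the `n_neighbors` default, and the choice to impute only numeric columns are assumptions, and data type formatting is left out because the real column names are not listed here.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def clean_transactions(df: pd.DataFrame,
                       missing_threshold: float = 0.9,
                       n_neighbors: int = 5) -> pd.DataFrame:
    """Hypothetical helper mirroring the cleaning steps listed above."""
    # 1. Remove exact duplicate records.
    df = df.drop_duplicates().reset_index(drop=True)

    # 2. Drop columns whose share of missing values exceeds the threshold.
    keep = df.columns[df.isna().mean() <= missing_threshold]
    df = df[keep].copy()

    # 3. Fill the remaining gaps in numeric columns with a KNN imputer
    #    (assumption: only numeric columns are imputed this way).
    numeric_cols = df.select_dtypes(include="number").columns
    if len(numeric_cols) > 0:
        imputer = KNNImputer(n_neighbors=n_neighbors)
        df[numeric_cols] = imputer.fit_transform(df[numeric_cols])
    return df


# Toy data with made-up column names, only to show the call.
raw = pd.DataFrame({
    "spend": [10.0, 10.0, 20.0, np.nan, 40.0],
    "visits": [1.0, 1.0, 2.0, 3.0, 4.0],
    "mostly_empty": [np.nan] * 5,
})
cleaned = clean_transactions(raw, n_neighbors=2)
```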

After cleaning, the final dataset contains 22,488 customers and 53 variables describing their movie-watching behavior, spending patterns, preferred movie types, booking channels, and so on.

Analytics Approach

The project follows a standard data science lifecycle:

Data Understanding:

Created a data dictionary to document all variables
Analyzed distribution of variables, identified outliers
Calculated summary statistics on the data
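
A small sketch of how summary statistics and outlier counts might be produced with pandas. The README does not say which outlier rule was used, so the 1.5 x IQR rule below is an assumption, and the helper names and toy columns are hypothetical.

```python
import pandas as pd


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def summarize_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Summary statistics per numeric column, plus a count of flagged outliers."""
    numeric = df.select_dtypes(include="number")
    summary = numeric.describe().T
    summary["n_outliers"] = numeric.apply(lambda col: iqr_outlier_mask(col).sum())
    return summary


# Toy data with made-up column names.
toy = pd.DataFrame({
    "spend": [10, 12, 11, 13, 12, 200],
    "visits": [1, 2, 3, 4, 5, 6],
})
summary = summarize_numeric(toy)
```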

Feature Engineering:

Removed highly correlated variables using correlation matrix and VIF
Imputed missing values using KNN imputer
Generated new features like average spend per visit and customer tenure
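
The VIF-based pruning and the derived features above might look something like the sketch below. It computes VIF from the diagonal of the inverse correlation matrix, which equals 1 / (1 - R²) for each variable regressed on the others. The threshold of 5, the column names (`total_spend`, `visits`, `first_visit`, `last_visit`), and the definition of tenure as days between first and last visit are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def compute_vif(X: pd.DataFrame) -> pd.Series:
    """VIF per column, read off the diagonal of the inverse correlation matrix."""
    corr = X.corr().to_numpy()
    return pd.Series(np.diag(np.linalg.inv(corr)), index=X.columns)


def prune_by_vif(X: pd.DataFrame, threshold: float = 5.0) -> pd.DataFrame:
    """Drop the highest-VIF column repeatedly until all VIFs are <= threshold."""
    X = X.copy()
    while X.shape[1] > 1:
        vif = compute_vif(X)
        if vif.max() <= threshold:
            break
        X = X.drop(columns=vif.idxmax())
    return X


def add_derived_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add average spend per visit and tenure (hypothetical column names)."""
    out = df.copy()
    # Avoid division by zero: customers with zero visits get NaN.
    out["avg_spend_per_visit"] = out["total_spend"] / out["visits"].replace(0, np.nan)
    # Assumed definition of tenure: days between first and last visit.
    out["tenure_days"] = (out["last_visit"] - out["first_visit"]).dt.days
    return out


# Toy example: column "c" is almost a copy of "a", so one of the two is dropped.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
features = pd.DataFrame({
    "a": a,
    "b": rng.normal(size=200),
    "c": a + 0.01 * rng.normal(size=200),
})
pruned = prune_by_vif(features)

customers = pd.DataFrame({
    "total_spend": [100.0, 0.0],
    "visits": [4, 0],
    "first_visit": pd.to_datetime(["2020-01-01", "2020-03-01"]),
    "last_visit": pd.to_datetime(["2020-01-31", "2020-03-01"]),
})
enriched = add_derived_features(customers)
```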

Model Building:

Compared performance of Logistic Regression, KNN, Random Forest models
Tuned hyperparameters using RandomizedSearchCV
Evaluated models using AUC, recall, precision metrics
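
The comparison described above can be sketched with scikit-learn as follows. This is an illustration on synthetic data from `make_classification`, not the project's dataset, and the search spaces, `n_iter`, and number of folds are assumptions chosen to keep the example quick, so the resulting scores say nothing about the real results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer data (assumption).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each candidate pairs an estimator with an illustrative search space.
candidates = {
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "knn": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 11, 21]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {
            "n_estimators": [50, 100],
            "max_depth": [3, 5, None],
            "min_samples_leaf": [1, 5],
        },
    ),
}

results = {}
for name, (estimator, param_distributions) in candidates.items():
    search = RandomizedSearchCV(
        estimator,
        param_distributions,
        n_iter=4,
        scoring="roc_auc",
        cv=3,
        random_state=0,
    )
    search.fit(X_train, y_train)
    # Evaluate the refit best estimator on the held-out split.
    proba = search.predict_proba(X_test)[:, 1]
    pred = search.predict(X_test)
    results[name] = {
        "auc": roc_auc_score(y_test, proba),
        "recall": recall_score(y_test, pred),
        "precision": precision_score(y_test, pred),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Scoring the search on `roc_auc` while also reporting recall and precision keeps all three metrics visible, so a model can then be chosen on whichever one the business requirement prioritizes.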

Model Evaluation:

Logistic Regression and KNN had lower AUC scores, around 54-55%
Random Forest performed best, with an AUC of 58% and a recall of 78%
Selected Random Forest as the final model based on the business requirement of maximizing recall

Notes

The following were accomplished in this project:

Defined the contextualized problem statement, which can be found in
```
  - Customer_Segmentation-VOX.pdf
```
Understood the dataset
```
  - Dataset Summary.xlsx
```
Generated data dictionary after cleaning the dataset
```
  - Data Dictionary - After Cleaning.xlsx
```
Performed Factor Mapping and Hypothesis testing
```
  - Factor Mapping and Hypothesis.xlsx
```
Subsequently performed EDA, including univariate and bivariate analysis
```
   - Bivariate Analysis.xlsx
```

Conducted feature extraction

   - Feature_extraction_bivariates.xlsx

Checked correlation between variables
```
   - Correlation Matrix.xlsx
```
Picked out highly correlated variables
```
   - Highly_corr_cols.xlsx
```
Checked the viability of variables using VIF and selected the appropriate features
```
   - VIF_Data.xlsx & VIF_Iterations.xlsx
```
Fit different models
```
  - Modelling Iterations - Report.xlsx
```
Code used for this project
```
  - Segmentation_Code_Snippet.ipynb
```

The capstone project included presentations at different stages of the solution, divided into three reviews:

  - Review 1 - Capstone Project - NAJM - Final Deck.pdf
  - Review 2 - Capstone Project - NAJM - Final Deck.pdf
  - Review 3 - Capstone Project - NAJM - Final Deck.pdf

Wrote a paper and submitted it to the university for the graded final-year practicum
```
  - Customer_Segmentation_PAPER.pdf 
```

