A cohort is a group of people who share a common characteristic over a
certain period of time. Cohort analysis is a study that focuses on the activities of a
particular cohort. It allows us to identify relationships between the characteristics of
a population and that population’s behaviour, which can be used in fields as varied as
medicine, retail, and ecommerce. It is a subset of behavioural analytics: rather than
looking at all users as one unit, it breaks them into related
groups for analysis.
In our system, we
calculate the retention rate of customers, where the retention rate is (No. of
customers in the previous month - No. of customers who cancelled in the
previous month) / No. of customers in the previous month. We use cohort
analysis to observe what happens to a group of customers who join in a
particular time period, for example a January 2015 cohort, a February 2015 cohort, etc.
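
As a minimal sketch of the retention-rate formula above, the function below takes the two counts as plain integers. The function name `retention_rate`, its parameter names, and the sample numbers are hypothetical and are not taken from this project's code.

```python
def retention_rate(customers_prev_month: int, cancelled_prev_month: int) -> float:
    """Return the fraction of the previous month's customers who did not cancel."""
    if customers_prev_month <= 0:
        raise ValueError("customers_prev_month must be positive")
    if not 0 <= cancelled_prev_month <= customers_prev_month:
        raise ValueError("cancelled_prev_month must be between 0 and customers_prev_month")
    return (customers_prev_month - cancelled_prev_month) / customers_prev_month


# Hypothetical numbers: 200 customers in the previous month, 30 of whom cancelled.
rate = retention_rate(200, 30)
print(f"{rate:.1%}")  # 85.0%
```
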
The transactional data (purchase records) must be cleaned and grouped into cohorts. Pandas is used for this purpose.
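
One way the cleaning and grouping step could look in Pandas is sketched below. The column names (`customer_id`, `purchase_date`), the sample records, and the rule that a customer counts as retained in a month if they made a purchase in it are all assumptions made for illustration; the project's actual schema and logic may differ.

```python
import pandas as pd

# Hypothetical purchase records; the column names are assumptions.
purchases = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3, 3, 1],
    "purchase_date": pd.to_datetime([
        "2015-01-05", "2015-02-10", "2015-01-20", "2015-03-02",
        "2015-02-14", "2015-03-18", "2015-03-25",
    ]),
})

# Cleaning: drop rows with missing keys and exact duplicate records.
purchases = purchases.dropna(subset=["customer_id", "purchase_date"]).drop_duplicates()

# Each customer's cohort is the month of their first purchase.
first_purchase = purchases.groupby("customer_id")["purchase_date"].transform("min")
purchases["cohort"] = first_purchase.dt.to_period("M")

# Number of whole calendar months between joining and each purchase.
purchases["period"] = (
    (purchases["purchase_date"].dt.year - first_purchase.dt.year) * 12
    + (purchases["purchase_date"].dt.month - first_purchase.dt.month)
)

# Count distinct active customers per cohort and period, one row per cohort.
cohort_counts = (
    purchases.groupby(["cohort", "period"])["customer_id"].nunique().unstack("period")
)

# Divide every period by the cohort's starting size (period 0).
retention = cohort_counts.div(cohort_counts[0], axis=0)
print(retention)
```

Each row of `retention` is a cohort and each column is the number of months since that cohort joined. Cells are `NaN` where no purchases were recorded for that cohort and period.
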
Matplotlib is used to visualize the data during the exploratory analysis process.
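
A cohort retention table is commonly drawn as a heatmap. The sketch below is one possible exploratory plot, not necessarily the one used in this project: the retention values, cohort labels, output file name, and the choice of the `RdYlGn` colormap (red for low retention, green for high) are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical retention table: rows are cohorts, columns are months since joining.
retention = np.array([
    [1.0, 0.5, 1.0],
    [1.0, 1.0, np.nan],  # NaN marks a period with no data for that cohort
])
cohort_labels = ["2015-01", "2015-02"]

fig, ax = plt.subplots(figsize=(5, 3))
# Fix the colour scale to the range 0..1 so colours are comparable across plots.
image = ax.imshow(retention, cmap="RdYlGn", vmin=0.0, vmax=1.0, aspect="auto")
ax.set_xticks(range(retention.shape[1]))
ax.set_yticks(range(len(cohort_labels)))
ax.set_yticklabels(cohort_labels)
ax.set_xlabel("Months since joining")
ax.set_ylabel("Cohort")
fig.colorbar(image, ax=ax, label="Retention rate")
fig.savefig("retention_heatmap.png", dpi=100, bbox_inches="tight")
```

For interactive exploration, drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving the figure.
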
In a business scenario, communicating the insights gained from data is as important as gaining the insights themselves. We have developed an interactive web app that visualizes the data product-wise for the required timeline.
The Bokeh Python package provides a good abstraction that enables us to build data-driven web apps with ease.
Use the following command from the root directory to launch the web app:

```
bokeh serve --show cohort
```

Colour legend: green - 100% customer retention, red - 0% customer retention.