Classic classification by local modeling is a two-step approach for modeling:
- An unsupervised clustering algorithm is run to find regions in the dataset;
- For each region, a model is built with the respective data subset.
For inference the procedure is similar:
- A similarity metric is used to determine the new data point region, e.g. euclidian distance from regions prototypes;
- The model from that specific region is used to predict the class of the new data point.
To install dependencies run pip install -r requirements.txt
on the main directory.
- Pandas
- Numpy
- Sklearn
- Plotly
You can see the notebook here.