VERY EARLY STAGE Variable Importance Plot library
A very early stage, highly incomplete, proof-of-concept Python variable importance plotting library using single dispatch to mimic R's VIP library.
It's easy!
import numpy as np import matplotlib.pyplot as plt from sklearn import ensemble from sklearn import datasets from sklearn.utils import shuffle from sklearn.metrics import mean_squared_error import bigwig boston = datasets.load_boston() X, y = shuffle(boston.data, boston.target, random_state=13) X = X.astype(np.float32) offset = int(X.shape[0] * 0.9) X_train, y_train = X[:offset], y[:offset] X_test, y_test = X[offset:], y[offset:] params = {'n_estimators': 500, 'max_depth': 4, 'min_samples_split': 2, 'learning_rate': 0.01, 'loss': 'ls'} clf = ensemble.GradientBoostingRegressor(**params) clf.fit(X_train, y_train) # defaults: bigwig.vip(clf, boston.feature_names, relative=True, num_features=10, bar=True, horizontal=True, color="lightgrey", fill="lightgrey")
# Vertical: bigwig.vip(clf, boston.feature_names, horizontal=False)
Currently only with GB and RF regressions from Scikit Learn.
This project has been set up using PyScaffold 3.1. For details and usage information on PyScaffold see https://pyscaffold.org/.