Binary metrics (#46)

* added all metrics logging and examples, added helper for getting pickled artifact
neptune-ai · Aug 17, 2019 · ea606d7 · ea606d7
1 parent 050e5c8
commit ea606d7
Show file tree

Hide file tree

Showing 12 changed files with 1,004 additions and 630 deletions.
diff --git a/docs/_static/images/binary_metrics.gif b/docs/_static/images/binary_metrics.gif
diff --git a/docs/conf.py b/docs/conf.py
@@ -44,9 +44,9 @@
 author = 'Neptune Dev Team'
 
 # The short X.Y version
-version = '0.11'
+version = '0.12'
 # The full version, including alpha/beta/rc tags
-release = '0.11.0'
+release = '0.12.0'
 
 # -- General configuration ---------------------------------------------------
 

diff --git a/docs/examples/examples_index.rst b/docs/examples/examples_index.rst
@@ -4,7 +4,7 @@
    Code snapshoting <code_snapshots>
    Image directory snapshoting <image_dir_snapshots>
    Hyper parameter comparison <explore_hyperparams_skopt>
-   Log model diagnostics <log_model_diagnostics>
+   Log binary classification metrics <log_binary_metrics>
    Integrate with Sacred <observer_sacred>
    Monitor lightGBM training <monitor_lgbm>
    Monitor fast.ai training <monitor_fastai>

diff --git a/docs/examples/log_binary_metrics.ipynb b/docs/examples/log_binary_metrics.ipynb
@@ -0,0 +1,165 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Log binary classification metrics to Neptune\n",
+    "## Train your model and run predictions\n",
+    "Let's train a model on a synthetic problem predict on test data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.datasets import make_classification\n",
+    "from sklearn.ensemble import RandomForestClassifier\n",
+    "from sklearn.model_selection import train_test_split\n",
+    "from sklearn.metrics import classification_report\n",
+    "\n",
+    "X, y = make_classification(n_samples=2000)\n",
+    "\n",
+    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n",
+    "\n",
+    "model = RandomForestClassifier()\n",
+    "model.fit(X_train, y_train)\n",
+    "\n",
+    "y_test_pred = model.predict_proba(X_test)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Instantiate Neptune"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import neptune\n",
+    "\n",
+    "\n",
+    "neptune.init(project_qualified_name='USER_NAME/PROJECT_NAME')\n",
+    "neptune.create_experiment()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Send all binary classification metrics to Neptune\n",
+    "\n",
+    "With just one function call you can log a lot of information.\n",
+    "\n",
+    "### Class-based metrics:\n",
+    "\n",
+    "- accuracy\n",
+    "- precision, recall\n",
+    "- f1_score, f2_score\n",
+    "- matthews_corrcoef\n",
+    "- cohen_kappa\n",
+    "- true_positive_rate, true_negative_rate\n",
+    "- false_positive_rate, false_negative_rate\n",
+    "- positive_predictive_value, negative_predictive_value, false_discovery_rate\n",
+    "   \n",
+    "### Threshold-based charts for all class-based metrics\n",
+    "\n",
+    "### Performance charts:\n",
+    "\n",
+    "- Confusion Matrics\n",
+    "- Classification Report Table\n",
+    "- ROC AUC\n",
+    "- Precision Recall curve\n",
+    "- Lift curve\n",
+    "- Cumulative gain chart\n",
+    "- Kolmogorov-Smirnov statistic chart\n",
+    "   \n",
+    "### Losses:\n",
+    "\n",
+    "- log loss\n",
+    "- brier loss\n",
+    "   \n",
+    "### Other metrics:\n",
+    "\n",
+    "- ROC AUC score\n",
+    "- Average precision \n",
+    "- KS-statistic score"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from neptunecontrib.monitoring.metrics import log_binary_classification_metrics\n",
+    "\n",
+    "log_binary_classification_metrics(y_test, y_test_pred, threshold=0.5)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "It is now safely logged in Neptune.\n",
+    "Check out [this experiment](https://ui.neptune.ml/o/neptune-ml/org/binary-classification-metrics/e/BIN-101/logs). \n",
+    "\n",
+    "![binary classification metrics](../_static/images/binary_metrics.gif)\n",
+    "\n",
+    "## Log things separately\n",
+    "\n",
+    "You can also choose what to log and do it separately."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from neptunecontrib.monitoring.metrics import *\n",
+    "\n",
+    "log_confusion_matrix(y_test, y_test_pred[:, 1] > threshold)\n",
+    "log_classification_report(y_test, y_test_pred[:, 1] > threshold)\n",
+    "log_class_metrics(y_test, y_test_pred[:, 1] > threshold)\n",
+    "log_class_metrics_by_threshold(y_test, y_test_pred[:, 1])\n",
+    "log_roc_auc(y_test, y_test_pred)\n",
+    "log_precision_recall_auc(y_test, y_test_pred)\n",
+    "log_brier_loss(y_test, y_test_pred[:, 1])\n",
+    "log_log_loss(y_test, y_test_pred)\n",
+    "log_ks_statistic(y_test, y_test_pred)\n",
+    "log_cumulative_gain(y_test, y_test_pred)\n",
+    "log_lift_curve(y_test, y_test_pred)\n",
+    "log_prediction_distribution(y_test, y_test_pred[:, 1])"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "blog_metrics",
+   "language": "python",
+   "name": "blog_metrics"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}