docs: expand documentation #150

Merged 19 commits on Jul 30, 2024
19 changes: 12 additions & 7 deletions docs/docs/index.md
@@ -12,17 +12,22 @@ This platform provides a comprehensive solution for monitoring and observing your AI models in production.
While models often perform well during development and validation, their effectiveness can degrade over time in production due to various factors like data shifts or concept drift. The Radicalbit AI Monitor platform helps you proactively identify and address potential performance issues.

### Key Functionalities
The platform provides comprehensive monitoring capabilities to ensure optimal performance of your AI models in production. It analyses both your reference dataset (used for pre-production validation) and the current datasets in use, allowing you to keep the following under control:
* **Data Quality:** evaluate the quality of your data, as high-quality data is crucial for maintaining optimal model performance. The platform analyses both numerical and categorical features in your dataset to provide insights into:
* *data distribution*
* *missing values*
* *target variable distribution* (for supervised learning).


* **Model Quality Monitoring:** the platform provides a comprehensive suite of metrics, currently designed for classification and regression models (a worked sketch follows this list). \
For classification, these metrics include:
* *Accuracy, Precision, Recall, and F1:* These metrics provide different perspectives on how well your model is classifying positive and negative cases.
* *False/True Negative/Positive Rates and Confusion Matrix:* These offer a detailed breakdown of your model's classification performance, including the number of correctly and incorrectly classified instances.
* *AUC-ROC and PR AUC:* These are performance curves that help visualize your model's ability to discriminate between positive and negative classes.

For regression, these metrics include:
* *Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R²:* These metrics provide different perspectives on how well your model is predicting a numerical value.
* *Residual Analysis:* This offers a detailed breakdown of your model's performance, comparing predictions with ground truth and predictions with residuals, i.e. the difference between predictions and ground truth.
* **Model Drift Detection:** analyse model drift, which occurs when the underlying data distribution changes over time and can affect model performance.
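
As a point of reference, here is a minimal scikit-learn sketch computing some of the classification metrics listed above on a toy set of predictions; it illustrates the metrics themselves, not the platform's API.

```python
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, roc_auc_score,
)

# Toy ground truth, hard predictions, and predicted probabilities
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]
y_prob = [0.2, 0.9, 0.4, 0.1, 0.8, 0.6, 0.7, 0.95]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("roc auc  :", roc_auc_score(y_true, y_prob))
```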

### Current Scope and Future Plans
This version focuses on classification, both binary and multiclass, and regression models. Support for additional model types is planned for future releases.
121 changes: 59 additions & 62 deletions docs/docs/user-guide/quickstart.md → docs/docs/l1_quickstart.md

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions docs/docs/l2_user_guide/_category_.json
@@ -0,0 +1,8 @@
{
"label": "User Guide",
"position": 1,
"link": {
"type": "generated-index",
"description": "Welcome to the «radicalbit-ai-monitoring» user guide. This document is designed to help you get started with our platform, understand its core concepts and make the most out of its features. Whether you are a new user or an experienced professional, this guide will provide you with the necessary information to effectively monitor and manage your AI systems."
}
}
43 changes: 43 additions & 0 deletions docs/docs/l2_user_guide/how_to.md
@@ -0,0 +1,43 @@
---
sidebar_position: 3
---

# Hands-On Guide

In this guide we focus on how to use the GUI.

If you prefer to do everything using our SDK please refer to our [SDK Quickstarts](https://github.com/radicalbit/radicalbit-ai-monitoring/tree/main/docs/quickstarts).

## How to create a model

* From the Model main page ![Alt text](/img/how_to/new_model_step1.png "New model step 1"), click on the plus sign in the top right corner.

* Fill in the name of the model, the model type (at the moment only `Binary Classification`, `Multiclass Classification` and `Regression` are available), the time granularity on which aggregations are computed (`Hour`, `Day`, `Week` or `Month`) and, optionally, the Framework (e.g. scikit-learn) and the Algorithm (e.g. KNeighborsClassifier); then click `Next`
![Alt text](/img/how_to/new_model_step2.png "New model step 2")

* Upload a CSV file containing *all features*, the *prediction*, optionally the *prediction probability*, and the *ground truth* (i.e. the correct value that the model should predict for each set of features), plus a *timestamp* (used as a unique identifier for each row); a sample is shown after this list. At this stage a very small CSV file, about 10 rows, suffices, since it is used only to create the schema and the input and output signatures for the model. If you prefer to upload your whole reference dataset, that is perfectly fine too.

* Since your CSV file can contain fields you are not interested in monitoring, e.g. some custom UUID, you can choose the fields to carry forward ![Alt text](/img/how_to/new_model_step3.png "New model step 3")

* Next choose your `Target`, `Prediction`, `Timestamp` and, if present, `Prediction Probability` fields.
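
For illustration, a minimal CSV with hypothetical column names (your features will differ) might look like this:

```csv
timestamp,age,gender,income,prediction,prediction_probability,ground_truth
2024-07-01T10:00:00Z,34,0,52000,1,0.87,1
2024-07-01T10:05:00Z,27,1,31000,0,0.22,0
2024-07-01T10:10:00Z,45,1,78000,1,0.64,0
```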

## Change field type

In your CSV file there might be some numerical variables which are actually categorical: for instance, you might have a feature `Gender` with values `{0,1}`. We automatically infer it as an integer variable, but it makes no sense to compute numerical statistics on it, since it is really the representation of a categorical feature. \
Hence, **as long as no reference dataset has been loaded yet**, in the `Overview` section, `Variables` tab, you are allowed to change the field type of any numerical feature to categorical.
![Alt text](/img/how_to/change_field_type.png "Change field type")

Please note that as soon as a reference dataset is loaded into the platform **this option is no longer available**, because the platform immediately starts computing statistics and metrics on the variables according to their type.

## Load a reference dataset

Go to the `Reference` entry of the vertical navigation bar ![Alt text](/img/how_to/reference.png "Import Reference"), click `Import Reference` and choose the right CSV file.

## Load a Current Dataset

* If no current dataset has been published yet, just go to the `Current` entry of the vertical navigation bar ![Alt text](/img/how_to/first_current.png "Import First Current"), click `Import Current` and choose the right CSV file.

* If some current datasets have already been imported and you want to add an extra one, go to the `Current` entry of the vertical navigation bar, then open the `Import` tab ![Alt text](/img/how_to/more_current.png "Import More Current"), click `Import Current` and choose the right CSV file.

31 changes: 31 additions & 0 deletions docs/docs/l2_user_guide/user-guide-installation.md
@@ -0,0 +1,31 @@
---
sidebar_position: 1
---

# Installation

To install the platform you can choose from two different approaches.

* **Using the main repository:** Clone [the main repository](https://github.com/radicalbit/radicalbit-ai-monitoring) and activate the Docker daemon. Finally, run the following command:
```bash
docker compose --profile ui --profile init-data up
```
See [README file](https://github.com/radicalbit/radicalbit-ai-monitoring/blob/main/README.md) for further information and details.

* **Using the Python Installer:** Install the [Python installer](https://pypi.org/project/radicalbit-ai-monitoring/) via `poetry` or `pip`.

* With poetry:
1. Clone the repository using `git clone https://github.com/radicalbit/radicalbit-ai-monitoring-installer.git`.
2. Move inside the repository using `cd radicalbit-ai-monitoring-installer`.
3. Install the dependencies using `poetry install`.

* With pip: Just run `pip install radicalbit-ai-monitoring`.

Once you have installed the Python package, activate the Docker daemon and run the following commands:

```bash
rbit-ai-monitoring platform install
rbit-ai-monitoring platform up
```

After all the containers are up and running, you can go to [http://localhost:5173](http://localhost:5173) and play with the platform.
106 changes: 106 additions & 0 deletions docs/docs/l2_user_guide/user-guide-keyconcepts.md
@@ -0,0 +1,106 @@
---
sidebar_position: 2
---

# Key Concepts

This section introduces the fundamental concepts and terminologies used within the `radicalbit-ai-monitoring` Platform. Understanding these concepts is crucial for the effective utilization of the platform.

## Model Type

The radicalbit-ai-monitoring platform supports various types of models, each suited for different types of tasks:

- **Binary Classification**: Models that categorize data into one of two possible classes (e.g., spam or not spam).
- **Multiclass Classification**: Models that categorize data into one of three or more possible classes (e.g., type of fruit: apple, orange, pear).
- **Regression**: Models that predict a continuous value (e.g., predicting house prices based on various features).

According to the `Model Type`, the platform computes specific metrics to evaluate performance over time.

## Data Type

The platform can handle different types of data, which determine the kind of analysis performed to evaluate the consistency of the information:

* **Tabular**: Data is organized into a table and saved in CSV format.
* **Text**: Not available yet.
* **Image**: Not available yet.

## Reference Dataset

The reference dataset is a static dataset used as a benchmark for comparison. It represents the ideal or expected data distribution and quality, against which the current dataset's performance and quality are evaluated. This dataset is typically:

- **Historical Data**: Derived from historical data that the model was trained on or validated against. It serves as a baseline to compare new incoming data.
- **Preprocessed**: Cleaned and preprocessed to ensure it represents the best possible version of the data, free from anomalies or errors.
- **Comprehensive**: Should cover all possible scenarios and variations that the model is expected to handle, ensuring it is robust and reliable.
- **Static**: Unlike the current dataset, the reference dataset remains unchanged over time to provide a consistent benchmark for monitoring purposes.

> **_TIP:_** A good example of a reference dataset is the training set.

Using the reference dataset, the platform can:

* **Detect Data Drift**: By comparing the current dataset to the reference dataset, the platform can identify significant deviations in data patterns.
* **Evaluate Model Performance**: The reference dataset provides a baseline for assessing whether the model's performance on new data aligns with its performance on known, reliable data.
* **Ensure Data Quality**: Regularly comparing the current dataset to the reference dataset helps maintain high data quality standards by highlighting inconsistencies and anomalies.

By maintaining a high-quality reference dataset, the `radicalbit-ai-monitoring` platform ensures that any changes in data or model performance can be promptly identified and addressed.

## Current Dataset

The current dataset is the most recent data being fed into the model for predictions. It should be continuously monitored to ensure consistency with the reference dataset and to detect any anomalies or changes. To achieve this, the current dataset must have the same schema as the reference.
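
As an illustration of this schema constraint (a minimal pandas sketch, not the platform's internal check; file names are hypothetical):

```python
import pandas as pd

reference = pd.read_csv("reference.csv")
current = pd.read_csv("current.csv")

# The current dataset must expose the same columns with the same types
same_columns = list(reference.columns) == list(current.columns)
same_dtypes = reference.dtypes.equals(current.dtypes)

if not (same_columns and same_dtypes):
    raise ValueError("current dataset schema does not match the reference")
```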

Using the current dataset, the platform can:

- **Monitor Performance Metrics**: Continuously assess how well the model is performing on new data by tracking key metrics such as accuracy, precision, recall and others.
- **Detect Drifts**: Identify unusual patterns or anomalies in the data that may indicate issues with data collection processes or changes in underlying data distributions.
- **Adapt to Changes**: Provide insights into when the model may need retraining or adjustment due to shifts in the data, known as data drift.
- **Ensure Timeliness**: By constantly updating and analyzing the current dataset, the platform ensures that the model's predictions are based on the most up-to-date information available.

By effectively managing and monitoring the current dataset, the `radicalbit-ai-monitoring` platform helps maintain the reliability and accuracy of models in a changing environment.

## Data Quality

Data quality refers to the accuracy, completeness, and reliability of the data used by the models. High data quality is essential for the performance and reliability of the models. The platform monitors various data quality indicators and charts to ensure its integrity:

- **Descriptive Statistics**: Metrics such as mean, median, standard deviation and range are computed to summarize the central tendency and dispersion of the data.
- **Histograms**: Visual representations to identify distributions, outliers, and any potential anomalies in the data.
- **Frequency Distribution**: Charts such as bar plots are used to display the distribution of categories and highlight any imbalances or anomalies.
- **Detect Data Quality Issues**: Identify inaccuracies, inconsistencies, missing values, and outliers in the data.
- **Monitor Changes Over Time**: Track data quality metrics over time to detect any degradation or improvement in data quality.

By incorporating detailed charts and statistics for both numerical and categorical data, the `radicalbit-ai-monitoring` platform ensures comprehensive monitoring and maintenance of data quality, crucial for the robust performance of the models.
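
As a rough illustration of these indicators, computed here with pandas outside the platform (file and column names are hypothetical):

```python
import pandas as pd

df = pd.read_csv("current.csv")

# Descriptive statistics: mean, std, min/max, quartiles for numerical features
print(df.describe())

# Missing values per column
print(df.isna().sum())

# Frequency distribution of a categorical feature
print(df["gender"].value_counts(normalize=True))
```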

## Model Quality

Model quality is a measure of how well a model performs its task. It includes classic metrics such as accuracy, precision, recall and F1 score. The platform evaluates these metrics to ensure the model maintains high performance over time. Naturally, to compute them, the user has to include the ground truth in the dataset.

The model quality metrics depend on the chosen `Model Type`:

- **Binary Classification**: Accuracy, Precision, Recall, F1 score, Confusion Matrix
- **Multiclass Classification**: Accuracy, Precision, Recall, F1 score, Confusion Matrix
- **Regression**: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R², Residual Analysis
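
For the regression case, a minimal scikit-learn sketch of these metrics, with residuals taken as the difference between predictions and ground truth, might look like this:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy ground truth and predictions
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.1, 7.8])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_true, y_pred)
residuals = y_pred - y_true  # inputs to a residual analysis

print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
print("residuals:", residuals)
```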

The platform provides detailed visualizations and reports for these metrics, allowing users to:

- **Monitor Performance Trends**: Track changes in model performance over time to ensure the model remains effective.
- **Identify Weaknesses**: Pinpoint specific areas where the model may be underperforming, such as particular classes in a classification model or high-error regions in a regression model.
- **Compare Models**: Evaluate and compare the performance of different models or model versions, aiding in model selection and improvement.

By highlighting the differences in evaluation criteria and metrics for various model types, the `radicalbit-ai-monitoring` platform ensures that users can effectively assess and maintain the quality of their models.

## Data Drift

Data drift occurs when the statistical properties of the current dataset differ significantly from the reference dataset. This can affect model performance. The platform monitors for data drift to alert users of potential issues that may require model retraining or adjustment.

To detect data drift, the platform uses several statistical tests and metrics tailored to the type of data and model:

- **Chi-square Test**: Used primarily for categorical data, this test evaluates whether the distribution of categories in the current dataset significantly differs from the reference dataset. It compares the observed frequencies of categories in the current dataset against the expected frequencies derived from the reference dataset. A significant Chi-square test result indicates that the categorical distribution has changed, signalling potential data drift.
- **2-sample Kolmogorov-Smirnov (KS) Test**: This non-parametric test is used for numerical data to compare the distributions of the reference and current datasets. It evaluates the maximum difference between the cumulative distributions of the two datasets. A significant KS test result indicates that the distributions are different, suggesting data drift. The KS test is sensitive to changes in both the central tendency and the shape of the distribution.
- **Population Stability Index (PSI)**: This metric is used for both categorical and numerical data to quantify the shift in the distribution between the reference and current datasets. PSI measures the divergence between the two distributions, with higher values indicating greater drift. It is particularly useful for identifying gradual changes over time. PSI is calculated by dividing the data into bins and comparing the relative frequencies of each bin between the reference and current datasets.
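
To make the mechanics concrete, here is a minimal sketch of the 2-sample KS test (via SciPy) and of a binned PSI following the usual formula PSI = Σ (qᵢ − pᵢ) · ln(qᵢ / pᵢ), where pᵢ and qᵢ are the reference and current bin frequencies; the binning and the 0.2 threshold are common conventions, not necessarily the platform's.

```python
import numpy as np
from scipy.stats import ks_2samp

def psi(reference, current, bins=10):
    """Population Stability Index between two numerical samples."""
    # Bin edges derived from the reference distribution
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_freq, _ = np.histogram(reference, bins=edges)
    cur_freq, _ = np.histogram(current, bins=edges)
    eps = 1e-6  # avoid division by zero and log(0)
    p = np.clip(ref_freq / ref_freq.sum(), eps, None)
    q = np.clip(cur_freq / cur_freq.sum(), eps, None)
    return float(np.sum((q - p) * np.log(q / p)))

rng = np.random.default_rng(42)
reference = rng.normal(0.0, 1.0, 1_000)
current = rng.normal(0.5, 1.2, 1_000)  # shifted and rescaled: drift expected

stat, p_value = ks_2samp(reference, current)
print(f"KS statistic={stat:.3f}, p-value={p_value:.4f}")
print(f"PSI={psi(reference, current):.3f}  (values above 0.2 often read as drift)")
```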

Using these tests and metrics, the platform can:

- **Detect Significant Changes**: Identify when the current data distribution has shifted enough to potentially impact model performance.
- **Trigger Alerts**: Notify users when significant data drift is detected, allowing for timely intervention.
- **Guide Retraining**: Provide insights into which specific features or aspects of the data have drifted, helping to guide model retraining efforts.
- **Visualize Drift**: Offer visual representations of the drift, such as distribution plots and bar charts, to help users understand the nature and extent of the drift.

By employing these methods, the `radicalbit-ai-monitoring` platform ensures comprehensive monitoring for data drift, helping maintain the reliability and accuracy of the models in a changing data environment.
8 changes: 8 additions & 0 deletions docs/docs/l3_model_sections/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Model sections",
"position": 1,
"link": {
"type": "generated-index",
"description": "Each created models has three main sections: Overview, Reference, and Current. In this document we are thoroughly explaining each of them."
}
}