Releases: pegasystems/pega-datascientist-tools
V2.1: Analyze ADM Trees
This release introduces a new class, ADMTrees, for analyzing and visualizing ADM Gradient Boosting models.
New features include:
- Analyze predictors and their splits
- Visualize individual trees within the ensemble
- Visualize the prediction path of the model, given input data
- Replicate the scoring of the model
- Visualize the contribution of each tree towards the final score
For an example of how to use ADMTrees, see https://pegasystems.github.io/cdh-datascientist-tools/Python/articles/AGBModelVisualisation.html. The API reference documentation is available there as well.
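To make the "replicate the scoring" and "contribution of each tree" items more concrete, here is a minimal sketch of how a gradient boosted ensemble is typically scored: each tree routes the input to a leaf, the leaf scores are summed, and the sum is mapped to a probability. This does not use the ADMTrees API. The nested-dict tree layout, the helper names (leaf, split, score_tree, score_ensemble), the toy data, and the logistic link are all assumptions made for illustration.

```python
import math

# Hypothetical toy ensemble: each tree is a nested dict. This layout is for
# illustration only and is not the format ADMTrees uses internally.
def leaf(score):
    return {"score": score}

def split(feature, threshold, left, right):
    return {"feature": feature, "threshold": threshold,
            "left": left, "right": right}

TREES = [
    split("age", 30, leaf(-0.4), leaf(0.2)),
    split("clicks", 5, leaf(-0.1), split("age", 50, leaf(0.3), leaf(0.1))),
    leaf(0.05),
]

def score_tree(tree, x):
    """Walk one tree; return the leaf score and the list of splits taken."""
    path = []
    node = tree
    while "score" not in node:
        go_left = x[node["feature"]] < node["threshold"]
        path.append((node["feature"], node["threshold"],
                     "left" if go_left else "right"))
        node = node["left"] if go_left else node["right"]
    return node["score"], path

def score_ensemble(trees, x):
    """Return per-tree contributions and the final score.

    Assumption: the summed leaf scores are passed through a logistic
    (sigmoid) function to obtain a propensity.
    """
    contributions = [score_tree(tree, x)[0] for tree in trees]
    raw = sum(contributions)
    propensity = 1 / (1 + math.exp(-raw))
    return contributions, propensity

contributions, propensity = score_ensemble(TREES, {"age": 42, "clicks": 7})
print(contributions)          # one entry per tree
print(round(propensity, 3))   # final score after the logistic link
```

The path returned by score_tree is the kind of information a prediction-path visualization would draw, and the contributions list is what a per-tree contribution plot would show.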
Python CDH Tools 2.0
This version brings major improvements to the Python version of CDH Tools. A quick summary:
- CDH Tools can now be installed without cloning the entire GitHub repository, simply by running the following command:
pip3 install git+https://github.com/pegasystems/cdh-datascientist-tools.git
- It is then possible to import the ADMDatamart class with the following syntax:
from cdhtools import ADMDatamart
- For quick experiments, you can load the CDH Sample dataset and use it as follows:
from cdhtools import datasets
Sample = datasets.CDHSample()
Sample.plotPerformanceSuccessRateBubbleChart()
See also Example_ADM_Analysis.ipynb in examples/datamart, which uses the sample dataset as well.
- An additional plotting library is now supported: Plotly. It is the default; to revert to matplotlib, pass the argument 'plotting_engine = "mpl"' either when initializing the ADMDatamart class or to an individual plotting function.
- New visualisations were also added with the introduction of Plotly: Treemap, ModelsByPositives, OverTime & ResponseGain.
- There is now a Python plot gallery, which you can find under examples/plot_gallery.
- Unit tests have been added, improving reliability.
- Documentation is much improved; refer to either ADMDatamart.py or plot_base.py for information about the purpose and arguments of each function.
- With Plotly, faceting is now much easier as well. Simply supply the 'facets' argument with a list of context keys to facet on and, for compatible plots, different facets will be created.
- It is now possible to easily extract the treatment out of the pyName column with the 'extract_treatment' argument to the ADMDatamart class. Example: ADMDatamart('data', extract_treatment='pyName').
- Various bug fixes, including SettingWithCopyWarning warnings, a minor miscalculation when getting the latest predictors, and a new way to find the latest file by looking at the timestamp in the zip file names.
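The last item, picking the latest file from the timestamp in its name, can be pictured with the sketch below. It is not the library's implementation. The file names, the helper name latest_file, and the assumed timestamp format (for example 20210101T010000) are illustrative assumptions only.

```python
import re
from datetime import datetime

# Assumption: a timestamp such as 20210101T010000 is embedded in the file name.
TIMESTAMP = re.compile(r"(\d{8}T\d{6})")

def latest_file(names):
    """Return the name with the most recent embedded timestamp, or None."""
    dated = []
    for name in names:
        match = TIMESTAMP.search(name)
        if not match:
            continue  # no timestamp in this name
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
        except ValueError:
            continue  # digits present, but not a valid date and time
        dated.append((stamp, name))
    return max(dated)[1] if dated else None

# Hypothetical file names, for illustration only.
files = [
    "ModelSnapshots_20210101T010000_GMT.zip",
    "ModelSnapshots_20210315T093000_GMT.zip",
    "notes.txt",
]
print(latest_file(files))
```

Parsing the timestamp from the name, rather than relying on file modification times, keeps the choice stable when files are copied or re-downloaded.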