Releases: pegasystems/pega-datascientist-tools

V2.1: Analyze ADM Trees

25 May 11:32
9e191a5

This release introduces a new class: ADMTrees. With ADMTrees you can analyze and visualize ADM Gradient Boosting models.

Some new features include:

  • Analyze predictors and their splits
  • Visualize individual trees within the ensemble
  • Visualize the prediction path of the model, given input data
  • Replicate the scoring of the model
  • Visualize the contribution of each tree towards the final score

For inspiration on how to use ADMTrees, check out this example: https://pegasystems.github.io/cdh-datascientist-tools/Python/articles/AGBModelVisualisation.html. There you can also find the API reference documentation.
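
To make the ideas of "replicating the scoring" and "contribution of each tree" concrete, here is a minimal, library-independent sketch. It does not use the ADMTrees API; the tree layout, the helper names (make_split, score_tree, score_ensemble) and the predictor names are all hypothetical. It assumes the common gradient boosting convention in which each tree contributes a leaf score, the contributions are summed, and a logistic function turns the sum into a propensity. The actual model may differ in its details.

```python
import math

# Hypothetical, hand-written trees for illustration only;
# this is NOT the data structure used by ADMTrees.
def make_split(predictor, threshold, left, right):
    """Internal node: go left when x[predictor] < threshold, else right."""
    return {"predictor": predictor, "threshold": threshold,
            "left": left, "right": right}

def score_tree(node, x):
    """Follow the prediction path for input x and return the leaf score."""
    while isinstance(node, dict):
        go_left = x[node["predictor"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node  # a leaf is stored as a plain float

def score_ensemble(trees, x):
    """Return each tree's contribution and the resulting propensity."""
    contributions = [score_tree(tree, x) for tree in trees]
    raw_score = sum(contributions)
    # Assumed link function: logistic transform of the summed scores.
    propensity = 1.0 / (1.0 + math.exp(-raw_score))
    return contributions, propensity

trees = [
    make_split("Age", 30, -0.4, 0.2),
    make_split("Income", 50000, -0.1, make_split("Age", 50, 0.3, 0.1)),
]
x = {"Age": 42, "Income": 60000}
contributions, propensity = score_ensemble(trees, x)
print(contributions, round(propensity, 4))
```

The per-tree list is what a contribution plot would show, and the path taken inside score_tree corresponds to the prediction path for the given input.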

Python CDH Tools 2.0

30 Mar 14:39

This version brings many major improvements to the Python version of CDH Tools. Please see below for a quick summary:

  • CDH Tools can now be installed without cloning the entire GitHub repository, simply by running the following command:
    • pip3 install git+https://github.com/pegasystems/cdh-datascientist-tools.git
    • It is then possible to import the ADMDatamart class with the following syntax:
      from cdhtools import ADMDatamart
  • To try things out quickly, you can load the CDH Sample dataset and use it as follows:
      from cdhtools import datasets
      Sample = datasets.CDHSample()
      Sample.plotPerformanceSuccessRateBubbleChart()

See also Example_ADM_Analysis.ipynb in examples/datamart, which uses this dataset.

  • An additional plotting library is now supported: Plotly. It is the default; to revert to matplotlib, pass the argument 'plotting_engine="mpl"' to either the ADMDatamart class initialization or an individual plotting function.
  • New visualisations were also added with the introduction of Plotly: Treemap, ModelsByPositives, OverTime & ResponseGain.
  • There is now a Python plot gallery; you can find it under examples/plot_gallery.
  • Unit tests have been added, improving reliability.
  • Documentation is much improved - you can refer to either ADMDatamart.py or plot_base.py for information about the purpose and arguments for each function.
  • With Plotly, faceting is now much easier as well. Simply supply the 'facets' argument with a list of context keys to facet on and, for compatible plots, different facets will be created.
  • It is now possible to easily extract the treatment out of the pyName column with the 'extract_treatment' argument to the ADMDatamart class. Example: ADMDatamart('data', extract_treatment='pyName').
  • Various bug fixes, including SettingWithCopyWarning messages and a minor miscalculation in getting the latest predictors, plus a new way to get the latest file by looking at the timestamps in the zip file names.
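
The timestamp-based lookup mentioned in the last item can be illustrated with a small standalone sketch. This is not the code used in CDH Tools: the function name latest_file, the file names, and the assumed timestamp pattern (eight digits, a "T", six digits, as in 20210305T120000) are all hypothetical and only show the general idea of picking the newest export by the timestamp in its name rather than by file system metadata.

```python
import re
from datetime import datetime

# Assumed pattern: names embed a timestamp such as 20210305T120000.
TIMESTAMP = re.compile(r"(\d{8}T\d{6})")

def latest_file(filenames):
    """Return the name with the most recent embedded timestamp, or None."""
    dated = []
    for name in filenames:
        match = TIMESTAMP.search(name)
        if match:  # names without a timestamp are ignored
            stamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
            dated.append((stamp, name))
    return max(dated)[1] if dated else None

# Hypothetical file names for illustration.
files = [
    "ModelSnapshots_20210101T010000_GMT.zip",
    "ModelSnapshots_20210305T120000_GMT.zip",
    "readme.txt",
]
print(latest_file(files))
```

Parsing the timestamp into a datetime, instead of comparing the raw strings, keeps the comparison valid even if the surrounding parts of the file names differ.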