
Pdstools V4 alpha 1

Pre-release
@StijnKas released this 30 Oct 17:17
· 33 commits to master since this release
5a42b5c

V4 brings some pretty major (and necessary) changes. A lot of them are, unfortunately, breaking - but it's for the best. pdstools is now much easier to maintain and keep consistent, and new functionality now has a much more logical place to go.

The goal is for the initial V4 release to contain most of the breaking (API-centric) changes we foresee for a long time. After that, we can of course still change the internals and add new functions, but hopefully the most important function signatures and APIs won't need further changes anytime soon.

✨Highlights

  • Farewell R - you've served us well, but pdstools is now Python only
  • Introducing the Pega DX API Client
    • Starting out with support for the 24.2 Prediction Studio and Knowledge Buddy APIs
  • Major refactor of the entire codebase: consistent Python naming, optional dependency groups, well-defined type hints

❌Deprecations/removals

  • The R version of pdstools has been removed. In case you still want to use the R tools, you should manually clone the repo at the V3.x tag.
  • The legacy IH utilities have been dropped. These were old parts of the codebase and untested/unused. New IH utilities are on their way!

🔨Changes

  • Consistent pythonic casing, meaning PascalCase for classes & snake_case for methods, variables & arguments
  • Much improved type hints, so it's much more obvious what a given function returns
  • Fewer 'base' dependencies; different functionality is split up into 'namespaces' that all have their own set of requirements
    • The first time you invoke a method in a 'namespace', it verifies the dependencies and gives a clear warning if any are missing
  • To expand on the previous point: functionality is split up much more logically. Taking the ADMDatamart class as an example:
    • Plotting functionality is part of ADMDatamart.plot.bubble_chart() (or any other plot of course)
    • The health check and other reports are part of ADMDatamart.generate.health_check() (for instance)
    • The intermediate aggregations needed are part of ADMDatamart.aggregations.pivot() (for instance)
  • Using classmethods, we can initialize the ADMDatamart class in particular in a much more flexible way.
    • The main __init__ method of the ADMDatamart class is very simple: it expects two polars.LazyFrames, one for model_data and one for predictor_data. If you've already read in your data, simply use this.
    • If, instead, you want the previous behavior, which automatically found the most recent file in a folder, initialize the datamart class with ADMDatamart.from_ds_export()
    • Or, if you are consuming the results of a data flow (including the OOTB Prediction Studio export), you can initialize the datamart class with ADMDatamart.from_dataflow_export(model_data="pattern_for_model_files*.json", predictor_data="pattern_for_predictor_files*.json"). Files we've read in before can also be cached automatically by writing them to a 'cache' file, which speeds up subsequent loads. This closes #205 as well.
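
The namespace and classmethod ideas above can be illustrated with a small, self-contained sketch. This is not pdstools code: ToyDatamart, PlotNamespace, from_records and missing_modules are hypothetical names made up for the example, plain lists stand in for the polars.LazyFrames, and the dependency list is an assumption chosen so the sketch runs anywhere. It only shows one way a 'namespace' could check its requirements on first use, and how a classmethod can offer an alternative way to build the same object.

```python
import importlib.util
import warnings
from functools import cached_property


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


class PlotNamespace:
    # Hypothetical requirement list; "json" is in the standard library,
    # so the check passes on any Python installation.
    requires = ["json"]

    def __init__(self, datamart):
        missing = missing_modules(self.requires)
        if missing:
            warnings.warn(f"plot namespace is missing dependencies: {missing}")
        self.datamart = datamart

    def bubble_chart(self):
        # Placeholder for a real plot: just summarise the data.
        return f"bubble chart over {len(self.datamart.model_data)} models"


class ToyDatamart:
    def __init__(self, model_data, predictor_data):
        # The plain constructor only stores data that was already loaded.
        self.model_data = model_data
        self.predictor_data = predictor_data

    @classmethod
    def from_records(cls, records):
        """Alternative constructor: split raw records into the two inputs."""
        models = [r for r in records if r["kind"] == "model"]
        predictors = [r for r in records if r["kind"] == "predictor"]
        return cls(models, predictors)

    @cached_property
    def plot(self):
        # Built on first access, so the dependency check runs only once.
        return PlotNamespace(self)


dm = ToyDatamart.from_records(
    [{"kind": "model"}, {"kind": "model"}, {"kind": "predictor"}]
)
print(dm.plot.bubble_chart())
```

Keeping __init__ trivial and moving file discovery into classmethods means data that is already loaded can be passed in directly, while the namespace attribute defers any dependency check until that part of the functionality is actually used.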

Todo before release: