How does Khiops handle datasets with imbalanced classes? #474

Answered by lucaurelien
lucaurelien asked this question in Q&A
Khiops is robust to class imbalance, so rebalancing techniques are generally not required.

However, when the minority class is rare, gathering enough minority examples often means collecting a very large dataset overall. Training time grows with it and can exceed what is practical. In such cases, rebalancing the dataset can be beneficial.

The standard approach is to retain all examples from the minority class and undersample the majority class. This strategy reduces computation time while preserving critical information. Never oversample the minority class, as duplicate instances lead to significant overfitting.
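
As a rough illustration of the undersampling strategy described above, here is a minimal sketch in plain Python. It is not part of Khiops: the function name `undersample_majority`, the `max_ratio` parameter, and the record layout are all assumptions made for this example. It keeps every minority-class record and samples the other classes without replacement, so no record is ever duplicated.

```python
import random
from collections import Counter


def undersample_majority(records, label_of, max_ratio=1.0, seed=0):
    """Keep all minority-class records; randomly undersample the other classes.

    Hypothetical helper for illustration only (not a Khiops API).
    records   : list of arbitrary records (dicts, tuples, ...)
    label_of  : function mapping a record to its class label
    max_ratio : each other class keeps at most max_ratio * minority_count records
    """
    if max_ratio < 1.0:
        raise ValueError("max_ratio must be >= 1.0")

    # Group records by class label.
    by_label = {}
    for record in records:
        by_label.setdefault(label_of(record), []).append(record)

    # The minority class is the one with the fewest records.
    minority_label = min(by_label, key=lambda label: len(by_label[label]))
    cap = int(len(by_label[minority_label]) * max_ratio)

    rng = random.Random(seed)
    kept = []
    for label, group in by_label.items():
        if label == minority_label or len(group) <= cap:
            kept.extend(group)  # retained in full
        else:
            kept.extend(rng.sample(group, cap))  # sampled without replacement

    rng.shuffle(kept)  # avoid leaving the output grouped by class
    return kept


# Toy data: 10 "fraud" records among 1000.
records = [{"id": i, "label": "fraud" if i < 10 else "ok"} for i in range(1000)]
balanced = undersample_majority(records, lambda r: r["label"], max_ratio=5.0, seed=42)
print(Counter(r["label"] for r in balanced))
```

With `max_ratio=5.0`, the toy dataset is reduced to the 10 minority records plus 50 randomly chosen majority records. What ratio is appropriate depends on the data and the time budget; the value here is arbitrary.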

Best practic…

Answer selected by folmos-at-orange