
Treelite gives different predictions than base XGBoost model #585

Open
juliuscoburger opened this issue Sep 23, 2024 · 4 comments
@juliuscoburger

I noticed that my Treelite model returns different scores than the original XGBoost model. I was able to boil the issue down to the use of base_score during training. Could it be that this value is not being translated?

Code to replicate the issue:

import numpy as np
import xgboost as xgb
import treelite

np.random.seed(42)
N = 10
X = np.random.random((N, 10))
y = np.random.random((N,))
dtrain = xgb.DMatrix(X, label=y)
bst = xgb.train({
    'objective': 'count:poisson'
}, dtrain, 10)
bst.save_model('/tmp/bst.json')
tl_model = treelite.frontend.load_xgboost_model('/tmp/bst.json')
# Treelite gives the same predictions as xgboost
np.testing.assert_almost_equal(treelite.gtil.predict(tl_model, data=X).squeeze(), bst.predict(dtrain))


# Poisson will fail for sufficiently high predictions, see https://github.com/dmlc/xgboost/issues/10486
y = np.random.random((N,)) * 3000
dtrain = xgb.DMatrix(X, label=y)
# But the issue can be mitigated by setting sufficiently high base score
bst = xgb.train({
    'objective': 'count:poisson',
    'base_score': 3000
}, dtrain, 10)
bst.save_model('/tmp/bst.json')

tl_model = treelite.frontend.load_xgboost_model('/tmp/bst.json')
# Unfortunately, treelite now gives different predictions
np.testing.assert_almost_equal(treelite.gtil.predict(tl_model, data=X).squeeze(), bst.predict(dtrain))
@hcho3
Collaborator

hcho3 commented Sep 26, 2024

It looks like the floating-point error starts to creep in, from two sources:

  • Use of large base_scores
  • Order of summation is different in XGBoost's predictor and Treelite GTIL

The check passes if you relax the required tolerance:

np.testing.assert_almost_equal(treelite.gtil.predict(tl_model, data=X).squeeze(), bst.predict(dtrain), decimal=2)

@juliuscoburger
Author

The check is just there to showcase that the scores are not equal. I was under the impression that GTIL always returns the same scores.

@hcho3
Collaborator

hcho3 commented Sep 26, 2024

Can it be that this value is not being translated?

I double-checked, and base_score is being properly translated and handled, so the discrepancy is not due to a logic error.

I was under the impression that GTIL always returns the same scores.

GTIL may evaluate trees and leaf nodes in a different order than XGBoost. Addition of floating-point values is not associative (in general, a + (b + c) != (a + b) + c), so rounding error can accumulate, especially when some values in the sum are much larger than the others, as in this example.
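A minimal illustration of the non-associativity point (the values here are chosen purely for demonstration and are not taken from the model above): when one term is much larger than the others, float32 addition can give different results depending on grouping.

```python
import numpy as np

a = np.float32(1e8)  # a large value, analogous to a big base_score
b = np.float32(3.0)
c = np.float32(3.0)

# The ulp (spacing between adjacent float32 values) near 1e8 is 8.0,
# so adding 3.0 alone is lost to rounding, while adding 6.0 is not.
left = a + (b + c)   # 100000008.0
right = (a + b) + c  # 100000000.0
print(left == right)  # False
```

This is the same mechanism at work in the example above: summing many leaf values against a base_score of 3000 in a different order produces slightly different totals.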

To minimize error due to floating-point arithmetic, consider scaling the target, e.g. with scikit-learn's StandardScaler.

@hcho3 hcho3 closed this as completed Sep 26, 2024
@hcho3 hcho3 reopened this Sep 26, 2024
@hcho3
Collaborator

hcho3 commented Sep 26, 2024

I'll probably have to add a note to the GTIL documentation about the possibility of floating-point error and how to mitigate it.
