Blocking Issue to convert VotingClassifier in ONNX format. "Not implemented yet" message. #1071

Open
joyceraraujo opened this issue Feb 15, 2024 · 9 comments

Comments

@joyceraraujo

joyceraraujo commented Feb 15, 2024

I'm trying to convert a model that was saved as a .sav file to ONNX format. The model is a VotingClassifier (XGBoost and Naive Bayes). I got the following error:

Traceback (most recent call last):
  File "/mnt/c/Users/project/convert_onnx.py", line 29, in <module>
    onnx_model = convert_sklearn(model,"gbdt_model",
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/convert.py", line 208, in convert_sklearn
    onnx_model = convert_topology(
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/common/_topology.py", line 1532, in convert_topology
    topology.convert_operators(container=container, verbose=verbose)
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/common/_topology.py", line 1350, in convert_operators
    self.call_converter(operator, container, verbose=verbose)
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/common/_topology.py", line 1133, in call_converter
    conv(self.scopes[0], operator, container)
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/common/_registration.py", line 27, in __call__
    return self._fct(*args)
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/operator_converters/voting_classifier.py", line 143, in convert_voting_classifier
    raise NotImplementedError(
NotImplementedError: flatten_transform==True is not implemented yet. You may raise an issue at https://github.com/onnx/sklearn-onnx/issues.
Is there a workaround for this problem? It's a blocking issue because flatten_transform=True is the default. Thank you.
The code is as follows:

import joblib
from onnxmltools.convert.xgboost.operator_converters.XGBoost import convert_xgboost
from onnxmltools.utils import save_model
from skl2onnx import convert_sklearn, get_latest_tested_opset_version, update_registered_converter
from skl2onnx.common.data_types import FloatTensorType
from skl2onnx.common.shape_calculator import calculate_linear_classifier_output_shapes
from xgboost import XGBClassifier

# Load the already trained VotingClassifier from disk
model = joblib.load("model.sav")
n_features = 12
#n_features = len(model.feature_importances_)
target_opset = get_latest_tested_opset_version()

# Register the XGBoost converter so skl2onnx can handle the XGBClassifier inside the ensemble
update_registered_converter(
    XGBClassifier,
    "XGBoostXGBClassifier",
    calculate_linear_classifier_output_shapes,
    convert_xgboost,
    options={"nocl": [True, False], "zipmap": [True, False, "columns"]},
)

# Convert the whole ensemble; this is the call that raises NotImplementedError
onnx_model = convert_sklearn(model, "gbdt_model",
    initial_types=[("input", FloatTensorType([None, n_features]))],
    target_opset={"": target_opset, "ai.onnx.ml": 2})

save_model(onnx_model, 'model_converted.onnx')
@joyceraraujo joyceraraujo changed the title Issue to convert VotingClassifier in ONNX format. "Not implemented yet" message. Blocking Issue to convert VotingClassifier in ONNX format. "Not implemented yet" message. Feb 15, 2024
@attilaimre99

Just turn off flatten_transform in the VotingClassifier by passing flatten_transform=False.

My example

# Voting ensemble with flatten_transform disabled
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Split the data (X_correct and y_correct are my own feature matrix and labels)
X_train, X_test, y_train, y_test = train_test_split(X_correct, y_correct, test_size=0.1, random_state=42)

# Initialize the base estimators and the voting ensemble
# model = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1, class_weight='balanced')
clf1 = LogisticRegression(multi_class='multinomial', random_state=1)
clf2 = RandomForestClassifier(n_estimators=100, random_state=1, class_weight='balanced')
clf3 = GaussianNB()
model = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)], voting='hard', flatten_transform=False)

@joyceraraujo
Author

joyceraraujo commented Feb 19, 2024

I need to load a VotingClassifier model, which combines XGBoost and NaiveBayes, saved in the ".sav" format and convert it to ONNX format. Since I don't have access to the dataset to retrain the model, I'm directly opening it using joblib and attempting the conversion. The XGBoost version used for training the model was 1.4.2. However, I encountered several problems during the conversion:

  1. I received the following traceback:

Traceback (most recent call last):
  File "/mnt/c/Users/project/convert_onnx.py", line 29, in <module>
    onnx_model = convert_sklearn(model, "gbdt_model",
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/convert.py", line 208, in convert_sklearn
    onnx_model = convert_topology(
  ...
  File "/mnt/c/Users/project/env/lib/python3.10/site-packages/skl2onnx/operator_converters/voting_classifier.py", line 143, in convert_voting_classifier
    raise NotImplementedError("flatten_transform==True is not implemented yet. You may raise an issue at https://github.com/onnx/sklearn-onnx/issues.")
NotImplementedError: flatten_transform==True is not implemented yet. You may raise an issue at https://github.com/onnx/sklearn-onnx/issues.
This issue was resolved by executing the command: model = model.set_params(flatten_transform=False).

  2. Another issue I faced was:
File "/mnt/c/Users/dev/lib/python3.10/site-packages/onnxmltools/convert/xgboost/common.py", line 40, in get_xgb_params
    gbp = config["learner"]["gradient_booster"]["gbtree_model_param"]
KeyError: 'gbtree_model_param'

I resolved this problem by modifying the library directly, changing the parameter name to the one used by my version of XGBoost.

  3. After resolving errors 1 and 2, I encountered a new error:
    raise RuntimeError("Unable to interpret 'FGA', feature names should follow the pattern 'f%d'.")
    I attempted to change the feature names accordingly. However, the error persisted. Upon inspecting the JSON representation of the model, I noticed that the feature names represented by the 'split' field still used the old names:

{'nodeid': 0, 'depth': 0, 'split': 'ABC', 'split_condition': 6.25, 'yes': 1, 'no': 2, 'missing': 1, 'gain': 78.1462402, 'cover': 259.75, 'children': [{'nodeid': 1, 'depth': 1, 'split': 'BLK', 'split_condition': 0.550000012, 'yes': 3, 'no': 4, 'missing': 3, 'gain': 21.2281971, 'cover': 171.75, 'children': [{'nodeid': 3, 'depth': 2, 'split': 'ABC', 'split_condition': 4.55000019, 'yes': 7, 'no': 8, 'missing': 7, 'gain': 14.1838226, 'cover': 147.5, 'children': [{'nodeid': 7, 'depth': 3, 'split': 'BLK', 'split_condition': 0.25, 'yes': 11, 'no': 12, 'missing': 11, 'gain': 3.78608131, 'cover': 106.5, 'children': [{'nodeid': 11, 'depth': 4, 'split': 'BLK', 'split_condition': 0.150000006, 'yes': 15, 'no': 16, 'missing': 15, 'gain': 5.14517879, 'cover': 78.5, 'children': [{'nodeid': 15, 'depth': 5, 'split': 'MIN', 'split_condition': 8.64999962, 'yes': 17, 'no': 18, 'missing': 17, 'gain': 4.04689026, 'cover': 60, 'children': [{'nodeid': 17, 'leaf': -0.018082479, 'cover': 23.25}, {'nodeid': 18, 'leaf': -0.00116446253, 'cover': 36.75}]}, {'nodeid': 16, 'leaf': -0.0269777905, 'cover': 18.5}]}, {'nodeid': 12, 'leaf': 0.000966416614, 'cover': 28}]}, {'nodeid': 8, 'depth': 3, 'split': 'MIN', 'split_condition': 16.2000008, 'yes': 13, 'no': 14, 'missing': 13, 'gain': 1.36819792, 'cover': 41, 'children': [{'nodeid': 13, 'leaf': 0.00639292412, 'cover': 19}, {'nodeid': 14, 'leaf': 0.0183946956, 'cover': 22}]}]}, {'nodeid': 4, 'leaf': 0.029015895, 'cover': 24.25}]}, {'nodeid': 2, 'depth': 1, 'split': 'MIN', 'split_condition': 23.9500008, 'yes': 5, 'no': 6, 'missing': 5, 'gain': 8.23132324, 'cover': 88, 'children': [{'nodeid': 5, 'leaf': 0.0257856827, 'cover': 34}, {'nodeid': 6, 'depth': 2, 'split': 'BLK', 'split_condition': 0.350000024, 'yes': 9, 'no': 10, 'missing': 9, 'gain': 2.95942688, 'cover': 54, 'children': [{'nodeid': 9, 'leaf': 0.0367497504, 'cover': 23}, {'nodeid': 10, 'leaf': 0.0526438914, 'cover': 31}]}]}]}

How can I solve this?
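To make the third problem concrete, here is a minimal sketch of rewriting the 'split' names in a dumped tree so that they follow the 'f%d' pattern. The helper `rename_splits` and the column order `["ABC", "BLK", "MIN"]` are hypothetical: the real order must match the columns used at training time, which this thread does not state. The sketch only rewrites a plain dictionary like the one shown above. It does not modify the booster itself, and whether that is enough for the converter is not established here.

```python
def rename_splits(node, name_to_index):
    """Return a copy of a dumped tree with 'split' names rewritten as 'f<index>'."""
    renamed = dict(node)
    if "split" in renamed:
        # Leaves have no 'split' key, so only internal nodes are rewritten.
        renamed["split"] = f"f{name_to_index[renamed['split']]}"
    if "children" in renamed:
        renamed["children"] = [
            rename_splits(child, name_to_index) for child in renamed["children"]
        ]
    return renamed


# ASSUMPTION: hypothetical training column order, for illustration only.
feature_order = ["ABC", "BLK", "MIN"]
name_to_index = {name: index for index, name in enumerate(feature_order)}

# A shortened tree in the same shape as the dump above.
tree = {
    "nodeid": 0, "split": "ABC", "split_condition": 6.25,
    "children": [
        {"nodeid": 1, "split": "BLK", "split_condition": 0.55,
         "children": [{"nodeid": 3, "leaf": 0.029}, {"nodeid": 4, "leaf": -0.018}]},
        {"nodeid": 2, "leaf": 0.0257},
    ],
}

renamed = rename_splits(tree, name_to_index)
print(renamed["split"], renamed["children"][0]["split"])
```

The function returns a new dictionary rather than editing in place, so the original dump stays available for comparison.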

@xadupre
Collaborator

xadupre commented Feb 21, 2024

I think it is an issue for onnxmltools. Which version are you using?

@joyceraraujo
Author

The version is onnxmltools 1.12.0.

@flacomalone

I can confirm that this issue still occurs

I can confirm that the KeyError: 'gbtree_model_param' issue still occurs with onnx==1.15.0, onnxconverter-common==1.14.0, onnxmltools==1.11.2, onnxruntime==1.17.0, and xgboost==2.0.3.
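For anyone debugging this KeyError, one way to see where a key actually sits in the booster configuration is to search the nested dictionary for it. The sketch below is only a diagnostic aid. `find_key_paths` is a hypothetical helper, and the `config` dictionary is a made-up layout for illustration, not the real structure produced by any particular XGBoost version.

```python
def find_key_paths(obj, target, path=()):
    """Return every path (as a tuple of keys/indices) at which `target` appears as a dict key."""
    paths = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                paths.append(path + (key,))
            # Keep descending: the key may also appear deeper in the structure.
            paths.extend(find_key_paths(value, target, path + (key,)))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            paths.extend(find_key_paths(value, target, path + (index,)))
    return paths


# ASSUMPTION: toy layout, invented for this example only.
config = {
    "learner": {
        "gradient_booster": {
            "name": "gbtree",
            "model": {"gbtree_model_param": {"num_trees": "100"}},
        }
    }
}

paths = find_key_paths(config, "gbtree_model_param")
print(paths)
```

With a real model, the dictionary could come from `json.loads(booster.save_config())`. Comparing the reported path with the lookup in the traceback shows whether the key has moved or is missing entirely.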

@vloison

vloison commented Oct 30, 2024

Hello,

I ran into the same issue while trying to convert an already trained voting classifier built from three GradientBoostingClassifier models. Setting flatten_transform to False is not an option here because I can't retrain the model on the training data. Any insights on how to circumvent this issue?

@vloison

vloison commented Oct 30, 2024

It turns out that the issue is indeed solved by changing flatten_transform on the already fitted model, i.e., calling model.set_params(flatten_transform=False) before using convert_sklearn. No retraining is needed.
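A minimal sketch of that workaround, assuming a small synthetic dataset and a two-estimator ensemble as a stand-in for the model loaded from disk. In scikit-learn, flatten_transform only changes the shape returned by transform() under soft voting, so predictions from the fitted model should be the same before and after the change. The helper `convert_without_retraining` is a hypothetical wrapper and is not called here, because it requires skl2onnx to be installed.

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Stand-in for joblib.load("model.sav"): a tiny ensemble that is already fitted.
rng = np.random.RandomState(0)
X = rng.rand(40, 3).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)
model = VotingClassifier(
    estimators=[("lr", LogisticRegression()), ("gnb", GaussianNB())],
    voting="soft",  # flatten_transform defaults to True
).fit(X, y)

before = model.predict(X)
# Change the parameter on the fitted object; the estimators are not refitted.
model.set_params(flatten_transform=False)
after = model.predict(X)
predictions_unchanged = bool((before == after).all())
print(predictions_unchanged)


def convert_without_retraining(fitted_model, n_features):
    """Hypothetical wrapper: convert a fitted model once flatten_transform is False."""
    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType

    return convert_sklearn(
        fitted_model,
        "voting_classifier",
        initial_types=[("input", FloatTensorType([None, n_features]))],
    )
```

If the ensemble contains an XGBClassifier, as in the original report, the converter registration shown in the first comment is still needed before the conversion call.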

@xadupre
Collaborator

xadupre commented Oct 30, 2024

Are you unblocked or do you still need this feature?

@vloison

vloison commented Oct 30, 2024

I am unblocked, thanks!
