Issue when converting CalibratedClassifierCV #1082

paolo-sofia · 2024-03-14T11:10:57Z

Hi, I'm having issues converting a CalibratedClassifierCV model to ONNX. The error I get is:

RuntimeError: For operator SklearnCalibratedClassifierCV (type: SklearnCalibratedClassifierCV), at most 1 input(s) is(are) supported but we got 33 input(s) which are [...]

The estimator is a Pipeline containing OneHotEncoder, OrdinalEncoder and RobustScaler, with the classifier being a RandomForestClassifier.

If I try to export only the Pipeline, I don't get this error. Does CalibratedClassifierCV work only with numerical data? My dataframe currently contains both numerical and categorical columns. How can I fix this problem?
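One plausible reading of the error, assuming the converter declares one ONNX input per DataFrame column when the sample data is a mixed-type dataframe, is that the calibrated wrapper then receives every column as a separate input. The sketch below only imitates that bookkeeping in plain Python. The `columns` dictionary, the `sketch_initial_types` helper and the "float"/"string" labels are hypothetical stand-ins, not skl2onnx code.

```python
# Hypothetical stand-in for a two-row dataframe: column name -> values.
columns = {
    "age": [31.0, 45.0],
    "income": [52000.0, 61000.0],
    "city": ["Rome", "Milan"],
}


def sketch_initial_types(cols):
    """Return one (name, type label) pair per column.

    This imitates the assumed behaviour of declaring each dataframe
    column as its own graph input; it is not the library's code.
    """
    declared = []
    for name, values in cols.items():
        label = "string" if isinstance(values[0], str) else "float"
        declared.append((name, label))
    return declared


initial_types = sketch_initial_types(columns)
print(len(initial_types), initial_types)
```

Under that assumption, a dataframe with 33 columns would produce 33 declared inputs, which matches the count in the error message above.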

I'm currently using:
Python 3.12
sklearn==1.4.1.post1
skl2onnx==1.16.0
onnx==1.15.0
onnxruntime==1.17.1

xadupre · 2024-04-04T07:26:01Z

I fixed a similar issue yesterday, and I guess this is the same one. If you have a pipeline you can share, it would help. Otherwise, I think I can replicate the issue I had with the VotingClassifier using this estimator and fix it with a similar solution.

paolo-sofia · 2024-04-17T09:54:27Z

Hi @xadupre, thanks for the help. I tried to use the latest version that contains your fixes, but I still get the same result. Here's the pipeline I'm using:

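from sklearn.calibration import CalibratedClassifierCV
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from skl2onnx import to_onnx

# CATEGORICAL_COLUMNS, X_train, y_train, X_test and y_test are defined elsewhere.

# One-hot encode the categorical columns and pass the remaining ones through.
column_transformer: ColumnTransformer = make_column_transformer(
    (OneHotEncoder(), CATEGORICAL_COLUMNS),
    remainder="passthrough",
    n_jobs=-1,
    verbose=True,
)

classifier: RandomForestClassifier = RandomForestClassifier(
    n_jobs=-1,
    random_state=42,
    verbose=1,
    warm_start=False,
)

pipeline: Pipeline = Pipeline(
    steps=[
        ("column_transformer", column_transformer),
        ("classifier", classifier),
    ],
    verbose=True,
)

pipeline.fit(X_train, y_train)

# Calibrate the already fitted pipeline on held-out data.
calibrated_classifier: CalibratedClassifierCV = CalibratedClassifierCV(estimator=pipeline, n_jobs=-1, cv="prefit")
calibrated_classifier.fit(X_test, y_test)

# This conversion is the call that raises the RuntimeError.
onx = to_onnx(calibrated_classifier, X_train[:1], options={CalibratedClassifierCV: {"zipmap": False}})
with open("classifier.onnx", "wb") as f:
    f.write(onx.SerializeToString())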

xadupre · 2024-05-20T09:27:35Z

Sorry for the delay, I can't find the error message in the code. Could you print the full call stack?

paolo-sofia · 2024-06-08T09:07:44Z

I'm sorry for the delay, here's the full stack trace:

     25 calibrated_classifier: CalibratedClassifierCV = CalibratedClassifierCV(estimator=pipeline, n_jobs=-1, cv="prefit")
     26 calibrated_classifier.fit(X_test, y_test)
---> 29 onx = to_onnx(calibrated_classifier, X_train[:1], options={CalibratedClassifierCV: {"zipmap": False}})
     30 with open("classifier.onnx", "wb") as f:
     31     f.write(onx.SerializeToString())

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/convert.py:304, in to_onnx(model, X, name, initial_types, target_opset, options, white_op, black_op, final_types, dtype, naming, model_optim, verbose)
    302 if verbose >= 1:
    303     print("[to_onnx] initial_types=%r" % initial_types)
--> 304 return convert_sklearn(
    305     model,
    306     initial_types=initial_types,
    307     target_opset=target_opset,
    308     name=name,
    309     options=options,
    310     white_op=white_op,
    311     black_op=black_op,
    312     final_types=final_types,
    313     dtype=dtype,
    314     verbose=verbose,
    315     naming=naming,
    316     model_optim=model_optim,
    317 )

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/convert.py:206, in convert_sklearn(model, name, initial_types, doc_string, target_opset, custom_conversion_functions, custom_shape_calculators, custom_parsers, options, intermediate, white_op, black_op, final_types, dtype, naming, model_optim, verbose)
    204 if verbose >= 1:
    205     print("[convert_sklearn] convert_topology")
--> 206 onnx_model = convert_topology(
    207     topology,
    208     name,
    209     doc_string,
    210     target_opset,
    211     options=options,
    212     remove_identity=model_optim and not intermediate,
    213     verbose=verbose,
    214 )
    215 if verbose >= 1:
    216     print("[convert_sklearn] end")

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/common/_topology.py:1533, in convert_topology(topology, model_name, doc_string, target_opset, options, remove_identity, verbose)
   1522 container = ModelComponentContainer(
   1523     target_opset,
   1524     options=options,
   (...)
   1528     verbose=verbose,
   1529 )
   1531 # Traverse the graph from roots to leaves
   1532 # This loop could eventually be parallelized.
-> 1533 topology.convert_operators(container=container, verbose=verbose)
   1534 container.ensure_topological_order()
   1536 if len(container.inputs) == 0:

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/common/_topology.py:1350, in Topology.convert_operators(self, container, verbose)
   1347 for variable in operator.outputs:
   1348     _check_variable_out_(variable, operator)
-> 1350 self.call_shape_calculator(operator)
   1351 self.call_converter(operator, container, verbose=verbose)
   1353 # If an operator contains a sequence of operators,
   1354 # output variables are not necessarily known at this stage.

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/common/_topology.py:1165, in Topology.call_shape_calculator(self, operator)
   1163 else:
   1164     logger.debug("[Shape2] call infer_types for %r", operator)
-> 1165     operator.infer_types()

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/common/_topology.py:654, in Operator.infer_types(self)
    644     raise MissingShapeCalculator(
    645         "Unexpected shape calculator for alias '{}' "
    646         "and type '{}'.".format(self.type, type(self.raw_operator))
    647     )
    648 logger.debug(
    649     "[Shape-a] %r fed %r - %r",
    650     self,
    651     "".join(str(i.is_fed) for i in self.inputs),
    652     "".join(str(i.is_fed) for i in self.outputs),
    653 )
--> 654 shape_calc(self)
    655 logger.debug(
    656     "[Shape-b] %r inputs=%r - outputs=%r", self, self.inputs, self.outputs
    657 )

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/common/shape_calculator.py:31, in calculate_linear_classifier_output_shapes(operator)
     20 def calculate_linear_classifier_output_shapes(operator):
     21     """
     22     This operator maps an input feature vector into a scalar label if
     23     the number of outputs is one. If two outputs appear in this
   (...)
     29 
     30     """
---> 31     _calculate_linear_classifier_output_shapes(operator)

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/skl2onnx/common/shape_calculator.py:43, in _calculate_linear_classifier_output_shapes(operator, decision_path, decision_leaf, enable_type_checking)
     41     n_out += 1
     42 out_range = [2, 2 + n_out]
---> 43 check_input_and_output_numbers(
     44     operator, input_count_range=1, output_count_range=out_range
     45 )
     46 if enable_type_checking:
     47     check_input_and_output_types(
     48         operator,
     49         good_input_types=[
   (...)
     54         ],
     55     )

File ~/git/sklearn-onnx-issue/.venv/lib/python3.12/site-packages/onnxconverter_common/utils.py:295, in check_input_and_output_numbers(operator, input_count_range, output_count_range)
    290     raise RuntimeError(
    291         'For operator %s (type: %s), at least %s input(s) is(are) required but we got %s input(s) which are %s'
    292         % (operator.full_name, operator.type, min_input_count, len(operator.inputs), operator.input_full_names))
    294 if max_input_count is not None and len(operator.inputs) > max_input_count:
--> 295     raise RuntimeError(
    296         'For operator %s (type: %s), at most %s input(s) is(are) supported but we got %s input(s) which are %s'
    297         % (operator.full_name, operator.type, max_input_count, len(operator.inputs), operator.input_full_names))
    299 if min_output_count is not None and len(operator.outputs) < min_output_count:
    300     raise RuntimeError(
    301         'For operator %s (type: %s), at least %s output(s) is(are) produced but we got %s output(s) which are %s'
    302         % (operator.full_name, operator.type, min_output_count, len(operator.outputs), operator.output_full_names))

RuntimeError: For operator SklearnCalibratedClassifierCV (type: SklearnCalibratedClassifierCV), at most 1 input(s) is(are) supported but we got 8 input(s) which are ['xx', 'xx', 'xx', 'xx', 'xx', 'xx', 'xx', 'xx']

I replaced the real column names with the placeholder "xx".
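The last frames of the trace show the shape calculator calling check_input_and_output_numbers with input_count_range=1, so any operator that receives more than one input fails at that point. Below is a minimal standalone sketch of that kind of check, written from what the trace shows. The function name `check_input_count` and its simplified signature are hypothetical and are not the onnxconverter_common implementation.

```python
def check_input_count(operator_name, input_names, input_count_range):
    """Raise RuntimeError when the number of inputs is outside the range.

    input_count_range is either a single int (exact count) or a
    [min, max] pair, where either bound may be None.
    """
    if isinstance(input_count_range, int):
        min_count = max_count = input_count_range
    else:
        min_count, max_count = input_count_range

    if min_count is not None and len(input_names) < min_count:
        raise RuntimeError(
            "For operator %s, at least %s input(s) is(are) required "
            "but we got %s input(s) which are %s"
            % (operator_name, min_count, len(input_names), input_names)
        )
    if max_count is not None and len(input_names) > max_count:
        raise RuntimeError(
            "For operator %s, at most %s input(s) is(are) supported "
            "but we got %s input(s) which are %s"
            % (operator_name, max_count, len(input_names), input_names)
        )


# Eight separate column inputs, as in the trace above.
try:
    check_input_count("SklearnCalibratedClassifierCV", ["xx"] * 8, 1)
    message = None
except RuntimeError as exc:
    message = str(exc)

print(message)
```

In this sketch the check passes once the operator receives a single input, which suggests that the failure comes from how the inputs reach the calibrated wrapper rather than from the column types themselves.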

xadupre self-assigned this Jun 21, 2024

github-project-automation bot added this to Can Fix Aug 29, 2024

github-project-automation bot moved this to To do in Can Fix Aug 29, 2024
