Read use_external_data_format from ORTConfig file
When quantizing models larger than 2 GB, it is important to
set the use_external_data_format flag to 'true';
otherwise the quantization fails with

`ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB`

However, there is currently no way to set this parameter
when using optimum-cli, because no such command-line
option exists. In theory, it could be set through an
ORTConfig file passed with the -c flag, since
use_external_data_format is one of its configuration parameters.
In practice, the optimum code ignores the value and does not
pass it to the quantize() function.

The goal of this change is to close this gap.
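
For context, the flag can already be passed when the quantizer is called from Python instead of through optimum-cli. The following is a minimal sketch, assuming optimum is installed with its onnxruntime extra. The helper name `quantize_large_model`, the directory arguments, and the choice of AVX512-VNNI dynamic quantization are illustrative assumptions and not part of this commit.

```python
from pathlib import Path


def quantize_large_model(model_dir: str, save_dir: str) -> None:
    """Dynamically quantize every ONNX file in model_dir, storing weights externally."""
    # Imported lazily so the sketch can be loaded without optimum installed.
    from optimum.onnxruntime import ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    # Assumption: AVX512-VNNI dynamic quantization; pick the config that fits your target.
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

    for model in Path(model_dir).glob("*.onnx"):
        quantizer = ORTQuantizer.from_pretrained(model_dir, file_name=model.name)
        quantizer.quantize(
            save_dir=save_dir,
            quantization_config=qconfig,
            # Keeps tensor data outside the protobuf so the 2 GB limit is not hit.
            use_external_data_format=True,
        )
```

With this change, the same behavior should be reachable from the CLI by setting use_external_data_format in the ORTConfig file.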
idruker-cerence committed Jun 25, 2024
1 parent 8b43dd2 commit b059e1a
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions optimum/onnxruntime/subpackage/commands/quantize.py
@@ -77,6 +77,7 @@ def run(self):

         save_dir = self.args.output
         quantizers = []
+        use_external_data_format = False

         quantizers = [
             ORTQuantizer.from_pretrained(self.args.onnx_model, file_name=model.name)
@@ -96,7 +97,11 @@ def run(self):
                 "TensorRT quantization relies on static quantization that requires calibration, which is currently not supported through optimum-cli. Please adapt Optimum static quantization examples to run static quantization for TensorRT: https://github.com/huggingface/optimum/tree/main/examples/onnxruntime/quantization"
             )
         else:
-            qconfig = ORTConfig.from_pretrained(self.args.config).quantization
+            config = ORTConfig.from_pretrained(self.args.config)
+            qconfig = config.quantization
+            use_external_data_format = config.use_external_data_format

         for q in quantizers:
-            q.quantize(save_dir=save_dir, quantization_config=qconfig)
+            q.quantize(
+                save_dir=save_dir, quantization_config=qconfig, use_external_data_format=use_external_data_format
+            )
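
The effect of the patched branch can be illustrated without optimum installed. The sketch below uses hypothetical stand-ins (`FakeConfig`, `FakeQuantizer`) in place of ORTConfig and ORTQuantizer. Only the control flow mirrors the diff, and the quantization settings are placeholder values.

```python
from dataclasses import dataclass, field


@dataclass
class FakeConfig:
    """Hypothetical stand-in for ORTConfig; only the two fields read by run()."""
    quantization: dict
    use_external_data_format: bool = False


@dataclass
class FakeQuantizer:
    """Hypothetical stand-in for ORTQuantizer; records the arguments it receives."""
    calls: list = field(default_factory=list)

    def quantize(self, save_dir, quantization_config, use_external_data_format=False):
        self.calls.append((save_dir, quantization_config, use_external_data_format))


def run(config, quantizers, save_dir):
    # Mirrors the patched branch: read both values from the same config object.
    qconfig = config.quantization
    use_external_data_format = config.use_external_data_format
    for q in quantizers:
        q.quantize(
            save_dir=save_dir,
            quantization_config=qconfig,
            use_external_data_format=use_external_data_format,
        )


quantizer = FakeQuantizer()
run(FakeConfig(quantization={"is_static": False}, use_external_data_format=True), [quantizer], "out")
```

After the call, `quantizer.calls` holds one entry whose last element is the flag taken from the config, which is the value the code dropped before this change.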
