Fix Windows and onnx dtype compatibility #1886
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed from 2c8c843 to 949743e (… in torch format before)
@@ -852,6 +863,39 @@ def raise_on_numpy_input_io_binding(self, use_torch: bool):
                " with model.use_io_binding = False, or pass torch.Tensor inputs instead."
            )

    def _prepare_onnx_inputs(
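For context, a minimal sketch of what an input-preparation helper along these lines could look like. The function name mirrors the diff, but the dtype table and logic here are illustrative assumptions, not the actual optimum implementation:

```python
import numpy as np
import torch

# Minimal mapping from ONNX Runtime input type strings to numpy dtypes
# (only the cases relevant to this PR).
ORT_TO_NP = {
    "tensor(int64)": np.int64,
    "tensor(int32)": np.int32,
    "tensor(float)": np.float32,
    "tensor(float16)": np.float16,
    "tensor(bool)": np.bool_,
}


def prepare_onnx_inputs(session, inputs):
    """Cast user-provided inputs to the dtypes declared by the ONNX graph."""
    expected = {i.name: ORT_TO_NP[i.type] for i in session.get_inputs() if i.type in ORT_TO_NP}
    onnx_inputs = {}
    for name, value in inputs.items():
        if isinstance(value, torch.Tensor):
            value = value.cpu().numpy()
        # Only cast when the dtype actually differs, e.g. int32 input_ids from a
        # tokenizer on Windows vs. the int64 the exported model expects.
        if name in expected and value.dtype != expected[name]:
            value = value.astype(expected[name])
        onnx_inputs[name] = value
    return onnx_inputs
```

Since the cast is conditional, inputs that already have the expected dtype pass through untouched.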
Philipp did not like that kind of dynamicity with performance in mind, but I have no opinion; it sounds fine to me.
It only applies when needed, so performance-wise I think there's no added overhead compared to what used to be done.
Where things can actually be optimized, and what I think can be viewed as the optimization-oriented path, is I/O binding (see the sketch after this list):
- pre-allocation of output buffers before the forward pass rather than during it, with dynamic resizing of the buffers when the batch size changes.
- decoder models with I/O binding, where a static cache implementation can be used as output buffers to reduce the overhead of their creation.
- decoder models' input/output synchronization, which could happen before and after the generation loop instead of on each forward call (if that's possible).
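A rough sketch of the first point: reusing a pre-allocated output buffer across forward calls with ONNX Runtime I/O binding, reallocating only when the required shape changes. The model path, input/output names, and vocab size are assumptions for illustration:

```python
import numpy as np
import onnxruntime as ort
import torch

session = ort.InferenceSession("decoder_model.onnx", providers=["CUDAExecutionProvider"])
io_binding = session.io_binding()
logits_buffer = None


def forward(input_ids: torch.Tensor, vocab_size: int = 32000) -> torch.Tensor:
    global logits_buffer
    batch_size, seq_len = input_ids.shape
    needed_shape = (batch_size, seq_len, vocab_size)
    # Allocate the output buffer once; resize only when the batch/sequence size changes.
    if logits_buffer is None or tuple(logits_buffer.shape) != needed_shape:
        logits_buffer = torch.empty(needed_shape, dtype=torch.float32, device="cuda")

    input_ids = input_ids.to(device="cuda", dtype=torch.int64).contiguous()
    io_binding.bind_input(
        name="input_ids",
        device_type="cuda",
        device_id=0,
        element_type=np.int64,
        shape=tuple(input_ids.shape),
        buffer_ptr=input_ids.data_ptr(),
    )
    io_binding.bind_output(
        name="logits",
        device_type="cuda",
        device_id=0,
        element_type=np.float32,
        shape=needed_shape,
        buffer_ptr=logits_buffer.data_ptr(),
    )
    session.run_with_iobinding(io_binding)
    return logits_buffer
```

Binding the output to a torch tensor that already lives on the device avoids an allocation and a device-to-host copy on every call.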
I kind of agree; I'm just not sure what each ORTModelForXXX is for then.
FWIW, NumPy 2.0 has just been released, and the default int size on Windows is now 64-bit! https://numpy.org/devdocs/numpy_2_0_migration_guide.html#windows-default-integer
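A quick way to see the platform default (assuming a 64-bit interpreter):

```python
import numpy as np

# Before NumPy 2.0 the default integer follows the C "long" type, which is
# 32-bit on Windows; NumPy 2.0 switches the Windows default to 64-bit.
print(np.array([1, 2, 3]).dtype)  # int32 on Windows with NumPy < 2.0, int64 otherwise
print(np.dtype(int))              # the default integer dtype used for Python ints
```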
What does this PR do?
Fixes multiple Windows-specific issues:
- NumPy's default integer type is int64 on Linux/macOS, but on Windows it is int32. This results in tokenizers returning input ids and attention masks in a different format than torch's default, which is int64/long (see the illustration below).
- The seq2seq decoder also used to return the past KV cache in torch format all the time.
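To illustrate the first issue, a hypothetical reproduction (the checkpoint name is arbitrary): tokenizers that return NumPy arrays inherit NumPy's default integer type, so on Windows with NumPy < 2.0 they can produce int32 where the exported ONNX graph declares int64 inputs.

```python
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("hello world", return_tensors="np")
print(encoded["input_ids"].dtype)  # int32 on Windows with NumPy 1.x, int64 elsewhere

# Casting to the dtype the model declares resolves the mismatch before inference.
onnx_ready = {name: arr.astype(np.int64) for name, arr in encoded.items()}
```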
Before submitting
Who can review?