
Fix ORT CI #1875

Merged · 24 commits merged into main from fix-ort-ci on May 29, 2024

Conversation

IlyasMoutawwakil (Member)

What does this PR do?

Fixes the ORT CI failures caused by the ONNX Runtime 1.18 release.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

IlyasMoutawwakil (Member, Author) commented on May 24, 2024

One test is still failing during generation with the old model optimum/gpt2:
https://github.com/huggingface/optimum/blob/main/tests/onnxruntime/test_modeling.py#L2277
Run pytest -v -s tests/onnxruntime -k "test_inference_old_onnx_model_1" to see the error:

RuntimeError: Error in execution: Non-zero status code returned while running Where node. Name:'Where_393' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:640 onnxruntime::Broadcaster::Broadcaster(gsl::span<const long int>, gsl::span<const long int>) largest <= 1 was false. Can broadcast 0 by 0 or 1. 5 is invalid.
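
For context, a minimal sketch of roughly what the linked test exercises, assuming the public optimum.onnxruntime API (an approximation for illustration, not the actual test body):

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

# "optimum/gpt2" hosts an ONNX export produced with an older optimum version;
# tokenizer files are assumed to be loadable from the same repo (the real test
# may load them from "gpt2" instead).
tokenizer = AutoTokenizer.from_pretrained("optimum/gpt2")
model = ORTModelForCausalLM.from_pretrained("optimum/gpt2")

tokens = tokenizer("This is a sample input", return_tensors="pt")
# With onnxruntime 1.18, generation on this old export raised the
# Where_393 broadcasting error quoted above.
outputs = model.generate(**tokens, num_beams=1, do_sample=False, max_new_tokens=10)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))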

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

IlyasMoutawwakil (Member, Author)

The ORT tests are passing locally; the broadcasting issue is solved.
But for some reason the output ids match locally and not on the runner (two tests with the old ONNX model).

echarlaix (Collaborator)

but for some reason the output ids match locally and not on the runner (two tests with the old ONNX model)

Do you know where this could come from?

.github/workflows/test_onnxruntime.yml (outdated review thread, resolved)
optimum/onnxruntime/modeling_decoder.py (outdated review thread, resolved)
@@ -356,62 +356,45 @@ def quantize(
)

quantizer_factory = QDQQuantizer if use_qdq else ONNXQuantizer
# TODO: maybe this logic can be moved to a method in the configuration class (get_ort_quantizer_kwargs())
Member
But the config should not be aware of the ORTQuantizer class right?

IlyasMoutawwakil (Member, Author)
Yes, the quant config already contains everything and can infer which quantizer will use its kwargs (from format and is_static):

use_qdq = quantization_config.is_static and quantization_config.format == QuantFormat.QDQ
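
To illustrate the idea behind the TODO above, here is a hypothetical sketch of what a get_ort_quantizer_kwargs() helper on the config could look like. The method name, the selected kwargs, and the attribute names are assumptions for illustration, not the merged code:

from onnxruntime.quantization import QuantFormat

def get_ort_quantizer_kwargs(quantization_config):
    # The config alone can tell which quantizer will consume its kwargs:
    # static quantization with the QDQ format goes through QDQQuantizer,
    # everything else through ONNXQuantizer.
    use_qdq = quantization_config.is_static and quantization_config.format == QuantFormat.QDQ
    kwargs = {
        "static": quantization_config.is_static,
        "per_channel": quantization_config.per_channel,
        "reduce_range": quantization_config.reduce_range,
        "weight_qType": quantization_config.weights_dtype,
        "activation_qType": quantization_config.activations_dtype,
    }
    if not use_qdq:
        # The QOperator path additionally needs the quantization mode.
        kwargs["mode"] = quantization_config.mode
    return kwargs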

echarlaix requested a review from michaelbenayoun on May 28, 2024 at 14:41
michaelbenayoun (Member) left a comment
LGTM

Comment on lines -82 to -83
pytest -n auto -m "not run_in_series" --durations=0 -vs onnxruntime
pytest -m "run_in_series" --durations=0 onnxruntime
IlyasMoutawwakil (Member, Author) commented on May 29, 2024

I changed the order here and started seeing new errors on Windows, related to input dtype, that I hadn't seen before.
I just noticed that, depending on the OS, errors propagate into the workflow differently:
on Linux-based runners (ubuntu), this runs the first command and exits with a non-zero code if it fails;
on Windows-based runners, this runs the first command and then the second regardless of whether the first succeeds or fails, and only checks the exit code of the last one.

This is probably due to the difference between bash and PowerShell.
@echarlaix @michaelbenayoun @JingyaHuang @mht-sharma @regisss @fxmarty

echarlaix merged commit fbbc408 into main on May 29, 2024
44 of 47 checks passed
echarlaix deleted the fix-ort-ci branch on May 29, 2024 at 12:07
Comment on lines +25 to +31
- name: Free Disk Space (Ubuntu)
if: matrix.os == 'ubuntu-20.04'
uses: jlumbroso/free-disk-space@main
with:
tool-cache: false
swap-storage: false
large-packages: false
Contributor
Why this change?

IlyasMoutawwakil (Member, Author)

I asked in #1875.
It takes half the time (2-3 min -> 1 min) and shows how much space was freed in each step. Either way, we won't need it once we start using intel-cpu runners.

Contributor

nice

)
outputs = model.generate(**tokens, num_beams=1, do_sample=False, min_new_tokens=30, max_new_tokens=30)
self.assertTrue(torch.allclose(outputs_onnx, outputs))
outputs = model.generate(**tokens, num_beams=1, do_sample=False, min_new_tokens=10, max_new_tokens=10)
Contributor

Why reduce the number of new tokens?

IlyasMoutawwakil (Member, Author)

I was trying to figure out where the failure was coming from and forgot to reset it to 30. Will do that in the Windows PR.
