Adding optimum option for PredictEngine #492
Conversation
…ves the same output as torch implementation
PR Summary
This PR adds ONNX/Optimum support for text classification models, enabling optimized inference through the PredictEngine with ONNX runtime integration.
- Added a new OptimumClassifier class in /libs/infinity_emb/infinity_emb/transformer/classifier/optimum.py with ONNX runtime support and model optimization capabilities
- Implemented device-aware quantization preferences in OptimumClassifier for CPU and OpenVINO providers
- Added a comprehensive test suite in /test_optimum_classifier.py comparing against HuggingFace pipeline outputs
- Extended the PredictEngine enum in utils.py to include Optimum as a new inference engine option
- Disabled IO binding in ONNX runtime for better compatibility via model.use_io_binding = False
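The device-aware quantization preference mentioned above can be sketched roughly as follows. This is a minimal, hypothetical helper, not the code from optimum.py; the actual selection logic may differ. The provider names are standard ONNX Runtime execution provider identifiers.

```python
def prefers_quantized(provider: str) -> bool:
    """Hypothetical sketch: prefer quantized ONNX weights on CPU-class
    execution providers (CPU, OpenVINO), and full precision elsewhere."""
    cpu_like = {"CPUExecutionProvider", "OpenVINOExecutionProvider"}
    return provider in cpu_like

# CPU-class providers typically benefit from int8 quantization; GPU providers do not.
print(prefers_quantized("CPUExecutionProvider"))   # True
print(prefers_quantized("CUDAExecutionProvider"))  # False
```

The point of making the preference provider-aware is that quantized weights speed up CPU and OpenVINO inference, while CUDA providers usually run faster with the full-precision ONNX graph.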
3 file(s) reviewed, 4 comment(s)
engine_args=EngineArgs(
    model_name_or_path=model_name,
    device="cuda" if torch.cuda.is_available() else "cpu",
    )  # type: ignore
)
syntax: Indentation is inconsistent in the EngineArgs constructor (the closing parenthesis is indented to the argument level rather than aligned with the opening call)
import copy
import os

import numpy as np
style: numpy is imported but never used in this file
def encode_post(self, classes) -> dict[str, float]:
    """runs post encoding such as normalization"""
    return classes
logic: incorrect type hint - based on the test file, this method returns list[list[dict]], not dict[str, float]
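A minimal sketch of the corrected signature, assuming the list[list[dict]] shape observed in the test file. The class wrapper here is a hypothetical stand-in for illustration, not the real OptimumClassifier:

```python
class ClassifierSketch:
    # Hypothetical stand-in illustrating only the corrected type hint.
    def encode_post(self, classes: list[list[dict]]) -> list[list[dict]]:
        """Runs post encoding such as normalization (a pass-through here)."""
        return classes

# One inner list of {label, score} dicts per input text, matching the
# HuggingFace pipeline output shape the tests compare against.
out = ClassifierSketch().encode_post([[{"label": "POSITIVE", "score": 0.98}]])
print(out)  # [[{'label': 'POSITIVE', 'score': 0.98}]]
```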
if CHECK_TRANSFORMERS.is_available:
    from transformers import AutoConfig, AutoTokenizer, pipeline  # type: ignore[import-untyped]
style: AutoConfig is imported but never used
Thanks for opening this, and for opening an issue beforehand! Looks good so far. There is a "make precommit" command in the Makefile.
Codecov Report

Attention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files:

@@           Coverage Diff           @@
##             main     #492   +/-  ##
=====================================
+ Coverage   79.53%   79.58%   +0.05%
=====================================
  Files          41       42       +1
  Lines        3430     3468      +38
=====================================
+ Hits         2728     2760      +32
- Misses        702      708       +6

☔ View full report in Codecov by Sentry.
Thanks for all the work!
Related Issue
Resolves #488
Checklist
Additional Notes
Add any other context about the PR here.