[serving] Updates onnxruntime to 1.20.0 and adds integration tests #2615
Conversation
@@ -16,7 +16,7 @@ ARG djl_version
 ARG djl_serving_version
 ARG python_version=3.11
 ARG djl_torch_version=2.5.1
-ARG djl_onnx_version=1.19.0
+ARG djl_onnx_version=1.20.0
typo? should this be 1.20.1?
The onnxruntime Java package only has 1.20.0: https://mvnrepository.com/artifact/com.microsoft.onnxruntime/onnxruntime_gpu
Just realized I had 1.20.1 in the title; fixed it.
serving/docker/lmi.Dockerfile (Outdated)
@@ -81,6 +81,7 @@ RUN apt-get update && apt-get install -yq libaio-dev libopenmpi-dev g++ unzip cu
 COPY requirements-lmi.txt ./requirements.txt
 RUN pip3 install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124 && pip3 cache purge
 RUN pip3 install -r requirements.txt \
+    && pip3 install onnxruntime-gpu==$djl_onnx_version \
Can we add this to the requirements-lmi.txt file?
To make onnxruntime-gpu work, it needs to be installed after onnxruntime. I found that if I put it in requirements-lmi.txt, the install order is not guaranteed.
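For illustration, this is roughly how the diff above forces that ordering; a sketch, not the authoritative Dockerfile (the trailing cache purge is assumed, since the diff truncates after the continuation backslash):

```dockerfile
# Sketch mirroring the diff above: install the base requirements first
# (which may pull in the CPU-only onnxruntime transitively), then install
# onnxruntime-gpu in an explicitly later step so its GPU-enabled build
# takes precedence. $djl_onnx_version is the build ARG defined earlier.
RUN pip3 install -r requirements.txt \
    && pip3 install onnxruntime-gpu==$djl_onnx_version \
    && pip3 cache purge
```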
Do we need to install both onnxruntime and onnxruntime-gpu? Could we replace onnxruntime in the requirements file with onnxruntime-gpu==1.20.0?
- djl_converter pulls in onnxruntime as a dependency: https://github.com/deepjavalibrary/djl/blob/master/extensions/tokenizers/src/main/python/requirements.txt#L6. I have moved djl_converter out of requirements-lmi.txt and install it with --no-deps.
- Moved onnxruntime-gpu into requirements-lmi.txt.
ARG djl_onnx_version=1.20.0

# djl converter wheel for text-embedding use case
ARG djl_converter_wheel="https://publish.djl.ai/djl_converter/djl_converter-0.31.0-py3-none-any.whl"
Once we update this wheel to 0.32.0, will we be able to include it in the requirements file?
I can make some changes to include it in the requirements file.
Ideally we should put all Python dependencies in the requirements file so that we can see any dependency conflicts at build time.
The torch dependencies are outside for two reasons:
- torch by itself is 5-6 GB, so a separate layer helps parallelize the image download
- we're using an index URL to fetch the CUDA 12.4 (cu124) build.
Other than that, we should put deps into the requirements file (torch is also pinned in the requirements file to ensure no version conflicts with other deps).
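As a sketch of that split (version pins taken from the diff earlier in this thread; the exact Dockerfile lines may differ):

```dockerfile
# torch/torchvision get their own layer: the multi-GB download parallelizes
# better across image layers, and the cu124 index URL applies only here.
RUN pip3 install torch==2.5.1 torchvision==0.20.1 \
    --index-url https://download.pytorch.org/whl/cu124 && pip3 cache purge

# Everything else comes from the requirements file, where torch is also
# pinned so pip surfaces any version conflict with other deps at build time.
RUN pip3 install -r requirements.txt
```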
Because a requirements file does not support --no-deps, I had to move the djl_converter wheel outside of requirements.txt. I can remove the onnxruntime dependency from djl_converter, so we won't need --no-deps for it.
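A hedged sketch of the two options (pip accepts --no-deps only on the command line, not as a per-line option inside a requirements file; the exact RUN line is an assumption, not taken from the diff):

```dockerfile
# Option 1 (current): keep the converter wheel out of requirements-lmi.txt
# and install it separately, skipping its dependencies so it cannot drag in
# the CPU-only onnxruntime. ${djl_converter_wheel} is the ARG shown earlier.
RUN pip3 install --no-deps ${djl_converter_wheel}

# Option 2 (proposed): drop the onnxruntime pin from djl_converter's own
# requirements upstream; the wheel could then move back into
# requirements-lmi.txt without needing --no-deps.
```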