ChatQnA - add files to deploy the ChatQnA application on AMD ROCm with the vLLM service #1181

Open · wants to merge 11 commits into main
18 changes: 18 additions & 0 deletions ChatQnA/docker_compose/amd/gpu/rocm-vllm/Dockerfile-vllm
@@ -0,0 +1,18 @@
FROM rocm/vllm:rocm6.2_mi300_ubuntu20.04_py3.9_vllm_0.6.4
Collaborator:

Let's put the Dockerfile under GenAIExamples/ChatQnA/; it can be named Dockerfile.vllm_rocm.

Author:

Good. I will place the file at the proposed path and adapt the scripts.
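For context, after the move the image would be built with an explicit -f flag, since the file no longer uses the default Dockerfile name. A minimal sketch, run from GenAIExamples/ChatQnA/ (the chatqna-vllm-rocm tag is an arbitrary example, not something defined in this PR):

docker build -f Dockerfile.vllm_rocm -t chatqna-vllm-rocm .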


# Set the working directory
WORKDIR /workspace

# Copy the api_server.py into the image
ADD https://raw.githubusercontent.com/ROCm/vllm/a466f09d7f20ca073f21e3f64b8c9487e4c4ff4b/vllm/entrypoints/sync_openai/api_server.py /workspace/api_server.py

# Expose the port used by the API server
EXPOSE 8011

# Set environment variables: cache Hugging Face downloads in the workspace,
# disable vLLM's Triton flash-attention path, and turn off the PyTorch JIT
ENV HUGGINGFACE_HUB_CACHE=/workspace
ENV VLLM_USE_TRITON_FLASH_ATTENTION=0
ENV PYTORCH_JIT=0

# Set the entrypoint to the api_server.py script
ENTRYPOINT ["python3", "/workspace/api_server.py"]
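A hedged smoke test of the built image could look like the following. The /dev/kfd and /dev/dri device mappings plus the video group are the standard requirements for GPU access in ROCm containers; the image tag and <model-id> are placeholders, and passing --model through the entrypoint assumes the sync_openai api_server accepts the usual vLLM engine flags, which this PR does not spell out:

docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --security-opt seccomp=unconfined \
  -p 8011:8011 \
  chatqna-vllm-rocm \
  --model <model-id>

# If the server exposes the usual OpenAI-compatible routes, this lists the served models
curl http://localhost:8011/v1/models

Note that HUGGINGFACE_HUB_CACHE points at /workspace, the same directory that holds api_server.py, so bind-mounting a host Hugging Face cache over /workspace would hide the entrypoint script; a different cache path inside the container would be needed to persist downloaded weights across runs.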