[Misc] Minimum requirements for SageMaker compatibility #11575
Will remake this to clean the git history and sign properly.
Fixes #11557
Implements `/ping` and `/invocations`, and creates an alternate Dockerfile, identical to `vllm-openai` but with the entrypoint setting the port to 8080. Since the OpenAI server is more "production-ready", we use this functionality and its handlers as the base.
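For context, a minimal sketch of what the two endpoints look like on top of a FastAPI app like the one used by the OpenAI server; `route_invocation` is a hypothetical helper here, and the actual dispatch is described under Considerations below.

```python
# Minimal sketch, assuming the FastAPI app of the OpenAI-compatible server;
# route_invocation is a hypothetical dispatch helper (see Routing below).
from fastapi import FastAPI, Request
from fastapi.responses import Response

app = FastAPI()

@app.get("/ping")
async def ping() -> Response:
    # SageMaker polls /ping for container health; a 200 means healthy.
    return Response(status_code=200)

@app.post("/invocations")
async def invocations(raw_request: Request):
    # SageMaker sends all inference traffic to /invocations; the body is
    # parsed and handed to the appropriate OpenAI-style handler.
    body = await raw_request.json()
    return await route_invocation(body, raw_request)  # hypothetical helper
```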
Considerations:
Dockerfile
The Dockerfile order has changed, defining the `vllm-sagemaker` image first, then building from that for `vllm-openai`. This avoids repeating the additional dependencies, and still defines `vllm-openai` last, so that it remains the default for `docker build`. If we don't like using `vllm-sagemaker` as the base for `vllm-openai`, we can simply repeat the additional requirements in both and revert to `FROM vllm-base AS vllm-openai`.
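To make the ordering concrete, here is a sketch of the stage layout under the scheme above; the dependency list and entrypoint flags are illustrative rather than copied from the PR:

```dockerfile
# Sketch of the proposed stage order: vllm-sagemaker is defined first, and
# vllm-openai builds FROM it and stays last, so it remains the default
# target for a plain `docker build`.
FROM vllm-base AS vllm-sagemaker
# Extra serving dependencies are installed once, in the shared stage.
RUN pip install accelerate hf_transfer modelscope
# SageMaker requires the container to listen on port 8080.
ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server", "--port", "8080"]

FROM vllm-sagemaker AS vllm-openai
# Default entrypoint, without the SageMaker port override.
ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]
```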
Routing
`/invocations` uses `model_validate` and checks whether `messages` is in the request to determine whether it is a chat input.
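As a sketch of that check, assuming pydantic v2 request models in the style of `vllm.entrypoints.openai.protocol` (the models below are simplified stand-ins, not the real protocol classes):

```python
# Simplified stand-ins for the real protocol models; only the fields
# needed to show the dispatch are included.
from pydantic import BaseModel

class CompletionRequest(BaseModel):
    model: str
    prompt: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: list[dict]

def parse_invocation(body: dict) -> BaseModel:
    # The presence of "messages" marks a chat request; model_validate then
    # enforces the full schema of the selected request type.
    if "messages" in body:
        return ChatCompletionRequest.model_validate(body)
    return CompletionRequest.model_validate(body)

# Example: a body containing "messages" validates as a chat request.
req = parse_invocation(
    {"model": "m", "messages": [{"role": "user", "content": "hi"}]}
)
assert isinstance(req, ChatCompletionRequest)
```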
Note that this makes no changes to other images or APIs. IMO it should be OK to integrate these endpoints for the purpose of expanding to SageMaker use cases, without offering the full flexibility of being able to make requests to all the endpoints.
I have tested the new endpoints locally. I will be able to test building and deploying on SageMaker some time in the next couple of weeks, but welcome feedback.
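For reference, a local smoke test along these lines (the model name is a placeholder, and port 8080 assumes the `vllm-sagemaker` entrypoint):

```python
# Hypothetical local smoke test of the two new endpoints.
import requests

# Health check: expect HTTP 200.
print(requests.get("http://localhost:8080/ping").status_code)

# Chat-style invocation: the "messages" key routes to the chat handler.
resp = requests.post(
    "http://localhost:8080/invocations",
    json={
        "model": "my-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())
```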