Adapt to latest vllm changes #632
4aad447 to 6bfddb6
Approved, matches changes for "GenAIExamples" in opea-project/GenAIExamples#1210 (and corresponding PR for "GenAIComps" repo).
@poussa ?
- Remove --enforce-eager on hpu to improve performance
- Refactor to the upstream docker entrypoint changes

Fixes issue opea-project#631.

Signed-off-by: Lianhao Lu <[email protected]>
Investigating the CI failure for "agent, gaudi, ci-gaudi-values, common" test, I see 2 bugs:
(Besides the size, I think another model would be a nicer default because of the license used for Meta's models.)
My vLLM PR includes the same agent and (relevant) vLLM component changes as yours, but strangely the same CI agent test succeeded for it: https://github.com/opea-project/GenAIInfra/actions/runs/12262626198/job/34212355870?pr=610
EDIT: today's push on my PR hit the same issue.
The --tensor-parallel-size option can be dropped, as 1 is the default value: https://docs.vllm.ai/en/latest/usage/engine_args.html
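To illustrate the suggestion above, here is a minimal sketch of trimming an argument list so that options set to their default value are not passed explicitly. It is not code from this PR: the `KNOWN_DEFAULTS` table, the `drop_default_args` helper, and the example model name and port are all hypothetical. Only the `--tensor-parallel-size` default of 1 comes from the comment above.

```python
# Hypothetical table of defaults. Only this one entry is taken from the
# review comment; extend it only with defaults you have verified.
KNOWN_DEFAULTS = {"--tensor-parallel-size": "1"}


def drop_default_args(args):
    """Return a copy of args without `--flag value` pairs whose value
    equals the known default for that flag."""
    result = []
    i = 0
    while i < len(args):
        flag = args[i]
        has_value = i + 1 < len(args)
        if has_value and KNOWN_DEFAULTS.get(flag) == args[i + 1]:
            i += 2  # skip the flag and its redundant value
            continue
        result.append(flag)
        i += 1
    return result


# Placeholder arguments for illustration only (not the chart's real values).
example_args = [
    "--model", "example/model",
    "--tensor-parallel-size", "1",
    "--port", "2080",
]
trimmed_args = drop_default_args(example_args)
print(trimmed_args)
```

The sketch only handles the space-separated `--flag value` form. A `--flag=value` spelling would pass through unchanged, and a non-default value such as 2 is always kept.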
We need to wait for PR #642 to land first.
Description
Issues
Fixes #631.
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
List any newly introduced third-party dependencies.
Tests
Describe the tests that you ran to verify your changes.