Sync main with upstream #18
Co-authored-by: Lei Wen <[email protected]> Co-authored-by: Cade Daniel <[email protected]> Co-authored-by: Cody Yu <[email protected]>
…-project#4660) [Core][Distributed] support both cpu and device tensor in broadcast tensor dict (vllm-project#4660)
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: z103cb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…project#4592) Co-authored-by: Cade Daniel <[email protected]>
…project#4400) Co-authored-by: Michael Goin <[email protected]>
@z103cb: The following test failed, say
/lgtm
This PR implements a subset of the metrics from the TGIS image. I tried to make sure that everything from our current ops dashboard is supported. These are:
- tgi_tokenize_request_tokens
- tgi_tokenize_request_input_count
- tgi_request_input_count
- tgi_request_failure
- tgi_request_queue_duration
- tgi_queue_size
- tgi_batch_current_size
- tgi_batch_inference_duration
- tgi_request_input_length
- tgi_request_generated_tokens
---------
Signed-off-by: Joe Runde <[email protected]>
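For readers unfamiliar with how metrics like these reach a dashboard, here is a minimal standard-library sketch of rendering two of the listed names in the Prometheus text exposition format. It is not the code from this PR: the `Metric` class, the `render` helper, the help strings, and the choice of gauge for `tgi_queue_size` and counter for `tgi_request_failure` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """Hypothetical in-memory metric; not the PR's implementation."""
    name: str
    kind: str        # "counter" or "gauge"
    help_text: str
    value: float = 0.0

    def inc(self, amount: float = 1.0) -> None:
        # Counters may only go up; gauges may move in either direction.
        if self.kind == "counter" and amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount

    def set(self, value: float) -> None:
        if self.kind != "gauge":
            raise ValueError("only gauges can be set directly")
        self.value = value


def render(metrics) -> str:
    """Render metrics as Prometheus text exposition lines."""
    lines = []
    for m in metrics:
        lines.append(f"# HELP {m.name} {m.help_text}")
        lines.append(f"# TYPE {m.name} {m.kind}")
        lines.append(f"{m.name} {m.value}")
    return "\n".join(lines) + "\n"


# Metric names come from the list above; types and help text are assumed.
queue_size = Metric("tgi_queue_size", "gauge", "Requests waiting in the queue")
request_failure = Metric("tgi_request_failure", "counter", "Failed requests")

queue_size.set(3)
request_failure.inc()
text = render([queue_size, request_failure])
print(text)
```

In practice a Prometheus client library would likely handle registration and exposition; the sketch only shows the shape of the output a scraper reads.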
make the vllm setup mode configurable and make install mode as defaul…
Synchronizing main with upstream. Supersedes #15