[do-not-merge] IBM 20241121 #237
Closed
fialhocoelho wants to merge 463 commits into main from ibm-20241121
+53,016 −18,665
Commits
This pull request is big! We're only showing the most recent 250 commits
Commits on Nov 4, 2024
[Bugfix][CI/Build][Hardware][AMD] Shard ID parameters in AMD tests running parallel jobs (vllm-project#9279)
Commits on Nov 5, 2024
Commits on Nov 6, 2024
Commits on Nov 7, 2024
[Misc] Add Gamma-Distribution Request Generation Support for Serving Benchmark. (vllm-project#10105)
Commits on Nov 8, 2024
Disable spec-decode + chunked-prefill for draft models with tensor parallelism > 1 (vllm-project#10136)
Commits on Nov 9, 2024
[Kernel][Triton] Add Triton implementation for scaled_mm_triton to support fp8 and int8 SmoothQuant, symmetric case (vllm-project#9857)
Commits on Nov 10, 2024
Commits on Nov 11, 2024
Commits on Nov 12, 2024
[BugFix] Do not raise a ValueError when tool_choice is set to the supported none option and tools are not defined. (vllm-project#10000)
[V1] Use pickle for serializing EngineCoreRequest & Add multimodal inputs to EngineCoreRequest (vllm-project#10245)
Commits on Nov 13, 2024
[Model] Add support for Qwen2-VL video embeddings input & multiple image embeddings input with varied resolutions (vllm-project#10221)
[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1 (vllm-project#9944)
Commits on Nov 14, 2024
[BugFix]: properly deserialize tool_calls iterator before processing by mistral-common when MistralTokenizer is used (vllm-project#9951)
Commits on Nov 15, 2024
[Bugfix] Ensure special tokens are properly filtered out for guided structured output with MistralTokenizer (vllm-project#10363)
Commits on Nov 16, 2024
Commits on Nov 17, 2024
Commits on Nov 18, 2024