Pull requests: vllm-project/vllm
Pull requests list
[benchmark] Remove dependency for H100 benchmark step
ci/build
#11572
opened Dec 27, 2024 by
khluu
[Model] Support InternLM2 Reward models
documentation
Improvements or additions to documentation
#11571
opened Dec 27, 2024 by
Isotr0py
[Frontend] Improve Error Handling
documentation
frontend
needs-rebase
#11570
opened Dec 27, 2024 by
robertgshaw2-neuralmagic
[Bugfix] Move the _touch(computed_blocks) call in the allocate_slots method to after the check for allocating new blocks.
#11565
opened Dec 27, 2024 by
sakunkun
[Model] LoRA with lm_head and embed_tokens fully trained - 3
#11558
opened Dec 27, 2024 by
sergeykochetkov
[Frontend] [Bugfix] Refactor tool parsers and simplify the tool parsing interface.
ci/build
frontend
#11554
opened Dec 27, 2024 by
elementary-particle
•
Draft
[Misc] Speculative Decoding: Adding Mean Accept Length Metric
#11552
opened Dec 27, 2024 by
MMuzzammil1
[V1] [5/N] API Server: unify Detokenizer and EngineCore input
ready
#11545
opened Dec 27, 2024 by
robertgshaw2-neuralmagic
Bounded peak memory in Top-P-Top-K with chunked sorting
ci/build
#11544
opened Dec 27, 2024 by
yangalan123
[Benchmark] Add benchmark script for CPU offloading
ready
ONLY add when PR is ready to merge/full CI is needed
#11533
opened Dec 26, 2024 by
ApostaC
[Core] Block Allocator to support KV cache CPU offloading
frontend
ready
#11532
opened Dec 26, 2024 by
ApostaC
[Core] Performance optimization for swap_blocks by cuda kernels
ready
#11531
opened Dec 26, 2024 by
ApostaC
[BugFix] Fix parameter names and process_after_weight_loading for W4A16 MoE Group Act Order
#11528
opened Dec 26, 2024 by
dsikka
[Platform] Move get_punica_wrapper() function to Platform
#11516
opened Dec 26, 2024 by
shen-shanshan
•
Draft
[Hardware][AMD]: Replace HIPCC version with more precise ROCm version
ci/build
rocm
#11515
opened Dec 26, 2024 by
hj-wei
[Misc] Use registry-based initialization for KV cache transfer connector.
#11481
opened Dec 25, 2024 by
KuntaiDu
[Misc] Allow initializing KV cache transfer agent when using third-party library for disaggregated prefill
#11480
opened Dec 25, 2024 by
KuntaiDu