Pull requests: vllm-project/llm-compressor


- Use 1 GPU for offloading examples (#979, opened Dec 15, 2024 by dsikka)
- Fix SmoothQuant offload bug (#978, opened Dec 15, 2024 by dsikka, draft)
- fix conseq onehsot (#971, opened Dec 11, 2024 by horheynm, draft)
- update test_run_compressed (#970, opened Dec 11, 2024 by horheynm, draft)
- Bitmask test (#956, opened Dec 5, 2024 by rahul-tuli, draft)
- Replace tokenizer with processor (#955, opened Dec 5, 2024 by kylesayrs)
- Dataset split fallbacks (#953, opened Dec 4, 2024 by kylesayrs)
- Add int8 discussion section in readme (#944, opened Nov 29, 2024 by kylesayrs)
- Vision Datasets (#943, opened Nov 28, 2024 by kylesayrs, draft)
- Remove uses of get_observer (#939, opened Nov 27, 2024 by kylesayrs)
- Add recipe check vllm e2e (#929, opened Nov 21, 2024 by horheynm)
- Allow Shortcutting Min-max Observer (#887, opened Nov 1, 2024 by kylesayrs)
- FSDP utils cleanup (#854, opened Oct 19, 2024 by kylesayrs)
- Awq re implementation (#824, opened Oct 7, 2024 by rahul-tuli, draft)
- KV Cache, E2E Tests (#742, opened Oct 1, 2024 by horheynm)
- Add: Weight clipping to AWQModifier (#184, opened Sep 18, 2024 by rahul-tuli)