Pull requests: huggingface/text-generation-inference
Pull requests list
docs(conceptual/speculation): available links Train Medusa
#2863
opened Dec 23, 2024 by
guspan-tanadi
1 of 5 tasks
Update Dockerfile to use devel image for compatibility
#2848
opened Dec 16, 2024 by
YaserJaradeh
2 of 5 tasks
Add possible variants for A100 and H100 GPUs for auto-detecting flops
#2837
opened Dec 13, 2024 by
lazariv
5 tasks
bitsandbytes: upgrade and enable CUDA Graphs for 4bit by default
#2834
opened Dec 12, 2024 by
matthewdouglas
4 of 5 tasks
Enable FP8 Per-Tensor Scales and Integrate Marlin/MoE Kernels Repo for ROCm
#2825
opened Dec 11, 2024 by
mht-sharma
5 tasks
Flash decoding kernel adding and prefill-chunking and prefix caching enabling in intel cpu/xpu
#2815
opened Dec 10, 2024 by
sywangyi
5 tasks
feat: tokenize each request individually and increase warmup image size
#2802
opened Dec 5, 2024 by
drbh
Install text-generation-server from poetry.lock export
#2786
opened Nov 29, 2024 by
alvarobartt
1 of 5 tasks
Get opentelemetry trace id from request headers instead of creating a new trace
#2648
opened Oct 15, 2024 by
kozistr
3 of 5 tasks
[DOCS] Add Google Cloud TGI integration via dedicated DLCs
#2612
opened Oct 5, 2024 by
alvarobartt
Draft
1 of 5 tasks