Unclear Metrics list #2887

vitalyshalumov · 2025-01-07T14:58:13Z

Latest docker

Run TGI and monitor with Grafana

Can you please provide a more thorough explanation of the exported metrics? For example, which metric is the time to first token?

drbh · 2025-01-08T15:11:37Z

Hi @vitalyshalumov, thank you for opening this issue. Please see the "Monitoring TGI server with Prometheus and Grafana dashboard" docs, the metrics reference docs, and the original PR that adds the /metrics route. These resources should provide details about each metric.

Regarding time to first token, the premade Grafana dashboard (https://github.com/huggingface/text-generation-inference/blob/main/assets/tgi_grafana.json) has a chart/query for this metric, and it can be calculated from the prefill time.
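As a rough illustration of how a prefill-time quantile could be derived from the /metrics output, here is a minimal Python sketch. It assumes the prefill time is exported as a Prometheus histogram named `tgi_batch_inference_duration` with a `method="prefill"` label; that name, the label, and the sample numbers are assumptions for illustration only, so check them against your own server's /metrics output and the dashboard JSON. The interpolation mimics what PromQL's `histogram_quantile` does and is not code from TGI itself.

```python
import math
import re

# Hypothetical sample of /metrics output. The metric name, labels, and counts
# are assumptions for illustration; compare with your own server's output.
SAMPLE = """
tgi_batch_inference_duration_bucket{method="prefill",le="0.1"} 10
tgi_batch_inference_duration_bucket{method="prefill",le="0.5"} 60
tgi_batch_inference_duration_bucket{method="prefill",le="1"} 90
tgi_batch_inference_duration_bucket{method="prefill",le="+Inf"} 100
tgi_batch_inference_duration_bucket{method="decode",le="0.1"} 500
"""

BUCKET_RE = re.compile(r'^(\w+)_bucket\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_buckets(text, metric, method):
    """Collect (upper_bound, cumulative_count) pairs for one metric and method."""
    buckets = []
    for line in text.splitlines():
        match = BUCKET_RE.match(line.strip())
        if not match or match.group(1) != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group(2)))
        if labels.get("method") != method:
            continue
        buckets.append((float(labels["le"]), float(match.group(3))))
    return sorted(buckets)


def histogram_quantile(q, buckets):
    """Estimate the q-quantile by linear interpolation inside the matching bucket."""
    if not buckets or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    prefill = parse_buckets(SAMPLE, "tgi_batch_inference_duration", "prefill")
    print("median prefill time (s):", histogram_quantile(0.5, prefill))
```

In practice the same estimate is usually obtained in Grafana with a PromQL `histogram_quantile` query over the bucket series, so this sketch is mainly useful for understanding what that query computes.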

I hope this is helpful. Thank you!
