Unclear Metrics list #2887

Open
2 of 4 tasks
vitalyshalumov opened this issue Jan 7, 2025 · 1 comment

@vitalyshalumov

System Info

Latest docker

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Run TGI and monitor it with Grafana.

Expected behavior

Can you please provide a more thorough explanation of the exported metrics? For example, which metric is the time to first token?

@drbh
Collaborator

drbh commented Jan 8, 2025

Hi @vitalyshalumov, thank you for opening this issue. Please see the "Monitoring TGI server with Prometheus and Grafana dashboard" docs, the metrics reference docs, and the original PR that adds the `/metrics` route. These resources should provide details about each metric.

Regarding time to first token: the premade Grafana dashboard has a chart/query for this metric at https://github.com/huggingface/text-generation-inference/blob/main/assets/tgi_grafana.json, and the value can be calculated from the prefill time.
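
As a rough illustration of deriving a number from the prefill time, here is a minimal Python sketch that averages one histogram series from Prometheus text exposition output by dividing its `_sum` by its `_count`. The metric name `tgi_batch_inference_duration`, the `method="prefill"` label, and the sample values are assumptions made up for this example, so please check them against your own server's `/metrics` output and the dashboard JSON before relying on them. The helper `histogram_mean` is hypothetical and not part of TGI.

```python
import re

# Made-up sample in Prometheus text exposition format. The metric name and
# label are assumptions; verify them against your server's /metrics output.
SAMPLE = """\
# TYPE tgi_batch_inference_duration histogram
tgi_batch_inference_duration_bucket{method="prefill",le="0.1"} 3
tgi_batch_inference_duration_bucket{method="prefill",le="+Inf"} 4
tgi_batch_inference_duration_sum{method="prefill"} 0.5
tgi_batch_inference_duration_count{method="prefill"} 4
tgi_batch_inference_duration_sum{method="decode"} 2.0
tgi_batch_inference_duration_count{method="decode"} 100
"""


def histogram_mean(text, metric, label_filter):
    """Return sum/count for one histogram series, or None if absent or empty."""
    values = {}
    for line in text.splitlines():
        # Skip comment lines and series that do not carry the wanted label.
        if line.startswith("#") or label_filter not in line:
            continue
        for suffix in ("_sum", "_count"):
            pattern = re.escape(metric + suffix) + r"\{[^}]*\}\s+(\S+)$"
            match = re.match(pattern, line)
            if match:
                values[suffix] = float(match.group(1))
    if "_sum" not in values or values.get("_count", 0) == 0:
        return None
    return values["_sum"] / values["_count"]


mean_prefill = histogram_mean(
    SAMPLE, "tgi_batch_inference_duration", 'method="prefill"'
)
print(mean_prefill)  # mean prefill duration in seconds for the sample data
```

This gives a lifetime average since the server started. In Grafana you would more likely apply `rate()` to the `_sum` and `_count` series, or `histogram_quantile()` to the buckets, to get values over a time window.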

I hope this is helpful. Thank you!
