Latest docker
Run TGI and monitor with Grafana
Can you please provide a more thorough explanation of the exported metrics? For example, which metric is the time to first token?
Hi @vitalyshalumov, thank you for opening this issue. Please see the "Monitoring TGI server with Prometheus and Grafana dashboard" docs, the metrics reference docs, and the original PR that added the /metrics route. These resources should provide details about each metric.
Regarding time to first token, the premade Grafana dashboard has a chart/query for this metric here: https://github.com/huggingface/text-generation-inference/blob/main/assets/tgi_grafana.json. It can be calculated from the prefill time.
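For a quick check outside Grafana, here is a rough sketch of how a mean prefill time could be derived from the text returned by the /metrics route. It assumes the prefill time is exposed as `tgi_batch_inference_duration` with a `method="prefill"` label, and that `_sum` and `_count` series are present; please verify both against the metrics reference for your TGI version. The helper name `metric_mean` and the sample values are made up for illustration.

```python
import re

# One Prometheus text-format sample line: name, optional {labels}, value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def metric_mean(metrics_text, name, label_filter=""):
    """Return sum/count for a histogram or summary, or None if unavailable.

    `label_filter` is a plain substring that must appear in the label set,
    e.g. 'method="prefill"'.
    """
    total = None
    count = None
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        metric, labels, value = match.group(1), match.group(2) or "", match.group(3)
        if label_filter and label_filter not in labels:
            continue
        if metric == name + "_sum":
            total = float(value)
        elif metric == name + "_count":
            count = float(value)
    if total is None or not count:
        return None
    return total / count


# Hypothetical scrape output; metric name and values are illustrative only.
SAMPLE = """\
# TYPE tgi_batch_inference_duration histogram
tgi_batch_inference_duration_sum{method="prefill"} 1.5
tgi_batch_inference_duration_count{method="prefill"} 10
tgi_batch_inference_duration_sum{method="decode"} 8.0
tgi_batch_inference_duration_count{method="decode"} 100
"""

prefill_mean = metric_mean(SAMPLE, "tgi_batch_inference_duration", 'method="prefill"')
print(prefill_mean)
```

Against a running server, the text could be fetched with `urllib.request.urlopen` on the server's /metrics URL (host and port depend on how the container was started). Note that a mean prefill time does not include time spent waiting in the queue, so it is only an approximation of what a client observes as time to first token.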
I hope this is helpful. Thank you!