slurm scheduler statistics #45
Comments
One could also perhaps use these: collectd/collectd#1198
ansible-role-xdmod: CSCfi/ansible-role-xdmod#1
Using xdmod is much better than using collectd/grafana, as one can drill down into the statistics. Shall we keep this open until we have an xdmod that is installed once per site or CSC has a central one?
I couldn't find any of the "sdiag" statistics (stats from the scheduler) in xdmod. For example, how long it takes for the backfill scheduler to run and how deep it looks. Are they in there somewhere? Can we add them?
Let's try the collectd way.
http://giovannitorres.me/graphing-sdiag-with-graphite.html
Grab slurm statistics with https://github.com/PySlurm/pyslurm
The statistics from sdiag can be quite useful - we should make a collectd script which outputs the slurm statistics.
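A rough sketch of what such a collectd script could look like, written for the collectd exec plugin and its plain-text `PUTVAL` protocol. The function names, the `slurm-sdiag` plugin instance and the metric naming are made up for illustration, and the parser assumes sdiag prints `Name: integer` lines grouped under a "Main schedule statistics" and a "Backfilling stats" header; check this against the sdiag output of your Slurm version.

```python
import os
import re
import subprocess


def parse_sdiag(text):
    """Collect 'Name: integer' lines from sdiag-style text into a dict.

    Keys are prefixed with the section they appear in, because the main
    scheduler and the backfill scheduler reuse names such as 'Last cycle'.
    """
    stats = {}
    section = "general"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        lowered = line.lower()
        # Assumed section headers; adjust if your sdiag prints them differently.
        if lowered.startswith("main schedule statistics"):
            section = "main"
            continue
        if lowered.startswith("backfilling stats"):
            section = "backfill"
            continue
        match = re.match(r"^(.+?):\s+(-?\d+)$", line)
        if not match:
            continue  # skip timestamps and other non-integer lines
        name = re.sub(r"[^a-z0-9]+", "_", match.group(1).lower()).strip("_")
        stats[section + "_" + name] = int(match.group(2))
    return stats


def format_putval(host, stats, interval):
    """Turn the parsed stats into collectd exec-plugin PUTVAL lines."""
    return [
        'PUTVAL "%s/slurm-sdiag/gauge-%s" interval=%d N:%d'
        % (host, name, interval, value)
        for name, value in sorted(stats.items())
    ]


def run_once():
    """Run sdiag once and print PUTVAL lines (needs sdiag on the PATH)."""
    host = os.environ.get("COLLECTD_HOSTNAME", "localhost")
    interval = int(float(os.environ.get("COLLECTD_INTERVAL", "60")))
    output = subprocess.check_output(["sdiag"], universal_newlines=True)
    for line in format_putval(host, parse_sdiag(output), interval):
        print(line, flush=True)
```

To use it with the exec plugin one would call `run_once()` in a loop that sleeps for the interval between runs. Everything is reported as a `gauge` here for simplicity; counters such as the number of submitted jobs may be better served by another collectd type.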
The output from sinfo/squeue can also be of interest (to show pending nodes/cores/memory).
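For the pending side, a possible starting point is to sum up what the pending jobs ask for. This sketch assumes the text comes from something like `squeue -h -t PENDING -o "%D %C %m"` (node count, CPU count, minimum memory per job) and that the memory column is either plain megabytes or carries a K/M/G/T suffix. The function names are hypothetical, and since `%m` is a per-job minimum the memory total is only a rough indicator.

```python
def parse_mem_mb(token):
    """Convert a memory token such as '4000M' or '4G' to megabytes."""
    units = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0, "T": 1024.0 * 1024}
    token = token.strip()
    if token and token[-1].upper() in units:
        return float(token[:-1]) * units[token[-1].upper()]
    return float(token)  # assumption: a bare number is already in megabytes


def sum_pending(squeue_text):
    """Sum nodes, cores and memory over 'nodes cpus mem' lines."""
    totals = {"jobs": 0, "nodes": 0, "cores": 0, "mem_mb": 0.0}
    for line in squeue_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # not a 'nodes cpus mem' line
        try:
            nodes = int(fields[0])
            cores = int(fields[1])
            mem_mb = parse_mem_mb(fields[2])
        except ValueError:
            continue  # skip lines that do not parse as numbers
        totals["jobs"] += 1
        totals["nodes"] += nodes
        totals["cores"] += cores
        totals["mem_mb"] += mem_mb
    return totals
```

The resulting totals could be reported to collectd in the same way as the sdiag values, one gauge per key.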