enhancement(slurmctld): Make slurm respect memory constraints #46
These changes remove the unused "cluster_name" from the relation data sent to slurmd on the slurmd relation. Additionally, make the "cluster_name" property private.
These changes add a peer relation for the slurmctld charm and use the new slurmctld-peer relation, instead of the slurmd interface, to obtain the ingress_address. The reason for this change is that we do not want to depend on the existence of the slurmd relation in order to know our own IP address. With a peer relation we will always have resolvability, so long as Juju knows the IP address of the unit.
Slurm process tracking is currently not configured to kill child processes of a job. By default, set SignalChildrenProcesses=yes in cgroup.conf to enable this functionality.
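A minimal sketch of how such a default could be rendered into cgroup.conf text. Only `SignalChildrenProcesses=yes` comes from the change described above; the `CGROUP_DEFAULTS` dictionary and the `render_cgroup_conf` helper are hypothetical names for illustration, not the charm's actual code.

```python
# Hypothetical defaults table; only SignalChildrenProcesses=yes is taken
# from the change description. The charm's real structure may differ.
CGROUP_DEFAULTS = {"SignalChildrenProcesses": "yes"}


def render_cgroup_conf(overrides=None):
    """Return cgroup.conf text: defaults first, explicit overrides win."""
    params = {**CGROUP_DEFAULTS, **(overrides or {})}
    return "\n".join(f"{key}={value}" for key, value in params.items()) + "\n"


if __name__ == "__main__":
    print(render_cgroup_conf(), end="")
```

Keeping the default in a table that overrides are merged over means an operator can still opt out by passing `{"SignalChildrenProcesses": "no"}`.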
These changes introduce default values for `SelectTypeParameters` and `MemSpecLimit`. `SelectTypeParameters=CR_CPU_Memory` enables memory allocation enforcement. `MemSpecLimit=1024` ensures that each system always has 1G of memory that jobs cannot consume.
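The effect of the two defaults can be illustrated with a little arithmetic. The sketch below is a simplified model, not Slurm's scheduling logic: it assumes memory is tracked in megabytes (as `MemSpecLimit` is), and the function names and the 16384 MB node size are made up for the example.

```python
def allocatable_memory_mb(real_memory_mb, mem_spec_limit_mb=1024):
    """Memory (MB) left for jobs after reserving mem_spec_limit_mb for the system."""
    if mem_spec_limit_mb >= real_memory_mb:
        raise ValueError("reserved memory must be smaller than the node's memory")
    return real_memory_mb - mem_spec_limit_mb


def job_fits(requested_mb, allocated_mb, real_memory_mb, mem_spec_limit_mb=1024):
    """Simplified check: with memory treated as a consumable resource, a job is
    only placed if its request fits in what is still unallocated."""
    limit = allocatable_memory_mb(real_memory_mb, mem_spec_limit_mb)
    return requested_mb + allocated_mb <= limit


if __name__ == "__main__":
    node_mb = 16384  # hypothetical node size
    print(allocatable_memory_mb(node_mb))   # 15360
    print(job_fits(15000, 0, node_mb))      # True
    print(job_fits(1000, 15000, node_mb))   # False
```

Under this model, a hypothetical 16384 MB node offers 15360 MB to jobs, and a request that would push total allocations past that figure is not placed on the node.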
f2af1a2 to 7e5d358
Just one small comment I'd like addressed, and then I'm good to merge this PR!
Co-authored-by: Jason Nucciarone <[email protected]>
LGTM!
There seems to be a rebase conflict preventing me from rebasing and merging, so I'll squash and merge instead.
These changes introduce default values for `SelectTypeParameters` and `MemSpecLimit`. `SelectTypeParameters=CR_CPU_Memory` to enable memory allocation enforcement. `MemSpecLimit=1024` to ensure systems always have 1G available memory that jobs cannot consume.

Fixes: #36