An unorthodox proof-of-concept (PoC) workaround for the following issue:
Singularity, cgroup memory.limits, mmaped strangeness
The idea is to outrace the situation described in that issue: the slurm-singularity job that is about to reach its memory limit is terminated preemptively, before it can trigger the stall.
This is a kernel module that attaches a kprobe to the mem_cgroup_oom_synchronize function.
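A kprobe registered by symbol name can only attach if the running kernel lists that symbol. As a minimal userspace sketch (not part of the module), the check below scans /proc/kallsyms for the probed function. The helper names kallsyms_line_matches and kernel_has_symbol are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if one /proc/kallsyms line ("address type name [module]")
 * names exactly the given symbol, 0 otherwise. */
static int kallsyms_line_matches(const char *line, const char *symbol)
{
    char name[256];

    assert(line != NULL && symbol != NULL);
    /* Skip the address and the one-letter type, then read the name. */
    if (sscanf(line, "%*s %*c %255s", name) != 1)
        return 0;
    return strcmp(name, symbol) == 0;
}

/* Return 1 if the running kernel lists the symbol, 0 if it does not,
 * and -1 if /proc/kallsyms cannot be opened. */
static int kernel_has_symbol(const char *symbol)
{
    char line[512];
    int found = 0;
    FILE *f = fopen("/proc/kallsyms", "r");

    if (!f)
        return -1;
    while (!found && fgets(line, sizeof(line), f))
        found = kallsyms_line_matches(line, symbol);
    fclose(f);
    return found;
}
```

Calling kernel_has_symbol("mem_cgroup_oom_synchronize") before insmod gives an early hint whether the probe has a target on a given kernel build.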
When the probe fires, the pre_handler checks two preconditions that the triggering process must meet:
- it is a child of slurmstepd
- it is a child of singularity
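The two checks above can be approximated in userspace C by walking parent links through /proc/<pid>/stat. This is only a sketch of the logic: inside the kernel the module would presumably follow task_struct parent pointers rather than read /proc. All function names are hypothetical, and the command names "slurmstepd" and "singularity" are assumptions taken from the wording of the list; the comm values seen on a real node may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Parse one /proc/<pid>/stat line, "pid (comm) state ppid ...".
 * comm may itself contain spaces or ')', so the last ')' ends it. */
static int parse_stat_line(const char *line, char *comm, size_t comm_len,
                           pid_t *ppid)
{
    const char *lparen, *rparen;
    char state;
    int parent;
    size_t n;

    assert(line != NULL && comm != NULL && ppid != NULL);
    lparen = strchr(line, '(');
    rparen = strrchr(line, ')');
    if (!lparen || !rparen || rparen < lparen || comm_len == 0)
        return -1;
    n = (size_t)(rparen - lparen) - 1;
    if (n >= comm_len)
        n = comm_len - 1;          /* truncate overly long names */
    memcpy(comm, lparen + 1, n);
    comm[n] = '\0';
    if (sscanf(rparen + 1, " %c %d", &state, &parent) != 2)
        return -1;
    *ppid = (pid_t)parent;
    return 0;
}

/* Read comm and ppid of one process; -1 if it is gone or unreadable. */
static int read_stat(pid_t pid, char *comm, size_t comm_len, pid_t *ppid)
{
    char path[64], line[512];
    FILE *f;
    int ok;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = fgets(line, sizeof(line), f) != NULL;
    fclose(f);
    return ok ? parse_stat_line(line, comm, comm_len, ppid) : -1;
}

/* Return 1 if any ancestor of pid has the given command name. */
static int has_ancestor_named(pid_t pid, const char *name)
{
    char comm[64];
    pid_t ppid;

    if (read_stat(pid, comm, sizeof(comm), &ppid) != 0)
        return 0;
    while (ppid > 0) {             /* the chain ends at ppid 0 */
        pid = ppid;
        if (read_stat(pid, comm, sizeof(comm), &ppid) != 0)
            return 0;
        if (strcmp(comm, name) == 0)
            return 1;
    }
    return 0;
}

/* Both preconditions together (command names are assumptions). */
static int is_slurm_singularity_task(pid_t pid)
{
    return has_ancestor_named(pid, "slurmstepd") &&
           has_ancestor_named(pid, "singularity");
}
```

Requiring both ancestors keeps the probe away from ordinary processes that hit a memory limit outside a Slurm-launched container.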
If both are true, the probe does two things:
- SLURM NOTIFICATION: signals the eventfd of the ancestor slurmstepd to notify Slurm that an "oom" is about to happen in that cgroup
- JOB TERMINATION: sends SIGKILL ("oom") to the slurmstepd child that is an ancestor of the triggering process
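The two actions can be illustrated from userspace with the same primitives. This is a hedged sketch rather than the module's code: the kernel side would signal the eventfd through the in-kernel eventfd API instead of write(2). It shows how an eventfd counter carries the notification and how the job process is killed. The function names notify_oom, read_oom_events and terminate_job are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Notifier side: add one event to the eventfd counter. A listener
 * blocked in read() or poll() on the same descriptor wakes up. */
static int notify_oom(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Listener side: fetch the number of pending events and reset the
 * counter. Fails on a non-blocking eventfd with nothing pending. */
static int read_oom_events(int efd, uint64_t *count)
{
    assert(count != NULL);
    return read(efd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}

/* Kill the job's top-level process. pid <= 1 is refused because
 * kill(2) treats 0 and negative values as process groups, and pid 1
 * is init. */
static int terminate_job(pid_t pid)
{
    if (pid <= 1)
        return -1;
    return kill(pid, SIGKILL);
}
```

A listener would create the descriptor with eventfd(0, 0) and block in read_oom_events. Sending the notification before the SIGKILL means the listener already holds an OOM event when it sees the job die.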
make
- sparse output...
insmod kp_oom.ko
- with debugging enabled
insmod kp_oom.ko dyndbg=+pmfl
CentOS Linux release 7.9.2009 (Core)
3.10.0-1127.19.1.el7.x86_64 x86_64 GNU/Linux
singularity version 3.6.4-1.el7
salloc --reservation=pj srun singularity run /groups/it/pja/sing/centos.img mempoc 2 16
sbatch --reservation=pj --wrap='singularity run /groups/it/pja/sing/centos.img mempoc 2 16'
srun --reservation=pj --pty singularity run /groups/it/pja/sing/centos.img mempoc 2 16
salloc --reservation=pj srun /groups/it/pja/sing/mempoc 2 16
sbatch --reservation=pj --wrap='/groups/it/pja/sing/mempoc 2 16'
srun --reservation=pj --pty /groups/it/pja/sing/mempoc 2 16