# Migration of MPI Apps to Slurm 22.05.7

In January 2023, Oscar will migrate to Slurm version 22.05.7.

{% hint style="info" %} Slurm version 22.05.7

* improves security and speed,
* supports both PMI2 and PMIx,
* provides REST APIs, and
* allows users to prioritize their jobs via `scontrol top <job_id>` {% endhint %}

While most applications will be unaffected by these changes, applications built to make use of MPI may need to be rebuilt to work properly. To help facilitate this, we are providing users who use MPI-based applications (either through Oscar's module system or built by users) with early access to a test cluster running the new version of Slurm. Instructions for accessing the test cluster, building MPI-based applications, and submitting MPI jobs under the new Slurm are provided below.

Please note: some existing modules of MPI-based applications will be deprecated and removed from the system as part of this upgrade. A list of modules that will no longer be available after the upgrade is given at the bottom of this page.

## Instructions for Testing Applications with Slurm 22.05.7

1. Request access to the Slurm 22.05.7 test cluster (email [email protected])
2. Connect to Oscar via either SSH or Open OnDemand (instructions below)
3. Build your application against the new MPI modules listed below
4. Submit your job

{% hint style="danger" %} Users must contact [email protected] to obtain access to the test cluster in order to submit jobs using Slurm 22.05.7. {% endhint %}

### Connecting via SSH

1. Connect to Oscar using the `ssh` command in a terminal window
2. From Oscar's command line, connect to the test cluster with `ssh node1947`
3. From the node1947 command line, submit your jobs (either interactive or batch) as follows:

{% tabs %} {% tab title="Interactive job" %}

* For CPU-only jobs: `interact -q image-test`
* For GPU jobs: `interact -q gpu` {% endtab %}

{% tab title="Batch job" %} Include the appropriate line below in your batch script and then submit it with the `sbatch` command, as usual (a complete example script is sketched after these tabs):

* For CPU-only jobs: `#SBATCH -p image-test`
* For GPU jobs: `#SBATCH -p gpu` {% endtab %} {% endtabs %}
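Below is a minimal batch-script sketch for an MPI job on the test cluster. The job name, resource requests, module choice, and the `./my_mpi_app` binary are placeholders to adapt to your own application.

```bash
#!/bin/bash
# Minimal sketch of a batch script for the Slurm 22.05.7 test cluster.
# Adjust the resources and replace ./my_mpi_app with your own program.

#SBATCH -J mpi-test            # job name
#SBATCH -p image-test          # test-cluster CPU partition (use "gpu" for GPU jobs)
#SBATCH -N 2                   # number of nodes
#SBATCH -n 8                   # total number of MPI tasks
#SBATCH -t 00:30:00            # walltime limit
#SBATCH -o mpi-test-%j.out     # output file (%j expands to the job ID)

# Load the MPI module the application was built against (see the module tables below).
module load mpi/openmpi_4.0.7_gcc_10.2_slurm22

# Launch the MPI ranks through srun; Slurm 22.05.7 supports both PMI2 and PMIx.
srun --mpi=pmix ./my_mpi_app
```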

### Connecting via Open OnDemand

1. Open a web browser and connect to poodcit2.services.brown.edu
2. Log in with your Oscar username and password
3. Start a session using the Advanced Desktop App
4. Select the gpu partition and click the Launch button

{% hint style="info" %}

* Only the Advanced Desktop App will connect to the test cluster
* The Advanced Desktop App must connect to the gpu partition {% endhint %}

## MPI Applications

### Migrated or New Modules

{% hint style="info" %} If the "Current Module Version" column is blank for an application, a new module has been built for that application. {% endhint %}

| Application | Current Module Version | Migrated or New Module Version |
| --- | --- | --- |
| abaqus | 2021.1_intel17 | 2021_slurm22_a |
| ambertools | | amber22 |
| boost | 1.69 | 1.69_openmpi_4.0.7_gcc_10.2_slurm22 |
| CharMM | CharMM/c47b1_slurm20 | CharMM/c47b1 |
| cp2k | | 2022.2 |
| dedalus | 2.1905<br>2.1905_openmpi_4.05_gcc_10.2_slurm20 | 2.1905_openmpi_4.0.7_gcc_10.2_slurm22 |
| esmf | 8.4.0b12 | 8.4.0_openmpi_4.0.7_gcc_10.2_slurm22 |
| fftw | 3.3.6<br>3.3.8 | 3.3.6_openmpi_4.0.7_gcc_10.2_slurm22<br>3.3.10_slurm22 |
| global_arrays | 5.8_openmpi_4.0.5_gcc_10.2_slurm20 | 5.8_openmpi_4.0.7_gcc_10.2_slurm22 |
| gpaw | 21.1.0_hpcx_2.7.0_gcc_10.2_slurm20<br>21.1.0_openmpi_4.0.5_gcc_10.2_slurm20<br>21.1.0a_openmpi_4.0.5_gcc_10.2_slurm20 | 21.1.0_openmpi_4.0.7_gcc_10.2_slurm22 |
| gromacs | 2018.2 | gromacs/2018.2_mvapich2-2.3.5_gcc_10.2_slurm22 |
| hdf5 | | 1.10.8_mvapich2_2.3.5_gcc_10.2_slurm22<br>1.10.8_openmpi_4.0.7_gcc_10.2_slurm22<br>1.10.8_openmpi_4.0.7_intel_2020.2_slurm22<br>1.12.2_openmpi_4.0.7_intel_2020.2_slurm22 |
| ior | | 3.3.0 |
| lammps | 29Sep21_openmpi_4.0.5_gcc_10.2_slurm20 | 29Sep21_openmpi_4.0.7_gcc_10.2_slurm22 |
| meme | 5.3.0 | 5.3.0_slurm22 |
| Molpro | 2021.3.1 | 2021.3.1_openmpi_4.0.7_gcc_10.2_slurm22 |
| mpi | hpcx_2.7.0_gcc_10.2_slurm20<br>mvapich2-2.3.5_gcc_10.2_slurm20 | hpcx_2.7.0_gcc_10.2_slurm22<br>mvapich2-2.3.5_gcc_10.2_slurm22<br>openmpi_4.0.7_gcc_10.2_slurm22<br>openmpi_4.0.7_intel_2020.2_slurm22 |
| mpi4py | | 3.1.4_py3.9.0_slurm22 |
| netcdf | 4.7.4_gcc_10.2_hdf5_1.10.5<br>4.7.4_intel_2020.2_hdf5_1.12.0 | 4.7.4_gcc_10.2_hdf5_1.10.8_slurm22<br>4.7.4_gcc_10.2_hdf5_1.12.2_slurm22 |
| netcdf4-python | | 1.6.2 |
| osu-mpi | | 5.6.3_openmpi_4.0.7_gcc_10.2 |
| petsc | | petsc/3.18.2_openmpi_4.0.7_gcc_10.2_slurm22 |
| pnetcdf | 1.12.3 | 1.12.3_openmpi_4.0.7_gcc_10.2_slurm22 |
| qmcpack | 3.9.2_hpcx_2.7.0_gcc_10.2_slurm20<br>3.9.2_openmpi_4.0.0_gcc_8.3_slurm20<br>3.9.2_openmpi_4.0.0_gcc_8.3_slurm20_complex<br>3.9.2_openmpi_4.0.1_gcc<br>3.9.2_openmpi_4.0.4_gcc<br>3.9.2_openmpi_4.0.5_intel_2020.2_slurm20 | 3.9.2_openmpi_4.0.7_gcc_10.2_slurm22 |
| quantumespresso | 6.4_openmpi_4.0.0_gcc_8.3_slurm20<br>6.4_openmpi_4.0.5_intel_2020.2_slurm20<br>7.0_openmpi_4.0.5_intel_2020.2_slurm20 | 6.4_openmpi_4.0.7_gcc_10.2_slurm22<br>6.4_openmpi_4.0.7_intel_2020.2_slurm22<br>7.0_openmpi_4.0.7_gcc_10.2_slurm22 |
| vasp | 5.4.1<br>5.4.1_mvapich2-2.3.5_intel_2020.2_slurm20<br>5.4.4<br>5.4.4_intel<br>5.4.4_mvapich2-2.3.5_intel_2020.2_slurm20<br>5.4.4_openmpi_4.0.5_gcc_10.2_slurm20<br>5.4.4a<br>6.1.1_ompi405_yqi27<br>6.1.1_openmpi_4.0.5_intel_2020.2_yqi27_slurm20<br>6.1.1_yqi27<br>6.3.0_cfgoldsm<br>6.3.2_avandewa | 5.4.1_slurm22<br>5.4.4_slurm22<br>5.4.4_openmpi_4.0.7_gcc_10.2_slurm22<br>6.1.1_ompi407_yqi27_slurm22<br>6.3.0_cfgoldsm_slurm22<br>6.3.2_avandewa_slurm22 |
| wrf | 4.2.1_hpcx_2.7.0_intel_2020.2_slurm20 | |

### Building Custom Applications

We recommend building custom applications against one of the following MPI modules:

| MPI | Oscar Module |
| --- | --- |
| GCC-based OpenMPI | mpi/openmpi_4.0.7_gcc_10.2_slurm22 |
| Intel-based OpenMPI | mpi/openmpi_4.0.7_intel_2020.2_slurm22 |
| MVAPICH | mpi/mvapich2-2.3.5_gcc_10.2_slurm22 |
| Mellanox HPC-X | mpi/hpcx_2.7.0_gcc_10.2_slurm22 |
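To confirm which of these builds are installed on the test cluster, the standard module query commands can be used; the exact names returned will depend on what is currently deployed.

```bash
# List the MPI builds visible to the module system.
module avail mpi

# Inspect what the recommended GCC-based OpenMPI module sets up.
module show mpi/openmpi_4.0.7_gcc_10.2_slurm22
```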

{% tabs %} {% tab title="GNU Configure Example" %} module load mpi/openmpi_4.0.7_gcc_10.2_slurm22

module load gcc/10.2 cuda/11.7.1

CC=mpicc CXX=mpicxx ./configure --prefix=/path/to/install/dir {% endtab %}

{% tab title="CMake Configure Example" %} module load mpi/openmpi_4.0.7_gcc_10.2_slurm22

module load gcc/10.2 cuda/11.7.1

cmake -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx .. {% endtab %} {% endtabs %}
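Once the application is built, a quick end-to-end check under the new Slurm might look like the sketch below; the source file `hello_mpi.c`, the job size, and the partition are illustrative placeholders.

```bash
# Load a recommended MPI stack and a matching compiler.
module load mpi/openmpi_4.0.7_gcc_10.2_slurm22 gcc/10.2

# Compile a small MPI test program (hello_mpi.c is a placeholder source file).
mpicc -O2 -o hello_mpi hello_mpi.c

# Launch it on the test cluster's CPU partition; srun bootstraps the ranks via PMIx,
# which Slurm 22.05.7 supports alongside PMI2.
srun -p image-test -N 2 -n 8 --mpi=pmix ./hello_mpi
```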

### Deprecated Modules

{% hint style="info" %} A new module may be available to replace a deprecated application module. Please check the table above to see whether a newer module exists for your application. {% endhint %}

| Application | Deprecated Module |
| --- | --- |
| abaqus | 2017<br>2021<br>2021.1<br>6.12sp2 |
| abinit | 9.6.2 |
| abyss | 2.1.1 |
| ambertools | amber16<br>amber16-gpu<br>amber17<br>amber17_lic<br>amber21 |
| bagel | 1.2.2 |
| boost | 1.55<br>1.57<br>1.68<br>1.44.0<br>1.62.0-intel<br>1.63.0<br>1.75.0_openmpi_4.0.5_intel_2020.2_slurm20<br>1.76.0_hpcx_2.7.0_gcc_10.2_slurm20<br>1.76.0_hpcx_2.7.0_intel_2020.2_slurm20 |
| cabana | 1<br>1.1<br>1.1_hpcx_2.7.0_gcc_10.2_slurm20 |
| campari | 3.0 |
| cesm | 1.2.1<br>1.2.2<br>2.1.1 |
| cp2k | 7.1<br>7.1_mpi<br>8.1.0<br>9.1.0 |
| dacapo | 2.7.16_mvapich2_intel |
| dalton | 2018<br>2018.0_mvapich2-2.3.5_intel_2020.2_slurm20 |
| dice | 1 |
| esmf | 7.1.0r<br>8.0.0<br>8.0.0b<br>8.1.0b11<br>8.1.9b17<br>8.3.0<br>8.3.1b05 |
| fenics | 2017.1<br>2018.1.0 |
| ffte | 6.0<br>6.0/mpi |
| fftw | 2.1.5<br>2.1.5_slurm2020<br>2.1.5-double<br>3.3.8a |
| gerris | 1 |
| global_arrays | 5.6.1<br>5.6.1_i8<br>5.6.1_openmpi_2.0.3 |
| gpaw | 1.2.0<br>1.2.0_hpcx_2.7.0_gcc<br>1.2.0_mvapich2-2.3a_gcc<br>20.10_hpcx_2.7.0_intel_2020.2_slurm20<br>20.10.0_hpcx_2.7.0_intel_2020.2_slurm20 |
| gromacs | 2016.6<br>2020.1<br>2018.2_gpu<br>2018.2_hpcx_2.7.0_gcc_10.2_slurm20<br>2020.1_hpcx_2.7.0_gcc_10.2_slurm20<br>2020.4_gpu<br>2020.4_gpu_hpcx_2.7.0_gcc_10.2_slurm20<br>2020.4_hpcx_2.7.0_gcc_10.2_slurm20<br>2020.6_plumed<br>2021.5_plumed |
| hande | 1.1.1<br>1.1.1_64<br>1.1.1_debug |
| hdf5 | 1.10.0<br>1.10.1_parallel<br>1.10.5<br>1.10.5_fortran<br>1.10.5_mvapich2-2.3.5_intel_2020.2_slurm20<br>1.10.5_openmpi_3.1.3_gcc<br>1.10.5_openmpi_3.1.6_gcc<br>1.10.5_openmpi_4.0.0_gcc<br>1.10.5_openmpi_4.0.5_gcc_10.2_slurm20<br>1.10.5_parallel<br>1.10.7_hpcx_2.7.0_intel_2020.2_slurm20<br>1.10.7_openmpi_4.0.5_gcc_10.2_slurm20<br>1.10.7_openmpi_4.0.5_intel_2020.2_slurm20<br>1.12.0_hpcx_2.7.0_intel_2020.2<br>1.12.0_hpcx_2.7.0_intel_2020.2_slurm20<br>1.12.0_openmpi_4.0.5_intel_2020.2_slurm20 |
| hnn | 1.0 |
| hoomd | 2.9.0 |
| horovod | 0.19.5 |
| ior | 3.0.1<br>3.3.0 |
| lammps | 17-Nov-16<br>11-Aug-17<br>16-Mar-18<br>22-Aug-18<br>7-Aug-19<br>11Aug17_serial<br>29Oct20_hpcx_2.7.0_intel_2020.2<br>29Oct20_openmpi_4.0.5_gcc_10.2_slurm20 |
| medea | 3.2.3.0 |
| meme | 5.0.5 |
| meshlab | 20190129_qt59 |
| Molpro | 2019.2<br>2020.1<br>2012.1.15<br>2015_gcc<br>2015_serial<br>2018.2_ga<br>2019.2_ga<br>2020.1_ga<br>2020.1_openmpi_4.0.5_gcc_10.2_slurm20<br>2021.3.1_openmpi_4.0.5_gcc_10.2_slurm20 |
| mpi4py | 3.0.1_py3.6.8 |
| multinest | 3.1 |
| n2p2 | 1.0.0<br>2.0.0<br>2.0.0_hpcx |
| namd | 2.11-multicore<br>2.13b1-multicore |
| netcdf | 3.6.3<br>4.4.1.1_gcc<br>4.4.1.1_intel<br>4.7.0_intel2019.3<br>4.7.4_gcc8.3 |
| nwchem | 7<br>6.8-openmpi<br>7.0.2_mvapich2-2.3.5_intel_2020.2_slurm20<br>7.0.2_openmpi_4.0.5_intel_2020.2_slurm20<br>7.0.2_openmpi_4.1.1_gcc_10.2_slurm20 |
| openfoam | 4.1<br>7<br>4.1-openmpi_3.1.6_gcc_10.2_slurm20<br>4.1a<br>7.0_hpcx_2.7.0_gcc_10.2_slurm20 |
| openmpi | openmpi_4.0.5_gcc_10.2_slurm20 |
| OpenMPI with Intel compilers | openmpi_4.0.5_intel_2020.2_slurm20 |
| orca | 4.0.1.2<br>4.1.1<br>4.2.1<br>5.0.0<br>5.0.1 |
| osu-mpi | 5.3.2 |
| paraview | 5.1.0<br>5.1.0_yurt<br>5.4.1<br>5.6.0_no_scalable<br>5.6.0_yurt<br>5.8.0<br>5.8.0_mesa<br>5.8.0_release<br>5.8.1_openmpi_4.0.5_intel_2020.2_slurm20<br>5.9.0<br>5.9.0_ui |
| paris | 1.1.3 |
| petsc | 3.14.2_hpcx_2.7.0_intel_2020.2_slurm20<br>3.14.2_mpich3.3a3_intel_2020.2<br>3.7.5<br>3.7.7<br>3.8.3 |
| phyldog | 1.0 |
| plumed | 2.7.2<br>2.7.5 |
| pmclib | 1.1 |
| polychord | 1<br>2 |
| polyrate | 17C |
| potfit | 20201014<br>0.7.1 |
| prophet | augustegm_1.2 |
| pstokes | 1.0 |
| pymultinest | 2.9 |
| qchem | 5.0.2<br>5.0.2-openmpi |
| qmcpack | 3.10.0_hpcx_2.7.0_intel_2020.2_slurm20<br>3.10.0_openmpi_4.0.5_intel_2020.2_slurm20<br>3.7.0<br>3.9.1<br>3.9.1_openmpi_3.1.6 |
| quantumespresso | 6.1<br>6.4<br>6.5<br>6.6<br>6.4_hpcx_2.7.0_intel_2020.02_slurm20<br>6.4_hpcx_2.7.0_intel_2020.2_slurm20<br>6.4_openmpi_4.0.5_intel_slurm20<br>6.4.1<br>6.5_openmpi_4.0.5_intel_slurm20<br>6.6_openmpi_4.0.5_intel_2020.2_slurm20<br>6.7_openmpi_4.0.5_intel_2020.2_slurm20 |
| relion | 3.1.3 |
| rotd | 2014-11-15_mvapich2 |
| scalasca | 2.3.1_intel |
| scorep | 3.0_intel_mvapich2 |
| siesta | 3.2<br>4.1 |
| sprng | 5 |
| su2 | 7.0.2 |
| trilinos | 12.12.1 |
| vtk | 7.1.1<br>8.1.0 |
| wrf | 3.6.1<br>4.2.1_hpcx_2.7.0_intel_2020.2_slurm20 |