Cleanup2: remove support for FGCI packaged slurm
VilleS1 committed Sep 30, 2021
1 parent e63f82f commit 5b27341
Showing 13 changed files with 9 additions and 132 deletions.
23 changes: 1 addition & 22 deletions README.md
@@ -6,8 +6,6 @@ ansible-role-slurm

Tested with these Linux distributions:
- CentOS 7
-- 17.02.x (travis ci automatic testing)
-- 17.11.x (travis ci automatic testing)
- Ubuntu
- 18.04 (client only)

@@ -47,13 +45,6 @@ It is possible to run the slurmdbd on a different host than the slurmctld by cha

It is also possible to setup a backup slurm controller by defining slurm_backup_controller variable. Please read the [SLURM HA documentation](https://slurm.schedmd.com/quickstart_admin.html#HA). For example you'll need a shared directory (for example NFS) available on both the slurm_service_node and slurm_backup_controller.

-Specific versions of SLURM can be gotten from the FGCI yum repo by setting:
-<pre>
-fgci_slurmrepo_version: "fgcislurm1711"
-</pre>
-
-We have 1702 and 1711 RPMs there.
-
### Implementation

A playbook that uses this role: https://github.com/fgci-org/fgci-ansible
@@ -78,28 +69,16 @@ Example Playbook

### Known Issues

- This role used to be able to build slurm rpms, distribute them and install them. The last tag/release that had this feature was v1.5.0
- Setting up a shared directory á la NFS for running a SLURM in HA is out of scope for this role. There are many [NFS server roles](https://github.com/CSCfi/ansible-role-nfs) and [Mount Filesystem roles](https://github.com/CSCfi/ansible-role-nfs_mount) roles out there.

### Testing and contributions

-Testing is done with [Travis](.travis.yml). New SLURM release can be tested after the RPMs are built and available in the FGCI repo. After that one needs to add a new tests/test1702.yml and a new IMAGE_BUILD_PLATFORM env in .travis.yml.
+Testing is done with [Travis](.travis.yml).

- PRs to master
- if possible make sure that the new feature is also tested
- strive for backwards compatibility

-**Adding testing of a new SLURM release**
-
-Using 17.11 as an example
-
-- Get CSC to build new rpms and put them in a new yum repo
-- New branch in ansible-role-slurm with the following changes/additions:
-- IMAGE_BUILD_PLATFORM=fgcislurm1711 in .travis.yml env:
-- tests/test1711.yml with fgci_slurmrepo_version: "fgcislurm1711"
-- tests/fgcislurm1711 directory symlink to tests/epel-centos7
-- Then make changes if needed to the role that does not break older SLURM versions
-
# Authors / Contributors:

- Marco Passerini (original author)
4 changes: 4 additions & 0 deletions UPGRADE.md
@@ -1,6 +1,10 @@
Switch from FGCI to OHPC slurm packages
---------------------------------------

+NOTE 2021-09-30:
+Changes has been made to slurm role after writing this doc.
+It may not be relevant anymore. Now by default the slurm is coming from ohpc.
+
In general one needs to be careful with slurmdbd and run it in
the foreground during upgrade to monitor progress. See
http://slurm.schedmd.com/quickstart_admin.html#upgrade
3 changes: 1 addition & 2 deletions defaults/main.yml
@@ -21,8 +21,7 @@ nis_server: False
#slurm_user_uid: 5004
#slurm_user_gid: 5004

-fgci_slurmrepo_version: "fgcislurm1711"
-slurm_repo: "fgci" # Or "ohpc" to use OHPC slurm packages
+slurm_repo: "ohpc" # Or "ohpc" to use OHPC slurm packages
slurm_ohpc_versionlock: True
siteName: "io"
nodeBase: "{{ siteName }}"
8 changes: 0 additions & 8 deletions tasks/common_ubuntu.yml
@@ -1,12 +1,4 @@
---
-# This sh/could be put in a separate role..
-# define where we get .debs for slurm.
-- name: Add local apt-repo
-template: src=apt.repo.j2 dest=/etc/apt/sources.list.d/fgislurm.list owner=root group=root mode=0644 backup=yes
-when: slurm_repo == 'fgci' and slurm_apt_repo == True
-
-
-##
# Set slurm user and group locally on every host if uid/gid given
- name: add slurm unix group
group: name=slurm system=no state=present gid={{ slurm_user_gid|default(slurm_user_uid) }}
23 changes: 0 additions & 23 deletions tasks/version.yml
@@ -1,25 +1,2 @@
---

-#### Version
-
-- name: Get version of installed slurm RPM
-shell: yum list installed slurm | grep slurm | awk '{print $2}' | cut -d'-' -f1
-register: reg_slurm_yum_version
-check_mode: no
-changed_when: False
-
-- name: Get version of installed slurm RPM major version
-shell: yum list installed slurm | grep slurm | awk '{print $2}' | cut -d'-' -f1|cut -d "." -f1-2|sed -e 's/\.//'
-register: reg_slurm_yum_version_major
-check_mode: no
-changed_when: False
-
-- name: Set fact with contents of fgci_slurmrepo_version with only the numbers
-set_fact: slurm_fact_fgci_slurmrepo_version="{{ fgci_slurmrepo_version | replace('fgcislurm', '')}}"
-
-- name: print custom facts in verbose mode
-debug: var=item verbosity=1
-with_items:
-- "{{ reg_slurm_yum_version['stdout'] }}"
-- "{{ slurm_fact_fgci_slurmrepo_version }}"
-- "{{ reg_slurm_yum_version_major['stdout'] }}"
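The removed tasks derived two values from the installed package's version string: the full version and a dot-less major version, presumably so the latter could be compared with the digits left in `fgci_slurmrepo_version`. A minimal sketch of that pipeline follows. The sample string `17.11.5-1.el7` is an assumed stand-in for the version column of real `yum list installed slurm` output.

```shell
# Assumed sample of a version-release string (not taken from a real host).
sample="17.11.5-1.el7"

# Full version: keep everything before the first '-'.
full=$(echo "$sample" | cut -d'-' -f1)

# Major version without the dot: keep the first two dot-separated
# fields, then drop the first '.' that remains.
major=$(echo "$sample" | cut -d'-' -f1 | cut -d "." -f1-2 | sed -e 's/\.//')

echo "full=$full major=$major"
```

With the FGCI repository gone, nothing in the role needs these values any more, which is why both version files shrink to an empty YAML document.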
23 changes: 0 additions & 23 deletions tasks/version_ubuntu.yml
@@ -1,25 +1,2 @@
---

-#### Version
-
-- name: Get version of installed slurm DEB
-shell: dpkg -l slurm|grep "^ii"|awk '{print $3}'|cut -d'-' -f1
-register: reg_slurm_yum_version
-check_mode: no
-changed_when: False
-
-- name: Get version of installed slurm DEB major version
-shell: dpkg -l slurm|grep "^ii"|awk '{print $3}'|cut -d'-' -f1|cut -d "." -f1-2|sed -e 's/\.//'
-register: reg_slurm_yum_version_major
-check_mode: no
-changed_when: False
-
-- name: Set fact with contents of fgci_slurmrepo_version with only the numbers
-set_fact: slurm_fact_fgci_slurmrepo_version="{{ fgci_slurmrepo_version | replace('fgcislurm', '')}}"
-
-- name: print custom facts in verbose mode
-debug: var=item verbosity=1
-with_items:
-- "{{ reg_slurm_yum_version['stdout'] }}"
-- "{{ slurm_fact_fgci_slurmrepo_version }}"
-- "{{ reg_slurm_yum_version_major['stdout'] }}"
6 changes: 0 additions & 6 deletions templates/fgislurm.repo

This file was deleted.

1 change: 0 additions & 1 deletion tests/fgcislurm1702

This file was deleted.

1 change: 0 additions & 1 deletion tests/fgcislurm1711

This file was deleted.

14 changes: 3 additions & 11 deletions tests/test-in-docker-image.sh
@@ -9,17 +9,9 @@ OS_TYPE=${1:-}
OS_VERSION=${2:-}
ANSIBLE_VERSION=${3:-}

-# So if we get fgcislurm as the first bash argument to this script we
-# change playbook to a slurm specific version.
-# This means to test a new SLURM version we need to add a new playbook.
-if [[ $OS_TYPE = *"fgcislurm"* ]]; then
-ANSIBLE_VAR=""
-SLURMVERSION=$(echo $OS_TYPE|tr -d 'fgcislurm')
-ANSIBLE_PLAYBOOk="tests/test$SLURMVERSION.yml"
-else
-ANSIBLE_VAR=""
-ANSIBLE_PLAYBOOk="tests/test.yml"
-fi
+ANSIBLE_VAR=""
+ANSIBLE_PLAYBOOk="tests/test.yml"
+
ANSIBLE_INVENTORY="tests/inventory"
#ANSIBLE_LOG_LEVEL=""
ANSIBLE_LOG_LEVEL="-v"
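One detail of the removed branch is worth noting: `tr -d 'fgcislurm'` deletes every character in that set rather than the literal prefix, so it gave the intended result only because the rest of the argument was all digits. The sketch below contrasts it with prefix-stripping parameter expansion. The value of `OS_TYPE` is an assumed example argument.

```shell
# Assumed example of the first script argument.
OS_TYPE="fgcislurm1711"

# tr -d removes each of the characters f,g,c,i,s,l,u,r,m wherever they occur.
via_tr=$(echo "$OS_TYPE" | tr -d 'fgcislurm')

# Parameter expansion removes only the literal leading string "fgcislurm".
via_prefix=${OS_TYPE#fgcislurm}

echo "tr: $via_tr  prefix: $via_prefix"
```

Both forms agree for this input. They would differ for a suffix that contains any of the deleted letters.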
1 change: 0 additions & 1 deletion tests/test.yml
@@ -26,7 +26,6 @@
- { match: "{gpu[2-22]}", name: "check_hw_ib", arguments: "56" }
- { match: "*", name: "check_hw_eth", arguments: "eth0" }
- slurm_plugstack: True
-- fgci_slurmrepo_version: "fgcislurm1711"
- slurm_x11_spank: True
- slurm_topology_plugin: "topology/tree"
- slurm_topologylist:
17 changes: 0 additions & 17 deletions tests/test1702.yml

This file was deleted.

17 changes: 0 additions & 17 deletions tests/test1711.yml

This file was deleted.
