From 25ba18d17cf9fcdb487eeaa354def17540f868d5 Mon Sep 17 00:00:00 2001
From: Nick Johnson
Date: Wed, 15 May 2024 15:51:35 +0100
Subject: [PATCH 1/2] Rationalisation of Ultra2 Documentation.

---
 docs/services/cs2/run.md        |  2 +-
 docs/services/ultra2/access.md  |  8 +++-
 docs/services/ultra2/connect.md | 20 +++++++++
 docs/services/ultra2/index.md   |  6 +--
 docs/services/ultra2/run.md     | 74 ++++-----------------------------
 mkdocs.yml                      |  3 +-
 6 files changed, 40 insertions(+), 73 deletions(-)
 create mode 100644 docs/services/ultra2/connect.md

diff --git a/docs/services/cs2/run.md b/docs/services/cs2/run.md
index 4c3b297fc..4da00a792 100644
--- a/docs/services/cs2/run.md
+++ b/docs/services/cs2/run.md
@@ -2,7 +2,7 @@

 ## Introduction

-The Cerebras CS-2 Wafer-scale cluster (WSC) uses the Ultra2 system which serves as a host, provides access to files, the SLURM batch system etc.
+The Cerebras CS-2 Wafer-scale cluster (WSC) uses the Ultra2 system as a host system which login services, access to files, the SLURM batch system etc.

 ## Connecting to the cluster

diff --git a/docs/services/ultra2/access.md b/docs/services/ultra2/access.md
index ed36fd678..f454149b3 100644
--- a/docs/services/ultra2/access.md
+++ b/docs/services/ultra2/access.md
@@ -1,4 +1,10 @@
-# Ultra2 Large Memory System
+# Overview
+
+Ultra2 is a single logical CPU system based at EPCC. It is suitable for running jobs which require large volumes of non-distributed memory (as opposed to a cluster).
+
+## Specifications
+
+The system is a HPE SuperDome Flex containing 576 individual cores in a SMT-1 arrangement (1 thread per core). The system has 18TB of memory available to users. Home directories are network mounted from the EIDF e1000 Lustre filesystem, although some local NVMe storage is available for temporary file storage during runs.

 ## Getting Access

diff --git a/docs/services/ultra2/connect.md b/docs/services/ultra2/connect.md
new file mode 100644
index 000000000..90791edc7
--- /dev/null
+++ b/docs/services/ultra2/connect.md
@@ -0,0 +1,20 @@
+# Login
+
+The hostname for SSH access to the system is `ultra2.eidf.ac.uk`
+
+## Access credentials
+
+To access Ultra2, you need to use two credentials: your SSH key pair protected by a passphrase **and** a Time-based one-time password (TOTP).
+
+### SSH Key
+
+You must upload the public part of your SSH key pair to the SAFE by following the [instructions from the SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/#how-to-add-an-ssh-public-key-to-your-account)
+
+### Time-based one-time password (TOTP)
+
+You must set up your TOTP token by following the [instructions from the SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/#how-to-turn-on-mfa-on-your-machine-account)
+
+### SSH Login example
+
+To login to Ultra2, you will need to use the SSH Key and TOTP token as noted above.
+With the appropriate key loaded
`ssh @ultra2.eidf.ac.uk` will then prompt you, roughly once per day, for your TOTP code.
diff --git a/docs/services/ultra2/index.md b/docs/services/ultra2/index.md
index f4e649e0b..716250eca 100644
--- a/docs/services/ultra2/index.md
+++ b/docs/services/ultra2/index.md
@@ -1,5 +1,5 @@
 # Ultra2 Large Memory System

-[Get Access](./access/)
-
-[Running codes](./run/)
+[Overview](./access/)
+[Connect](./connect/)
+[Running jobs](./run/)
diff --git a/docs/services/ultra2/run.md b/docs/services/ultra2/run.md
index 5788d35f9..b93370931 100644
--- a/docs/services/ultra2/run.md
+++ b/docs/services/ultra2/run.md
@@ -1,86 +1,26 @@
-# Ultra2 High Memory System
-
-## Introduction
-
-Ultra2 is a single logical CPU system based at EPCC. It is suitable for running jobs which require large volumes of non-distributed memory (as opposed to a cluster).
-
-## Specifications
-
-The system is a HPE SuperDome Flex containing 576 individual cores in a SMT-1 arrangement (1 thread per core). The system has 18TB of memory available to users. Home directories are network mounted from the EIDF e1000 Lustre filesystem, although some local NVMe storage is available for temporary file storage during runs.
-
-## Login
-
-Login is via SSH only via `ssh @sdf-cs1.epcc.ed.ac.uk`. See below for details on the credentials required to access the system.
-
-### Access credentials
-
-To access Ultra2, you need to use two credentials: your SSH key pair protected by a passphrase **and** a Time-based one-time password (TOTP).
-
-### SSH Key Pairs
-
-You will need to generate an SSH key pair protected by a passphrase to access Ultra2.
-
-Using a terminal (the command line), set up a key pair that contains your e-mail address and enter a passphrase you will use to unlock the key:
-
-```bash
- $ ssh-keygen -t rsa -C "your@email.com"
- ...
- -bash-4.1$ ssh-keygen -t rsa -C "your@email.com"
- Generating public/private rsa key pair.
- Enter file in which to save the key (/Home/user/.ssh/id_rsa): [Enter]
- Enter passphrase (empty for no passphrase): [Passphrase]
- Enter same passphrase again: [Passphrase]
- Your identification has been saved in /Home/user/.ssh/id_rsa.
- Your public key has been saved in /Home/user/.ssh/id_rsa.pub.
- The key fingerprint is:
- 03:d4:c4:6d:58:0a:e2:4a:f8:73:9a:e8:e3:07:16:c8 your@email.com
- The key's randomart image is:
- +--[ RSA 2048]----+
- | . ...+o++++. |
- | . . . =o.. |
- |+ . . .......o o |
- |oE . . |
- |o = . S |
- |. +.+ . |
- |. oo |
- |. . |
- | .. |
- +-----------------+
-```
-
-(remember to replace "" with your e-mail address).
-
-### Upload public part of key pair to SAFE
-
-You should now upload the public part of your SSH key pair to the SAFE by following the [instructions from the SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/#how-to-add-an-ssh-public-key-to-your-account)
-
-### Time-based one-time password (TOTP)
-
-Remember, you will need to use both an SSH key and Time-based one-time password to log into Ultra2 so you will also need to [set up your TOTP](https://epcced.github.io/safe-docs/safe-for-users/#how-to-turn-on-mfa-on-your-machine-account) before you can log into Ultra2.
-
-### SSH Login
-
-To login to the host system, you will need to use the SSH Key and TOTP token you registered when creating the account [SAFE](https://www.safe.epcc.ed.ac.uk), along with the SSH Key you registered when creating the account. For example, with the appropriate key loaded
`ssh @sdf-cs1.epcc.ed.ac.uk` will then prompt you, roughly once per day, for your TOTP code.
+# Running jobs

 ## Software

-The primary software provided is Intel's OneAPI suite containing mpi compilers and runtimes, debuggers and the vTune performance analyser. Standard GNU compilers are also available.
+### OneAPI
+
+The primary HPC software provided is Intel's OneAPI suite containing mpi compilers and runtimes, debuggers and the vTune performance analyser. Standard GNU compilers are also available.

 The OneAPI suite can be loaded by sourcing the shell script:

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-## Running Jobs
+## Queue system

 All jobs must be run via SLURM to avoid inconveniencing other users of the system. Users should not run jobs directly. Note that the system has one logical processor with a large number of threads and thus appears to SLURM as a single node. This is intentional.

-## Queue limits
+### Queue limits

 We kindly request that users limit their maximum total running job size to 288 cores and 4TB of memory, whether that be a divided into a single job, or a number of jobs. This may be enforced via SLURM in the future.

-### MPI jobs
+### Example MPI job

 An example script to run a multi-process MPI "Hello world" example is shown.

diff --git a/mkdocs.yml b/mkdocs.yml
index 1ab6c25df..d4788b3a8 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -58,7 +58,8 @@ nav:
       - "Get Access": services/cs2/access.md
       - "Running codes": services/cs2/run.md
     - "Ultra2":
-      - "Get Access": services/ultra2/access.md
+      - "Overview": services/ultra2/access.md
+      - "Connect": services/ultra2/connect.md
       - "Running codes": services/ultra2/run.md
     - "GPU Service":
       - "Overview": services/gpuservice/index.md
From 52d1b4edc30a101ebb8ef1a1fa300f92018efb93 Mon Sep 17 00:00:00 2001
From: Nick Johnson
Date: Wed, 15 May 2024 16:26:03 +0100
Subject: [PATCH 2/2] Update run.md typo

---
 docs/services/cs2/run.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/services/cs2/run.md b/docs/services/cs2/run.md
index 4da00a792..cf7f280ee 100644
--- a/docs/services/cs2/run.md
+++ b/docs/services/cs2/run.md
@@ -2,7 +2,7 @@

 ## Introduction

-The Cerebras CS-2 Wafer-scale cluster (WSC) uses the Ultra2 system as a host system which login services, access to files, the SLURM batch system etc.
+The Cerebras CS-2 Wafer-scale cluster (WSC) uses the Ultra2 system as a host system which provides login services, access to files, the SLURM batch system etc.

 ## Connecting to the cluster