Merge pull request #152 from nickaj/moreu2updates
Rationalisation of Ultra2 Documentation.
nickaj authored May 15, 2024
2 parents 1b6a139 + 52d1b4e commit 6bca58c
Showing 6 changed files with 40 additions and 73 deletions.
2 changes: 1 addition & 1 deletion docs/services/cs2/run.md
@@ -2,7 +2,7 @@

## Introduction

The Cerebras CS-2 Wafer-scale cluster (WSC) uses the Ultra2 system which serves as a host, provides access to files, the SLURM batch system etc.
The Cerebras CS-2 Wafer-scale cluster (WSC) uses the Ultra2 system as its host, which provides login services, access to files, the SLURM batch system, etc.

## Connecting to the cluster

8 changes: 7 additions & 1 deletion docs/services/ultra2/access.md
@@ -1,4 +1,10 @@
# Ultra2 Large Memory System
# Overview

Ultra2 is a single logical CPU system based at EPCC. It is suitable for running jobs which require large volumes of non-distributed memory (as opposed to the distributed memory of a cluster).

## Specifications

The system is an HPE SuperDome Flex containing 576 individual cores in an SMT-1 arrangement (1 thread per core). The system has 18 TB of memory available to users. Home directories are network mounted from the EIDF e1000 Lustre filesystem, although some local NVMe storage is available for temporary file storage during runs.

## Getting Access

20 changes: 20 additions & 0 deletions docs/services/ultra2/connect.md
@@ -0,0 +1,20 @@
# Login

The hostname for SSH access to the system is `ultra2.eidf.ac.uk`.

## Access credentials

To access Ultra2, you need to use two credentials: your SSH key pair protected by a passphrase **and** a Time-based one-time password (TOTP).

### SSH Key

You must upload the public part of your SSH key pair to the SAFE by following the [instructions from the SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/#how-to-add-an-ssh-public-key-to-your-account).

### Time-based one-time password (TOTP)

You must set up your TOTP token by following the [instructions from the SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/#how-to-turn-on-mfa-on-your-machine-account).

### SSH Login example

To log in to Ultra2, you will need to use the SSH key and TOTP token as noted above.
With the appropriate key loaded,<br>`ssh <username>@ultra2.eidf.ac.uk` will then prompt you, roughly once per day, for your TOTP code.
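
As a minimal sketch of the login step described above, the following helper loads a key into `ssh-agent` and then connects. It assumes the private key lives at `~/.ssh/id_rsa` (an illustrative default, not a path stated in this documentation); the function name `ultra2_login` is hypothetical.

```bash
# Hypothetical helper: load a key into ssh-agent, then connect to Ultra2.
# Usage: ultra2_login <username> [path-to-private-key]
ultra2_login() {
    if [ -z "${1:-}" ]; then
        echo "usage: ultra2_login <username> [keyfile]" >&2
        return 2
    fi
    user="$1"
    key="${2:-$HOME/.ssh/id_rsa}"   # assumed default key location

    # Start an agent for this shell if one is not already running.
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        eval "$(ssh-agent -s)" >/dev/null
    fi

    # ssh-add prompts for the key passphrase; ssh then asks for the
    # TOTP code whenever the server requires it.
    ssh-add "$key" || return 1
    ssh "${user}@ultra2.eidf.ac.uk"
}
```

Calling `ultra2_login <username>` from an interactive shell should prompt for the passphrase first and, when required, the TOTP code afterwards.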
6 changes: 3 additions & 3 deletions docs/services/ultra2/index.md
@@ -1,5 +1,5 @@
# Ultra2 Large Memory System

[Get Access](./access/)

[Running codes](./run/)
[Overview](./access/)
[Connect](./connect/)
[Running jobs](./run/)
74 changes: 7 additions & 67 deletions docs/services/ultra2/run.md
@@ -1,86 +1,26 @@
# Ultra2 High Memory System

## Introduction

Ultra2 is a single logical CPU system based at EPCC. It is suitable for running jobs which require large volumes of non-distributed memory (as opposed to the distributed memory of a cluster).

## Specifications

The system is an HPE SuperDome Flex containing 576 individual cores in an SMT-1 arrangement (1 thread per core). The system has 18 TB of memory available to users. Home directories are network mounted from the EIDF e1000 Lustre filesystem, although some local NVMe storage is available for temporary file storage during runs.

## Login

Login is via SSH only, using `ssh <username>@sdf-cs1.epcc.ed.ac.uk`. See below for details on the credentials required to access the system.

### Access credentials

To access Ultra2, you need to use two credentials: your SSH key pair protected by a passphrase **and** a Time-based one-time password (TOTP).

### SSH Key Pairs

You will need to generate an SSH key pair protected by a passphrase to access Ultra2.

Using a terminal (the command line), set up a key pair that contains your e-mail address and enter a passphrase you will use to unlock the key:

```bash
$ ssh-keygen -t rsa -C "your@email.com"
...
-bash-4.1$ ssh-keygen -t rsa -C "your@email.com"
Generating public/private rsa key pair.
Enter file in which to save the key (/Home/user/.ssh/id_rsa): [Enter]
Enter passphrase (empty for no passphrase): [Passphrase]
Enter same passphrase again: [Passphrase]
Your identification has been saved in /Home/user/.ssh/id_rsa.
Your public key has been saved in /Home/user/.ssh/id_rsa.pub.
The key fingerprint is:
03:d4:c4:6d:58:0a:e2:4a:f8:73:9a:e8:e3:07:16:c8 your@email.com
The key's randomart image is:
+--[ RSA 2048]----+
| . ...+o++++. |
| . . . =o.. |
|+ . . .......o o |
|oE . . |
|o = . S |
|. +.+ . |
|. oo |
|. . |
| .. |
+-----------------+
```
(remember to replace "<your@email.com>" with your e-mail address).
### Upload public part of key pair to SAFE
You should now upload the public part of your SSH key pair to the SAFE by following the [instructions from the SAFE documentation](https://epcced.github.io/safe-docs/safe-for-users/#how-to-add-an-ssh-public-key-to-your-account).
### Time-based one-time password (TOTP)
Remember, you will need to use both an SSH key and Time-based one-time password to log into Ultra2 so you will also need to [set up your TOTP](https://epcced.github.io/safe-docs/safe-for-users/#how-to-turn-on-mfa-on-your-machine-account) before you can log into Ultra2.
### SSH Login
To log in to the host system, you will need to use the SSH key and TOTP token you registered in [SAFE](https://www.safe.epcc.ed.ac.uk) when creating the account. For example, with the appropriate key loaded,<br>`ssh <username>@sdf-cs1.epcc.ed.ac.uk` will then prompt you, roughly once per day, for your TOTP code.
# Running jobs

## Software

The primary software provided is Intel's oneAPI suite, containing MPI compilers and runtimes, debuggers and the VTune performance analyser. Standard GNU compilers are also available.
### oneAPI

The primary HPC software provided is Intel's oneAPI suite, containing MPI compilers and runtimes, debuggers and the VTune performance analyser. Standard GNU compilers are also available.
The oneAPI suite can be loaded by sourcing the shell script:

```bash
source /opt/intel/oneapi/setvars.sh
```
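
After sourcing the script, it can be useful to confirm that the expected tools are actually on the `PATH`. The following is a minimal sketch; the helper name `check_tools` is hypothetical, and the tool names in the usage comment (`icx`, `mpirun`, `vtune`, `gcc`) are assumptions that depend on the installed oneAPI version.

```bash
# Report which of the named commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run after sourcing setvars.sh); tool names are assumptions:
#   check_tools icx mpirun vtune gcc
```

A non-zero return status indicates that at least one tool was not found, which usually means the environment script has not been sourced in the current shell.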

## Running Jobs
## Queue system

All jobs must be run via SLURM to avoid inconveniencing other users of the system. Users should not run jobs directly. Note that the system has one logical processor with a large number of threads and thus appears to SLURM as a single node. This is intentional.

## Queue limits
### Queue limits

We kindly request that users limit their maximum total running job size to 288 cores and 4 TB of memory, whether that is used by a single job or divided across a number of jobs.
This may be enforced via SLURM in the future.
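
As an illustration of staying within these limits, a batch script header might look like the sketch below. All values (job name, task count, memory, wall time) are illustrative assumptions rather than site defaults, and no partition or account options are shown because none are specified here.

```bash
#!/bin/bash
# Sketch of a batch script that stays inside the requested limits
# (at most 288 cores and 4 TB in total across a user's running jobs).
# All values below are illustrative assumptions, not site defaults.
#SBATCH --job-name=example
# Ultra2 appears to SLURM as a single node.
#SBATCH --nodes=1
# 64 tasks is well under the 288-core request.
#SBATCH --ntasks=64
# 512 GB is well under the 4 TB request.
#SBATCH --mem=512G
#SBATCH --time=01:00:00

# SLURM_NTASKS is set by SLURM inside a job; fall back to 1 elsewhere.
ntasks="${SLURM_NTASKS:-1}"
echo "Job started with ${ntasks} task(s)"
```

Saved under a hypothetical name such as `job.sh`, the script would be submitted with `sbatch job.sh`. When several jobs run at once, their combined core and memory requests are what should stay under the limits.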

### MPI jobs
### Example MPI job

An example script to run a multi-process MPI "Hello world" example is shown.

3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -58,7 +58,8 @@ nav:
- "Get Access": services/cs2/access.md
- "Running codes": services/cs2/run.md
- "Ultra2":
- "Get Access": services/ultra2/access.md
- "Overview": services/ultra2/access.md
- "Connect": services/ultra2/connect.md
- "Running codes": services/ultra2/run.md
- "GPU Service":
- "Overview": services/gpuservice/index.md
