feat: add how-to for connecting COS to Slurm (workload manager) #7

Merged 6 commits on Aug 1, 2024
12 changes: 5 additions & 7 deletions custom_conf.py
@@ -13,15 +13,13 @@
ogp_site_url = "https://canonical-starter-pack.readthedocs-hosted.com/"
ogp_site_name = project
ogp_image = "https://assets.ubuntu.com/v1/253da317-image-document-ubuntudocs.svg"
html_favicon = '.sphinx/_static/favicon.png'
html_favicon = ".sphinx/_static/favicon.png"
html_context = {
# Product information
"product_page": "ubuntu.com/hpc",
"product_tag": "_static/tag.png",

# Chat and updates
"matrix": "https://matrix.to/#/#hpc:ubuntu.com",

# GitHub
"github_url": "https://github.com/charmed-hpc",
"github_repository": "docs",
@@ -30,18 +28,18 @@
"github_issues": "enabled",
"github_discussions": "https://github.com/orgs/charmed-hpc/discussions",
"github_qa": "https://github.com/orgs/charmed-hpc/discussions/new?category=q-a",

# Footer configuration
"sequential_nav": "both",
"display_contributors": True,
"display_contributors_since": ""
"display_contributors_since": "",
}

slug = ""
redirects = {}
linkcheck_ignore = [
"http://127.0.0.1:8000",
"https://matrix.to/#/#hpc:ubuntu.com",
"https://charmhub.io/topics/canonical-observability-stack/tutorials/install-microk8s#heading--deploy-the-cos-lite-bundle-with-overlays",
]
custom_linkcheck_anchors_ignore_for_url = []
custom_myst_extensions = []
@@ -72,7 +70,7 @@
# manpages_url = "https://manpages.ubuntu.com/manpages/noble/en/man{section}/{page}.{section}.html"

# Define a :center: role that can be used to center the content of table cells.
rst_prolog = '''
rst_prolog = """
.. role:: center
:class: align-center
'''
"""
5 changes: 3 additions & 2 deletions howto/getting-started/deploy-workload-manager.md
@@ -2,6 +2,7 @@
relatedlinks: "[Get started with LXD](https://documentation.ubuntu.com/lxd/en/latest/tutorial/first_steps/), [Get started with Juju](https://juju.is/docs/juju/tutorial), [Slurm website](https://slurm.schedmd.com/overview.html)"
---

(deploy-workload-manager)=
# Deploy workload manager

This guide shows you how to deploy the workload manager of your Charmed HPC cluster.
@@ -17,8 +18,8 @@ To successfully deploy the workload manager of your Charmed HPC cluster, you
will at least need:

- A machine running a [currently supported Ubuntu LTS version](https://ubuntu.com/about/release-cycle).
- An initialised [LXD](https://canonical.com/lxd) instance.
- A [Juju](https://juju.is) client.
- [An initialised LXD instance.](https://documentation.ubuntu.com/lxd/en/latest/howto/initialize/)
- The [Juju CLI client](https://juju.is/docs/juju/install-and-manage-the-client) installed on your machine.

## Initialise the machine cloud

2 changes: 1 addition & 1 deletion howto/getting-started/index.md
@@ -1,6 +1,6 @@
# Getting started

See the documentation in this section to get started with Charmed HPC.
See the how-to guides in this section for how to get started with Charmed HPC.

## How to deploy Charmed HPC

21 changes: 16 additions & 5 deletions howto/index.md
@@ -1,16 +1,27 @@
(howtos)=
# How-to guides

These how-to guides cover key operations and common tasks in Charmed HPC.
## Getting started

## Get started
These how-to guides will get you started with Charmed HPC by
taking you through the deployment of your own Charmed HPC cluster.

To get started with Charmed HPC, you must deploy it on a supported
machine cloud.
- {ref}`deploy-workload-manager`

## Observability

These how-to guides cover how to connect your Charmed HPC
cluster to the [Canonical Observability Stack](https://charmhub.io/topics/canonical-observability-stack)
— also known as __COS__ — to observe cluster logs, metrics,
and alerts.

- {ref}`connect-workload-manager-to-cos`

```{toctree}
:titlesonly:
:maxdepth: 2
:maxdepth: 1
:hidden:

getting-started/index
observability/index
```
64 changes: 64 additions & 0 deletions howto/observability/connect-workload-manager-to-cos.md
@@ -0,0 +1,64 @@
---
relatedlinks: "[Get started with COS](https://charmhub.io/topics/canonical-observability-stack/tutorials/install-microk8s)"
---

(connect-workload-manager-to-cos)=
# Connect workload manager to COS

This how-to guide shows you how to connect your cluster's
workload manager to the Canonical Observability Stack (COS) to observe
the workload manager's logs, metrics, and alerts.

## Prerequisites

To successfully connect your cluster's workload manager to COS, you must have:

- [A deployed COS cloud.](https://charmhub.io/topics/canonical-observability-stack/tutorials/install-microk8s)
- {ref}`A deployed workload manager. <deploy-workload-manager>`
- The [Juju CLI client](https://juju.is/docs/juju/install-and-manage-the-client) installed on your machine.

## Deploy an agent

First, in the model hosting your Charmed HPC cluster's workload manager,
deploy a Grafana agent:

```shell
juju deploy grafana-agent
```

## Connect the workload manager to the agent

After deploying the Grafana agent, connect the agent to the
workload manager controller:

```shell
juju integrate slurmctld:cos-agent grafana-agent:cos-agent
```
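
To confirm that the integration exists, one option is to filter the output of
`juju status --relations` for the two endpoints. The sketch below assumes that
Juju prints both endpoints of a relation on the same line of that output;
`has_relation` is a hypothetical helper name, not part of Juju.

```shell
# Hypothetical helper: succeeds if any line read from stdin
# mentions both of the given endpoints.
has_relation() {
    grep -F -- "$1" | grep -qF -- "$2"
}

# Usage (needs a Juju client pointed at the workload manager's model):
#   juju status --relations \
#     | has_relation slurmctld:cos-agent grafana-agent:cos-agent \
#     && echo "slurmctld is integrated with grafana-agent"
```

The helper only matches text, so treat a positive result as a quick sanity
check rather than proof that the agent is healthy.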

## Make COS available to the workload manager

With the agent connected to the workload manager controller, make COS available
to the model hosting the cluster's workload manager:

```{important}
For the instructions below to succeed, you must have deployed the
[`offers` overlay](https://charmhub.io/topics/canonical-observability-stack/tutorials/install-microk8s#heading--deploy-the-cos-lite-bundle-with-overlays)
as part of your COS cloud deployment.
```

```shell
juju consume microk8s:admin/cos.prometheus-receive-remote-write
juju consume microk8s:admin/cos.loki-logging
juju consume microk8s:admin/cos.grafana-dashboards
```

## Connect the workload manager to COS

Now connect the Grafana agent, which is already integrated with the workload
manager controller, to COS:

```shell
juju integrate grafana-agent prometheus-receive-remote-write
juju integrate grafana-agent loki-logging
juju integrate grafana-agent grafana-dashboards
```
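
The steps in this guide can also be collected into one script. The following is
a minimal sketch, not an official tool: `run`, `connect_to_cos`, `DRY_RUN`, and
`COS_OFFERS` are hypothetical names introduced here for illustration, and the
offer prefix `microk8s:admin/cos` is the one used in the commands above. By
default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: replay the steps of this guide in order.
# DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Assumed offer prefix (controller:user/model), taken from this guide.
COS_OFFERS="${COS_OFFERS:-microk8s:admin/cos}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

connect_to_cos() {
    local offer
    # Deploy the agent and integrate it with the workload manager controller.
    run juju deploy grafana-agent || return
    run juju integrate slurmctld:cos-agent grafana-agent:cos-agent || return
    # Make each COS offer available in the current model.
    for offer in prometheus-receive-remote-write loki-logging grafana-dashboards; do
        run juju consume "${COS_OFFERS}.${offer}" || return
    done
    # Integrate the agent with each consumed offer.
    for offer in prometheus-receive-remote-write loki-logging grafana-dashboards; do
        run juju integrate grafana-agent "$offer" || return
    done
}

connect_to_cos
```

Review the printed commands first. Setting `DRY_RUN=0` executes them against
the currently selected Juju model and stops at the first command that fails.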
13 changes: 13 additions & 0 deletions howto/observability/index.md
@@ -0,0 +1,13 @@
# Observability

See the how-to guides in this section for how to connect your Charmed HPC
cluster to the [Canonical Observability Stack](https://charmhub.io/topics/canonical-observability-stack)
to observe the cluster's logs, metrics, and alerts.

```{toctree}
:titlesonly:
:maxdepth: 1

connect-workload-manager-to-cos
```