Revert "deprecate old versions, improve install & upgrade docs"
JamieWeider72 authored Aug 7, 2024
1 parent 1a345e6 commit a2af017
Showing 6 changed files with 138 additions and 41 deletions.
18 changes: 16 additions & 2 deletions docs/admin/runai-setup/cluster-setup/cluster-delete.md
@@ -1,9 +1,23 @@
# Deleting a Cluster Installation

To delete a Run:ai Cluster installation, run the following commands:
To delete a Run:ai Cluster installation while retaining existing running jobs, run the following commands:

=== "Version 2.9 or later"
```
helm uninstall runai-cluster -n runai
```

The command will **not** delete existing Projects, Departments, or Workloads submitted by users.
=== "Version 2.8"
```
kubectl delete RunaiConfig runai -n runai
helm uninstall runai-cluster -n runai
```

=== "Version 2.7 or earlier"
```
kubectl patch RunaiConfig runai -n runai -p '{"metadata":{"finalizers":[]}}' --type="merge"
kubectl delete RunaiConfig runai -n runai
helm uninstall runai-cluster runai -n runai
```

The commands will **not** delete existing Jobs submitted by users.
6 changes: 2 additions & 4 deletions docs/admin/runai-setup/cluster-setup/cluster-install.md
@@ -17,15 +17,13 @@ Using the cluster wizard:

* Choose a name for your cluster.
* Choose the Run:ai version for the cluster.
* (v2.13 only) Choose a target Kubernetes distribution.

* Choose a target Kubernetes distribution (see [table](cluster-prerequisites.md#kubernetes) for supported distributions).
* (SaaS and remote self-hosted cluster only) Enter a URL for the Kubernetes cluster. The URL need only be accessible within the organization's network. For more information see [here](cluster-prerequisites.md#cluster-url).
* Press `Continue`.

On the next page:

* (SaaS and remote self-hosted cluster only) Install a [trusted certificate](cluster-prerequisites.md#cluster-url) to the domain entered above.

* (SaaS and remote self-hosted cluster only) Install a trusted certificate to the domain entered above.
* Run the [Helm](https://helm.sh/docs/intro/install/) command provided in the wizard.
* In case of a failure, see the [Installation troubleshooting guide](../../troubleshooting/troubleshooting.md#installation).

3 changes: 3 additions & 0 deletions docs/admin/runai-setup/cluster-setup/cluster-prerequisites.md
@@ -56,7 +56,10 @@ Following is a Kubernetes support matrix for the latest Run:ai releases:<a name=

| Run:ai version | Supported Kubernetes versions | Supported OpenShift versions |
|----------------|-------------------------------|--------|
| Run:ai 2.9 | 1.21 through 1.26 | 4.8 through 4.11 |
| Run:ai 2.10 | 1.21 through 1.26 (see note below) | 4.8 through 4.11 |
| Run:ai 2.13 | 1.23 through 1.28 (see note below) | 4.10 through 4.13 |
| Run:ai 2.15 | 1.25 through 1.28 | 4.11 through 4.13 |
| Run:ai 2.16 | 1.26 through 1.28 | 4.11 through 4.14 |
| Run:ai 2.17 | 1.27 through 1.29 | 4.12 through 4.15 |
| Run:ai 2.18 | 1.28 through 1.30 | 4.12 through 4.16 |
93 changes: 65 additions & 28 deletions docs/admin/runai-setup/cluster-setup/cluster-upgrade.md
@@ -1,39 +1,76 @@

# Upgrading a Cluster Installation
Below are instructions on how to upgrade a Run:ai cluster.

## Upgrade Run:ai cluster
Follow the steps below, based on the Run:ai Cluster version you would like to upgrade to:
## Find out Run:ai Cluster version

=== "2.15-latest"
* In the Run:ai interface, navigate to `Clusters`.
* Select the cluster you want to upgrade.
* Click on `Get Installation instructions`.
* Choose the `Run:ai version` to upgrade to.
* Press `Continue`.
    * Copy the [Helm](https://helm.sh/docs/intro/install/) command provided in the `Installation Instructions` and run it in the cluster.
* In the case of a failure, refer to the [Installation troubleshooting guide](../../troubleshooting/troubleshooting.md#installation).
To find the Run:ai cluster version, run:

=== "2.13"
Run:
```
helm list -n runai -f runai-cluster
```

and record the chart version, which has the form `runai-cluster-<version-number>`.
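As a small sketch of this step, the version number can be cut out of the chart field programmatically. The `helm list` output below is hard-coded for illustration and stands in for what a real cluster would print:

```shell
# Hypothetical output of `helm list -n runai -f runai-cluster`; on a real
# cluster capture it with: helm_output=$(helm list -n runai -f runai-cluster)
helm_output='NAME          NAMESPACE  REVISION  STATUS    CHART                  APP VERSION
runai-cluster runai      3         deployed  runai-cluster-2.13.48  2.13.48'

# Pick the CHART field on the data row and strip the "runai-cluster-" prefix.
chart=$(printf '%s\n' "$helm_output" | awk 'NR==2 { for (i = 1; i <= NF; i++) if ($i ~ /^runai-cluster-[0-9]/) print $i }')
version=${chart#runai-cluster-}
echo "$version"   # prints 2.13.48
```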

## Upgrade Run:ai cluster

### Upgrade from version 2.15+
* In the Run:ai interface, navigate to `Clusters`.
* Select the cluster you want to upgrade.
* Click on `Get Installation instructions`.
* Choose the `Run:ai version` to be installed on the Cluster.
* Press `Continue`.
* Copy the [Helm](https://helm.sh/docs/intro/install/) command provided in the `Installation Instructions` and run it in the cluster.
* In the case of a failure, refer to the [Installation troubleshooting guide](../../troubleshooting/troubleshooting.md#installation).

### Upgrade from version 2.9, 2.10, 2.11 or 2.12
Run:

```
helm get values runai-cluster -n runai > old-values.yaml
```

1. Review the file `old-values.yaml` to see whether any changes were made during the last installation.
2. Follow the instructions for [Installing Run:ai](cluster-install.md#install-runai) to download a new values file.
3. Merge the changes from Step 1 into the new values file.
4. Run `helm upgrade` as per the instructions in the link above.


!!! Note
To upgrade to a __specific__ version of the Run:ai cluster, add `--version <version-number>` to the `helm upgrade` command. You can find the relevant version with `helm search repo` as described above.
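To spot the customizations that need carrying over in Step 1, a plain `diff` between the old and new values files is often enough. The file contents below are made up purely for illustration:

```shell
# Made-up stand-ins for the real old/new values files (illustration only).
cat > old-values.yaml <<'EOF'
controller:
  replicas: 2
EOF
cat > new-values.yaml <<'EOF'
controller:
  replicas: 1
logging: enabled
EOF

# Lines marked "<" exist only in old-values.yaml; these are the
# customizations to merge by hand into the new file.
diff old-values.yaml new-values.yaml || true
```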

### Upgrade from version 2.7 or 2.8

The process of upgrading from 2.7 or 2.8 requires [uninstalling](./cluster-delete.md) and then [installing](./cluster-install.md) again. No data is lost during the process.

!!! Note
    The reason for this process is that the Run:ai 2.9 cluster installation no longer installs prerequisites. As such, ownership of dependencies such as Prometheus will be undefined if a `helm upgrade` is run.

The process:

* Delete the Run:ai cluster installation according to these [instructions](cluster-delete.md) (do not delete the Run:ai cluster __object__ from the user interface).
* The following commands should be executed __after__ running the `helm uninstall` command:
```
helm get values runai-cluster -n runai > old-values.yaml
kubectl -n runai delete all --all
kubectl -n runai delete cm --all
kubectl -n runai delete secret --all
kubectl -n runai delete roles --all
kubectl -n runai delete rolebindings --all
kubectl -n runai delete ingress --all
kubectl -n runai delete servicemonitors --all
kubectl -n runai delete podmonitors --all
kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io -l app=runai
kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io -l app=runai
kubectl delete svc -n kube-system runai-cluster-kube-prometh-kubelet
```
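The per-resource deletions above can also be generated from a list rather than typed out one by one. This sketch only prints the commands (pipe the output to `sh` to actually run them) and assumes the same resource kinds as the block above:

```shell
# Build the namespace-scoped delete commands from a list of resource kinds.
resources="all cm secret roles rolebindings ingress servicemonitors podmonitors"
for r in $resources; do
  echo "kubectl -n runai delete $r --all"
done
```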
* Install the mandatory Run:ai [prerequisites](cluster-prerequisites.md):
* If you have previously installed the SaaS version of Run:ai version 2.7 or below, you will need to install both [Ingress Controller](cluster-prerequisites.md#ingress-controller) and [Prometheus](cluster-prerequisites.md#prometheus).
* If you have previously installed the SaaS version of Run:ai version 2.8 or any Self-hosted version of Run:ai, you will need to install [Prometheus](cluster-prerequisites.md#prometheus) only.
* Install the Run:ai cluster as described [here](cluster-install.md).
## Verify Successful Installation
* Review the file `old-values.yaml` to see whether any changes were made during the last installation.
* In the Run:ai interface, navigate to `Clusters`.
* Select the cluster you want to upgrade.
* Click on `Get Installation instructions`.
* Select `Run:ai version: 2.13`.
* Select the `cluster's Kubernetes distribution` and the `Cluster location`.
* If the cluster location is remote to the control plane, enter a URL for the Kubernetes cluster.
* Press `Continue`.
* Follow the instructions to download a new values file.
* Merge the changes from Step 1 into the new values file.
* Copy the [Helm](https://helm.sh/docs/intro/install/) command provided in the `Installation Instructions` and run it in the cluster.

## Verify Successful Upgrade
See [Verify your installation](cluster-install.md#verify-your-clusters-health) on how to verify a Run:ai cluster installation.
37 changes: 34 additions & 3 deletions docs/admin/runai-setup/self-hosted/k8s/upgrade.md
@@ -36,7 +36,33 @@ If you are installing an air-gapped version of Run:ai, The Run:ai tar file conta

Before proceeding with the upgrade, it's crucial to apply the specific prerequisites associated with your current version of Run:ai and every version in between up to the version you are upgrading to.

### Upgrade from version 2.9
### Upgrade from version 2.7 or 2.8

Before upgrading the control plane, run:

``` bash
POSTGRES_PV=$(kubectl get pvc pvc-postgresql -n runai-backend -o jsonpath='{.spec.volumeName}')
THANOS_PV=$(kubectl get pvc pvc-thanos-receive -n runai-backend -o jsonpath='{.spec.volumeName}')
kubectl patch pv $POSTGRES_PV $THANOS_PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

kubectl delete secret -n runai-backend runai-backend-postgresql
kubectl delete sts -n runai-backend keycloak runai-backend-postgresql
```

Before version 2.9, the Run:ai installation included NGINX by default, though it was possible to disable it. If NGINX is enabled in your current installation, as per the default, run the following two commands:

``` bash
kubectl delete ValidatingWebhookConfiguration runai-backend-nginx-ingress-admission
kubectl delete ingressclass nginx
```
If the Run:ai configuration previously disabled the NGINX installation, do not run these commands.

Next, install NGINX as described [here](../../cluster-setup/cluster-prerequisites.md#ingress-controller).

Then create a TLS secret and upgrade the control plane as described in the [control plane installation](backend.md). Before upgrading, find customizations and merge them as discussed below.


### Upgrade from version 2.9, 2.10, or 2.11

Two significant changes to the control-plane installation have happened with version 2.12: _PVC ownership_ and _installation customization_.

@@ -55,6 +81,7 @@ To remove the ownership in an older installation, run:
kubectl patch pvc -n runai-backend pvc-thanos-receive -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
kubectl patch pvc -n runai-backend pvc-postgresql -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
```

#### Ingress

Delete the ingress object, which will be recreated by the control-plane upgrade:
@@ -71,7 +98,9 @@ The Run:ai control-plane installation has been rewritten and is no longer using


## Upgrade Control Plane

### Upgrade from version 2.13, or later

=== "Connected"

``` bash
@@ -84,7 +113,9 @@ The Run:ai control-plane installation has been rewritten and is no longer using
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-<NEW-VERSION>.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```
### Upgrade from version 2.9

### Upgrade from version 2.7, 2.8, 2.9, or 2.11

* Create a `tls secret` as described in the [control plane installation](backend.md).
* Upgrade the control plane as described in the [control plane installation](backend.md). During the upgrade, you must tell the installation __not__ to create the two PVCs:

@@ -111,4 +142,4 @@

## Upgrade Cluster

To upgrade the cluster, follow the instructions [here](../../cluster-setup/cluster-upgrade.md).
22 changes: 18 additions & 4 deletions docs/admin/runai-setup/self-hosted/ocp/upgrade.md
@@ -2,19 +2,23 @@
title: Upgrade self-hosted OpenShift installation
---
# Upgrade Run:ai


!!! Important
Run:ai data is stored in Kubernetes persistent volumes (PVs). Prior to Run:ai 2.12, PVs are owned by the Run:ai installation. Thus, uninstalling the `runai-backend` helm chart may delete all of your data.

    From version 2.12 forward, PVs are owned by the customer and are independent of the Run:ai installation. As such, they are subject to the storage class [reclaim](https://kubernetes.io/docs/concepts/storage/storage-classes/#reclaim-policy){target=_blank} policy.

## Preparations

### Helm
Run:ai requires [Helm](https://helm.sh/){target=_blank} 3.14 or later.
Before you continue, validate your installed helm client version.
To install or upgrade Helm, see [Installing Helm](https://helm.sh/docs/intro/install/){target=_blank}.
If you are installing an air-gapped version of Run:ai, the Run:ai tar file contains the helm binary.
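A quick way to check the 3.14 floor from a script is a `sort -V` comparison. The version string below is hard-coded as an example; on a real machine capture it with `helm version --template '{{.Version}}'`:

```shell
# Example helm client version; on a real machine use:
#   v=$(helm version --template '{{.Version}}')
v="v3.14.2"
ver=${v#v}

# sort -V orders version strings numerically; if 3.14 sorts first (or ties),
# the installed client is at least 3.14.
if printf '%s\n%s\n' "$ver" "3.14" | sort -V | head -n 1 | grep -qx '3.14'; then
  echo "helm client is new enough"
else
  echo "helm client is too old"
fi
```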

### Software files

=== "Connected"
Run the helm command below:

@@ -28,8 +32,11 @@ If you are installing an air-gapped version of Run:ai, The Run:ai tar file conta
* Upload the images as described [here](preparations.md#software-artifacts).

## Before upgrade

Before proceeding with the upgrade, it's crucial to apply the specific prerequisites associated with your current version of Run:ai and every version in between up to the version you are upgrading to.

### Upgrade from version 2.7 or 2.8

Before upgrading the control plane, run:

``` bash
POSTGRES_PV=$(kubectl get pvc pvc-postgresql -n runai-backend -o jsonpath='{.spec.volumeName}')
THANOS_PV=$(kubectl get pvc pvc-thanos-receive -n runai-backend -o jsonpath='{.spec.volumeName}')
kubectl patch pv $POSTGRES_PV $THANOS_PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

kubectl delete secret -n runai-backend runai-backend-postgresql
kubectl delete sts -n runai-backend keycloak runai-backend-postgresql
```

Then upgrade the control plane as described [below](#upgrade-control-plane). Before upgrading, find customizations and merge them as discussed below.

### Upgrade from version 2.9
### Upgrade from version 2.9, 2.10, or 2.11

Two significant changes to the control-plane installation have happened with version 2.12: _PVC ownership_ and _installation customization_.
#### PVC ownership

Run:ai will no longer directly create the PVCs that store Run:ai data (metrics and database). Instead, going forward,

* Run:ai requires a Kubernetes storage class to be installed.
To remove the ownership in an older installation, run:

```
kubectl patch pvc -n runai-backend pvc-thanos-receive -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
kubectl patch pvc -n runai-backend pvc-postgresql -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
```

#### Installation customization

The Run:ai control-plane installation has been rewritten and no longer uses a _backend values file_. Instead, to customize the installation, use standard `--set` flags. If you have previously customized the installation, you must now extract these customizations and add them as `--set` flags to the helm installation:

* Find previous customizations to the control plane, if any exist. Run:ai provides a utility for this at `https://raw.githubusercontent.com/run-ai/docs/v2.13/install/backend/cp-helm-vals-diff.sh`. For information on how to use this utility, please contact Run:ai customer support.
* Search for the customizations you found in the [optional configurations](./backend.md#additional-runai-configurations-optional) table and add them in the new format.
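As a sketch of the new format, a customization recovered from the old values file maps onto a `--set` flag like this. The key and value here are hypothetical and not taken from the actual Run:ai chart:

```shell
# Hypothetical customization found in the old backend values file.
key="backend.someFeature.enabled"
value="true"

# Equivalent flag for the new --set based installation.
set_flag="--set ${key}=${value}"
echo "$set_flag"   # prints --set backend.someFeature.enabled=true
```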


## Upgrade Control Plane

### Upgrade from version 2.13, or later

=== "Connected"

``` bash
@@ -79,7 +91,8 @@ The Run:ai control-plane installation has been rewritten and is no longer using
helm get values runai-backend -n runai-backend > runai_control_plane_values.yaml
helm upgrade runai-backend control-plane-<NEW-VERSION>.tgz -n runai-backend -f runai_control_plane_values.yaml --reset-then-reuse-values
```
### Upgrade from version 2.9

### Upgrade from version 2.7, 2.8, 2.9, or 2.11

=== "Connected"

@@ -111,4 +124,5 @@ The Run:ai control-plane installation has been rewritten and is no longer using
1. The subdomain configured for the OpenShift cluster.

## Upgrade Cluster
To upgrade the cluster, follow the instructions [here](../../cluster-setup/cluster-upgrade.md).
