From 06fc1fd6a4c4712abd2b3b5549dd544b16b713dc Mon Sep 17 00:00:00 2001
From: Sherin
Date: Mon, 23 Dec 2024 09:42:53 +0200
Subject: [PATCH 1/2] Update hotfixes-2-18.md

---
 docs/home/changelog/hotfixes-2-18.md | 50 +++++++++++++++++++---------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/docs/home/changelog/hotfixes-2-18.md b/docs/home/changelog/hotfixes-2-18.md
index 1ecc54af57..eedad390fd 100644
--- a/docs/home/changelog/hotfixes-2-18.md
+++ b/docs/home/changelog/hotfixes-2-18.md
@@ -12,28 +12,46 @@ The following is a list of the known and fixed issues for Run:ai V2.18.
 
 | Internal ID | Hotfix # | Description |
 | :---- | :---- | :---- |
-| RUN-23291 | 2.18.64 | CLI change text to be user friendly |
-| RUN-23283 | 2.18.64 | Fixed a permissions issue with the Analytics dashboard post upgrade for SSO Users |
-| RUN-23420 | 2.18.63 | Replaced Redis with Keydb |
-| RUN-23140 | 2.18.63 | Fixed an issue where distributed workloads were created with the wrong types |
-| RUN-23130 | 2.18.63 | Fixed an issue where inference-workload-controller crashed when WorkloadOwnershipProtection was enabled |
-| RUN-23334 | 2.18.62 | Updated core Dockerfiles to ubi9 |
-| RUN-23296 | 2.18.62 | Fixed an issue in the CLI where runai attach did not work with auto-complete |
+| RUN-24020 | 2.18.77 | Fixed a security vulnerability in k8s.io.kubernetes with CVE CVE-2024-0793. |
+| RUN-24021 | 2.18.77 | Fixed a security vulnerability in pam with CVE CVE-2024-10963. |
+| RUN-23798 | 2.18.75 | Fixed an issue in distributed PyTorch workloads where the worker pods are deleted immediately after completion, not allowing logs to be viewed. |
+| RUN-23838 | 2.18.74 | Fixed an issue where the command-line interface could not access resources when configured as single-sign on in a self-hosted environment. |
+| RUN-23561 | 2.18.74 | Fixed an issue where the frontend in airgapped environment attempted to download font resources from the internet. |
+| RUN-23789 | 2.18.73 | Fixed an issue where in some cases, it was not possible to download the latest version of the command line interface. |
+| RUN-23790 | 2.18.73 | Fixed an issue where in some cases it was not possible to download the Windows version of the command line interface. |
+| RUN-23855 | 2.18.73 | Fixed an issue where the pods list in the UI showed past pods. |
+| RUN-23909 | 2.18.73 | Fixed an issue where users based on group permissions cannot see dashboards. |
+| RUN-23857 | 2.18.72 | Dashboard to transition from Grafana v9 to v10. |
+| RUN-24010 | 2.18.72 | Fixed an infinite loop issue in the cluster-sync service. |
+| RUN-23040 | 2.18.72 | Fixed an edge case where the Run:ai container toolkit hangs when user is spawning hundreds of sub-processes. |
+| RUN-23802 | 2.18.70 | Fixed an issue where new scheduling rules were not applied to existing workloads, if those new rules were set on existing projects which had no scheduling rules before. |
+| RUN-23211 | 2.18.70 | Fixed an issue where workloads were stuck at "Pending" when the command-line interface flag --gpu-memory was set to zero. |
+| RUN-23778 | 2.18.68 | Fixed an issue where in single-sign-on configuration, the mapping of UID and other properties would sometimes disappear. |
+| RUN-23762 | 2.18.68 | Fixed an issue where the wrong version of a Grafana dashboard was displayed in the UI. |
+| RUN-21198| 2.18.66 | Fixed an issue where creating a training workload via yaml (kubectl apply -f) and specifying spec.namePrefix, created infinite jobs. |
+| RUN-23541 | 2.18.65 | Fixed an issue where in some cases workload authorization did not work properly due to wrong oidc configuration. |
+| RUN-23291 | 2.18.64 | CLI change text to be user friendly. |
+| RUN-23283 | 2.18.64 | Fixed a permissions issue with the Analytics dashboard post upgrade for SSO Users. |
+| RUN-23420 | 2.18.63 | Replaced Redis with Keydb. |
+| RUN-23140 | 2.18.63 | Fixed an issue where distributed workloads were created with the wrong types. |
+| RUN-23130 | 2.18.63 | Fixed an issue where inference-workload-controller crashed when WorkloadOwnershipProtection was enabled. |
+| RUN-23334 | 2.18.62 | Updated core Dockerfiles to ubi9. |
+| RUN-23296 | 2.18.62 | Fixed an issue in the CLI where runai attach did not work with auto-complete. |
 | RUN-23215 | 2.18.62 | Fixed an issue where metrics requests from backend to mimir failed for certain tenants. |
 | RUN-22138 | 2.18.62 | Fixed an issue where private URL user(s) input was an email and not a string. |
-| RUN-23282 | 2.18.61 | CLI documentation fixes |
-| RUN-23055 | 2.18.60 | Fixed unified Distributed and Training CLI commands |
-| RUN-23243 | 2.18.59 | Fixed an issue where the scope tree wasn't calculating permissions correctly |
-| RUN-22463 | 2.18.59 | Fixed an error in CLI bash command |
-| RUN-22314 | 2.18.59 | Fixed distributed framework filtering in API commands |
-| RUN-23142 | 2.18.58 | Fixed an issue where advanced GPU metrics per-gpu don't have gpu label |
+| RUN-23282 | 2.18.61 | CLI documentation fixes. |
+| RUN-23055 | 2.18.60 | Fixed unified Distributed and Training CLI commands. |
+| RUN-23243 | 2.18.59 | Fixed an issue where the scope tree wasn't calculating permissions correctly. |
+| RUN-22463 | 2.18.59 | Fixed an error in CLI bash command. |
+| RUN-22314 | 2.18.59 | Fixed distributed framework filtering in API commands. |
+| RUN-23142 | 2.18.58 | Fixed an issue where advanced GPU metrics per-gpu don't have gpu label. |
 | RUN-23001 | 2.18.58 | Fixed an issue of false overcommit on out-of-memory killed in the “swap” feature. |
-| RUN-22851 | 2.18.58 | Fixed an issue where client may get stuck on device lock acquired during “swap” out-migration |
+| RUN-22851 | 2.18.58 | Fixed an issue where client may get stuck on device lock acquired during “swap” out-migration. |
 | RUN-22758 | 2.18.58 | Fixed an issue where inference workload showed wrong status when submission failed. |
 | RUN-22544 | 2.18.58 | Updated Grafana version for security vulnerabilities. |
-| RUN-23055 | 2.18.57 | Fixed the unified Distributed and Training CLI commands |
+| RUN-23055 | 2.18.57 | Fixed the unified Distributed and Training CLI commands. |
 | RUN-23014 | 2.18.56 | Fixed an issue where node-scale-adjuster might not create a scaling pod if it is in cool-down and the pod was not updated after that. |
-| RUN-22660 | 2.18.56 | Fixed an issue where workload charts have an unclear state |
+| RUN-22660 | 2.18.56 | Fixed an issue where workload charts have an unclear state. |
 | RUN-22457 |2.18.55 | Fixed an issue where in rare edge cases the cluster-sync pod was out of memory. |
 | RUN-21825 |2.18.55 | Fixed all CVEs in Run:ai's Goofys-based image used for S3 integration. |
 | RUN-22871 |2.18.55 | Fixed an issue in runai-container-toolkit where in certain cases when a process is preempted, OOMKill metrics were not published correctly. |

From f5c6d15d4a7a5d24bcf28209529c8d968ac67e6c Mon Sep 17 00:00:00 2001
From: Sherin
Date: Mon, 23 Dec 2024 13:48:40 +0200
Subject: [PATCH 2/2] Update hotfixes-2-18.md

---
 docs/home/changelog/hotfixes-2-18.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/home/changelog/hotfixes-2-18.md b/docs/home/changelog/hotfixes-2-18.md
index eedad390fd..54e7a3de89 100644
--- a/docs/home/changelog/hotfixes-2-18.md
+++ b/docs/home/changelog/hotfixes-2-18.md
@@ -28,9 +28,8 @@ The following is a list of the known and fixed issues for Run:ai V2.18.
 | RUN-23211 | 2.18.70 | Fixed an issue where workloads were stuck at "Pending" when the command-line interface flag --gpu-memory was set to zero. |
 | RUN-23778 | 2.18.68 | Fixed an issue where in single-sign-on configuration, the mapping of UID and other properties would sometimes disappear. |
 | RUN-23762 | 2.18.68 | Fixed an issue where the wrong version of a Grafana dashboard was displayed in the UI. |
-| RUN-21198| 2.18.66 | Fixed an issue where creating a training workload via yaml (kubectl apply -f) and specifying spec.namePrefix, created infinite jobs. |
+| RUN-21198 | 2.18.66 | Fixed an issue where creating a training workload via yaml (kubectl apply -f) and specifying spec.namePrefix, created infinite jobs. |
 | RUN-23541 | 2.18.65 | Fixed an issue where in some cases workload authorization did not work properly due to wrong oidc configuration. |
-| RUN-23291 | 2.18.64 | CLI change text to be user friendly. |
 | RUN-23283 | 2.18.64 | Fixed a permissions issue with the Analytics dashboard post upgrade for SSO Users. |
 | RUN-23420 | 2.18.63 | Replaced Redis with Keydb. |
 | RUN-23140 | 2.18.63 | Fixed an issue where distributed workloads were created with the wrong types. |