Merge pull request #1320 from run-ai/Hotfixes_2.18
Update hotfixes-2-18.md
SherinDaher-Runai authored Dec 23, 2024
2 parents 1ed3ef9 + f5c6d15 commit 98431aa
Showing 1 changed file with 33 additions and 16 deletions.
49 changes: 33 additions & 16 deletions docs/home/changelog/hotfixes-2-18.md
@@ -12,28 +12,45 @@ The following is a list of the known and fixed issues for Run:ai V2.18.

 | Internal ID | Hotfix # | Description |
 | :---- | :---- | :---- |
-| RUN-23291 | 2.18.64 | CLI change text to be user friendly |
-| RUN-23283 | 2.18.64 | Fixed a permissions issue with the Analytics dashboard post upgrade for SSO Users |
-| RUN-23420 | 2.18.63 | Replaced Redis with Keydb |
-| RUN-23140 | 2.18.63 | Fixed an issue where distributed workloads were created with the wrong types |
-| RUN-23130 | 2.18.63 | Fixed an issue where inference-workload-controller crashed when WorkloadOwnershipProtection was enabled |
-| RUN-23334 | 2.18.62 | Updated core Dockerfiles to ubi9 |
-| RUN-23296 | 2.18.62 | Fixed an issue in the CLI where runai attach did not work with auto-complete |
+| RUN-24020 | 2.18.77 | Fixed a security vulnerability in k8s.io.kubernetes with CVE CVE-2024-0793. |
+| RUN-24021 | 2.18.77 | Fixed a security vulnerability in pam with CVE CVE-2024-10963. |
+| RUN-23798 | 2.18.75 | Fixed an issue in distributed PyTorch workloads where the worker pods are deleted immediately after completion, not allowing logs to be viewed. |
+| RUN-23838 | 2.18.74 | Fixed an issue where the command-line interface could not access resources when configured as single-sign on in a self-hosted environment. |
+| RUN-23561 | 2.18.74 | Fixed an issue where the frontend in airgapped environment attempted to download font resources from the internet. |
+| RUN-23789 | 2.18.73 | Fixed an issue where in some cases, it was not possible to download the latest version of the command line interface. |
+| RUN-23790 | 2.18.73 | Fixed an issue where in some cases it was not possible to download the Windows version of the command line interface. |
+| RUN-23855 | 2.18.73 | Fixed an issue where the pods list in the UI showed past pods. |
+| RUN-23909 | 2.18.73 | Fixed an issue where users based on group permissions cannot see dashboards. |
+| RUN-23857 | 2.18.72 | Dashboard to transition from Grafana v9 to v10. |
+| RUN-24010 | 2.18.72 | Fixed an infinite loop issue in the cluster-sync service. |
+| RUN-23040 | 2.18.72 | Fixed an edge case where the Run:ai container toolkit hangs when user is spawning hundreds of sub-processes. |
+| RUN-23802 | 2.18.70 | Fixed an issue where new scheduling rules were not applied to existing workloads, if those new rules were set on existing projects which had no scheduling rules before. |
+| RUN-23211 | 2.18.70 | Fixed an issue where workloads were stuck at "Pending" when the command-line interface flag --gpu-memory was set to zero. |
+| RUN-23778 | 2.18.68 | Fixed an issue where in single-sign-on configuration, the mapping of UID and other properties would sometimes disappear. |
+| RUN-23762 | 2.18.68 | Fixed an issue where the wrong version of a Grafana dashboard was displayed in the UI. |
+| RUN-21198 | 2.18.66 | Fixed an issue where creating a training workload via yaml (kubectl apply -f) and specifying spec.namePrefix, created infinite jobs. |
+| RUN-23541 | 2.18.65 | Fixed an issue where in some cases workload authorization did not work properly due to wrong oidc configuration. |
+| RUN-23283 | 2.18.64 | Fixed a permissions issue with the Analytics dashboard post upgrade for SSO Users. |
+| RUN-23420 | 2.18.63 | Replaced Redis with Keydb. |
+| RUN-23140 | 2.18.63 | Fixed an issue where distributed workloads were created with the wrong types. |
+| RUN-23130 | 2.18.63 | Fixed an issue where inference-workload-controller crashed when WorkloadOwnershipProtection was enabled. |
+| RUN-23334 | 2.18.62 | Updated core Dockerfiles to ubi9. |
+| RUN-23296 | 2.18.62 | Fixed an issue in the CLI where runai attach did not work with auto-complete. |
 | RUN-23215 | 2.18.62 | Fixed an issue where metrics requests from backend to mimir failed for certain tenants. |
 | RUN-22138 | 2.18.62 | Fixed an issue where private URL user(s) input was an email and not a string. |
-| RUN-23282 | 2.18.61 | CLI documentation fixes |
-| RUN-23055 | 2.18.60 | Fixed unified Distributed and Training CLI commands |
-| RUN-23243 | 2.18.59 | Fixed an issue where the scope tree wasn't calculating permissions correctly |
-| RUN-22463 | 2.18.59 | Fixed an error in CLI bash command |
-| RUN-22314 | 2.18.59 | Fixed distributed framework filtering in API commands |
-| RUN-23142 | 2.18.58 | Fixed an issue where advanced GPU metrics per-gpu don't have gpu label |
+| RUN-23282 | 2.18.61 | CLI documentation fixes. |
+| RUN-23055 | 2.18.60 | Fixed unified Distributed and Training CLI commands. |
+| RUN-23243 | 2.18.59 | Fixed an issue where the scope tree wasn't calculating permissions correctly. |
+| RUN-22463 | 2.18.59 | Fixed an error in CLI bash command. |
+| RUN-22314 | 2.18.59 | Fixed distributed framework filtering in API commands. |
+| RUN-23142 | 2.18.58 | Fixed an issue where advanced GPU metrics per-gpu don't have gpu label. |
 | RUN-23001 | 2.18.58 | Fixed an issue of false overcommit on out-of-memory killed in the “swap” feature. |
-| RUN-22851 | 2.18.58 | Fixed an issue where client may get stuck on device lock acquired during “swap” out-migration |
+| RUN-22851 | 2.18.58 | Fixed an issue where client may get stuck on device lock acquired during “swap” out-migration. |
 | RUN-22758 | 2.18.58 | Fixed an issue where inference workload showed wrong status when submission failed. |
 | RUN-22544 | 2.18.58 | Updated Grafana version for security vulnerabilities. |
-| RUN-23055 | 2.18.57 | Fixed the unified Distributed and Training CLI commands |
+| RUN-23055 | 2.18.57 | Fixed the unified Distributed and Training CLI commands. |
 | RUN-23014 | 2.18.56 | Fixed an issue where node-scale-adjuster might not create a scaling pod if it is in cool-down and the pod was not updated after that. |
-| RUN-22660 | 2.18.56 | Fixed an issue where workload charts have an unclear state |
+| RUN-22660 | 2.18.56 | Fixed an issue where workload charts have an unclear state. |
 | RUN-22457 |2.18.55 | Fixed an issue where in rare edge cases the cluster-sync pod was out of memory. |
 | RUN-21825 |2.18.55 | Fixed all CVEs in Run:ai's Goofys-based image used for S3 integration. |
 | RUN-22871 |2.18.55 | Fixed an issue in runai-container-toolkit where in certain cases when a process is preempted, OOMKill metrics were not published correctly. |