Merge pull request #935 from run-ai/policies-mess2 #936

Merged
merged 1 commit into from
Aug 8, 2024
35 changes: 20 additions & 15 deletions docs/admin/workloads/policies/overview.md
@@ -8,35 +8,40 @@ date: 2023-Dec-12

## Introduction

*Policies* allow administrators to impose restrictions and set default values for researcher workloads. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities. Enabling the *New Policy Manager* provides information about resources that are non-compliant to applied policies. Resources that are non-compliant will appear greyed out. To see how a resource is not compliant, press on the clipboard icon in the upper right-hand corner of the resource.
*Policies* allow administrators to impose restrictions and set default values for researcher workloads. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities.

!!! Note
    Policies from Run:ai versions 2.17 or lower will still work after enabling the New Policy Manager. For more information about policies for version 2.17 or lower, see [What are Policies](policies.md#what-are-policies).
Examples:

For example, an administrator can create and apply a policy that will restrict researchers from requesting more than 2 GPUs, or less than 1GB of memory per type of workload.

Another example is an administrator who wants to set different amounts of CPU, GPUs and memory for different kinds of workloads. A training workload can have a default of 1 GB of memory, or an interactive workload can have a default amount of GPUs.
* An administrator can create and apply a policy that restricts researchers from requesting more than 2 GPUs, or less than 1 GB of memory, per type of workload.
* An administrator can set different amounts of CPU, GPU, and memory for different kinds of workloads. For example, a training workload can have a default of 1 GB of memory, and an interactive workload can have a default number of GPUs.

Policies are created for each Run:ai project (Kubernetes namespace). When a policy is created in the `runai` namespace, it will take effect when there is no project-specific policy for workloads of the same kind.
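
The fallback described above can be pictured as a two-step lookup. The following is a minimal sketch only: the in-memory `POLICIES` store, the `resolve_policy` function, the project name `team-a`, and the `gpu_max` field are all hypothetical names chosen for illustration and are not part of the Run:ai API.

```python
from typing import Optional

# Hypothetical store keyed by (namespace, workload kind); not a Run:ai data structure.
POLICIES = {
    ("runai", "training"): {"gpu_max": 2},
    ("team-a", "training"): {"gpu_max": 1},
}


def resolve_policy(namespace: str, kind: str, policies: dict = POLICIES) -> Optional[dict]:
    """Return the project-specific policy if one exists, else the `runai` one."""
    specific = policies.get((namespace, kind))
    if specific is not None:
        return specific
    # No project-specific policy for this kind: fall back to the `runai` namespace.
    return policies.get(("runai", kind))
```

In this sketch, a project with its own training policy gets that policy, while any other project falls back to the one defined in the `runai` namespace for the same workload kind.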

In interactive workloads or workspaces, applied policies will only allow researchers access to resources that are permitted in the policy. This can include compute resources as well as node pools and node pool priority.
When using workspaces, applied policies will only allow researchers access to resources that are permitted in the policy. This can include compute resources as well as node pools and node pool priority.

## Older and Newer Policy technologies

Run:ai provides two policy technologies.

[**YAML-Based policies**](policies.md) are the older policies. These policies:

* Require access to Kubernetes to view or change.
* Do not manifest themselves in the Run:ai user interface and can thus create unexpected side effects.

To enable the new *Policy Manager*:
[**API-based policies**](workspaces-policy.md) are the newer policies. These policies:

1. Press the *Tools and Settings* icon, then press *General*.
2. Toggle the *New Policy Manager* switch to on.
* Appear in the Run:ai user interface.
* Can be viewed and modified via the user interface and the Control-plane API.
* Are only available with Run:ai clusters of version 2.18 and up.

To return to the previous *Policy Manager* toggle the switch off.

## Run:ai Policies vs. Kyverno Policies

Kyverno runs as a dynamic admission controller in a Kubernetes cluster. Kyverno receives validating and mutating admission webhook HTTP callbacks from the Kubernetes API server and applies matching policies to return results that enforce admission policies or reject requests. Kyverno policies can match resources using the resource kind, name, label selectors, and much more. For more information, see [How Kyverno Works](https://kyverno.io/docs/introduction/#how-kyverno-works){target=_blank}.

## Policy Details

For details on how to set a policy see [New Policies](workspaces-policy.md).

### Policy Inheritance
## Policy Inheritance

A policy configured to a specific scope, is applied to all elements in that scope. You can add more policy restrictions to individual elements in the scope in order to override the base policy or add more restrictions.
A policy is configured with a specific `scope` and is applied to all elements in that scope. You can add more policy restrictions to individual elements in the scope to override the base policy or add more restrictions.
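
One way to read the inheritance rule is as a merge in which element-level settings take precedence over the scope-level base. This is a rough sketch under that assumption; `effective_policy` and the field names `gpu_max` and `memory_min_gb` are hypothetical, and the actual merge semantics in Run:ai may differ.

```python
def effective_policy(scope_policy: dict, element_policy: dict) -> dict:
    """Combine a scope-wide base policy with per-element restrictions.

    Assumption: a key set on the element overrides the same key in the base,
    and keys present only on the element are added as extra restrictions.
    """
    merged = dict(scope_policy)    # start from the base policy of the scope
    merged.update(element_policy)  # element-level values override or extend it
    return merged


base = {"gpu_max": 2}                           # applies to every element in the scope
element = {"gpu_max": 1, "memory_min_gb": 1}    # stricter GPU cap plus an extra rule
combined = effective_policy(base, element)
```

Here `combined` carries the element's GPU limit in place of the base one and gains the additional memory restriction, while `base` itself is left unchanged.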

23 changes: 22 additions & 1 deletion docs/admin/workloads/policies/workspaces-policy.md
@@ -4,9 +4,30 @@ summary: This article outlines what is a policy and details the variables that

---

## Enabling the Policy Manager

To use API-based Policies you need to enable the *New Policy Manager*. The policy manager provides information about resources that are non-compliant with the applied policies.


To enable the new *Policy Manager*:

1. Press the *Tools and Settings* icon, then press *General*.
2. Toggle the *New Policy Manager* switch to on.

To return to the previous *Policy Manager* toggle the switch off.

!!! Note
    Using the new API-based policies will not disable the older [YAML-based](./policies.md) policies.


## Viewing Policy compliance

A Policy places resource restrictions and defaults on Workloads in the Run:ai platform. Restrictions and default values can be placed on CPUs, GPUs, and other resources or entities.

## Example
Non-compliant resources (e.g. data sources, compute resources) will appear greyed out. To see how a resource is not compliant, press on the clipboard icon in the upper right-hand corner of the resource.


## Example Policy

Below is an example policy you can use in your platform.

3 changes: 3 additions & 0 deletions mkdocs.yml
@@ -121,6 +121,9 @@ plugins:
'admin/admin-ui-setup/dashboard-analysis.md' : 'admin/performance/dashboard-analysis.md'
'index.md' : 'home/overview.md'
'admin/runai-setup/config/non-root-containers.md' : 'admin/authentication/non-root-containers.md'
'admin/workloads/policies/README.md' : 'admin/workloads/policies/overview.md'
# 'admin/workloads/policies' : 'admin/workloads/policies/overview.md'

nav:
- Home:
- 'Overview': 'home/overview.md'