diff --git a/docs/Researcher/scheduling/gpu-memory-swap.md b/docs/Researcher/scheduling/gpu-memory-swap.md index a379851f1f..df95c91a2e 100644 --- a/docs/Researcher/scheduling/gpu-memory-swap.md +++ b/docs/Researcher/scheduling/gpu-memory-swap.md @@ -115,4 +115,4 @@ If you prefer your workloads not to be swapped into CPU memory, you can specify CPU memory is limited, and since a single CPU serves multiple GPUs on a node, this number is usually between 2 and 8. For example, when using 80GB of GPU memory, each swapped workload consumes up to 80GB (but may use less) assuming each GPU is shared between 2-4 workloads. In this example, you can see how the swap memory can become very large. Therefore, we give administrators a way to limit the size of the CPU reserved memory for swapped GPU memory on each swap-enabled node. -Limiting the CPU reserved memory means that there may be scenarios where the GPU memory cannot be swapped out to the CPU reserved RAM. Whenever the CPU reserved memory for swapped GPU memory is exhausted, the workloads currently running will not be swapped out to the CPU reserved RAM, instead, *Node Level Scheduler* and *Dynamic Fractions* logic takes over and provides GPU resource optimization.see [Dynamic Fractions](fractions.md#dynamic-mig) and [Node Level Scheduler](node-level-scheduler.md#how-to-configure-node-level-scheduler). +Limiting the CPU reserved memory means that there may be scenarios where the GPU memory cannot be swapped out to the CPU reserved RAM. Whenever the CPU reserved memory for swapped GPU memory is exhausted, the workloads currently running will not be swapped out to the CPU reserved RAM; instead, *Node Level Scheduler* logic takes over and provides GPU resource optimization. See [Node Level Scheduler](node-level-scheduler.md#how-to-configure-node-level-scheduler). diff --git a/docs/Researcher/workloads/trainings.md b/docs/Researcher/workloads/trainings.md index d2261caaf8..8bb6b49145 100644 --- a/docs/Researcher/workloads/trainings.md +++ b/docs/Researcher/workloads/trainings.md @@ -55,7 +55,7 @@ To add a training: 5. Enter the *Container path* for volume target location. 6. Select a *Volume persistency*. -9. (Optional) In the *Data sources* pane, press *add a new data source*. For more information, see [Creating a new data source](../workloads/assets/datasources.md#create-a-new-data-source) When complete press, *Create Data Source*. +9. (Optional) In the *Data sources* pane, press *add a new data source*. For more information, see [Creating a new data source](../workloads/assets/datasources.md#adding-a-new-data-source). When complete, press *Create Data Source*. 10. (Optional) In the *General* pane, add special settings for your training: 1. Press *Auto-deletion* to delete the training automatically when it either completes or fails. You can configure the timeframe in days, hours, minutes, and seconds. If the timeframe is set to 0, the training will be deleted immediately after it completes or fails. (default = 30 days) @@ -77,7 +77,7 @@ To add a training: 5. Enter the *Container path* for volume target location. 6. Select a *Volume persistency*. - 4. (Optional) In the *Data sources* pane, press *add a new data source*. For more information, see [Creating a new data source](../workloads/assets/datasources.md#create-a-new-data-source) When complete press, *Create Data Source*. + 4. (Optional) In the *Data sources* pane, press *add a new data source*.
For more information, see [Creating a new data source](../workloads/assets/datasources.md#adding-a-new-data-source). When complete, press *Create Data Source*. 5. (Optional) In the *General* pane, add special settings for your training: 1. Press *Auto-deletion* to delete the training automatically when it either completes or fails. You can configure the timeframe in days, hours, minutes, and seconds. If the timeframe is set to 0, the training will be deleted immediately after it completes or fails. (default = 30 days) diff --git a/docs/admin/config/create-k8s-assets-in-advance.md b/docs/admin/config/create-k8s-assets-in-advance.md new file mode 100644 index 0000000000..ebb996ffb3 --- /dev/null +++ b/docs/admin/config/create-k8s-assets-in-advance.md @@ -0,0 +1,48 @@ +# Creating Kubernetes Assets in Advance +This article describes how to mark Kubernetes assets for use by Run:ai. +## Creating PVCs in advance +Add PVCs in advance to be used when creating a PVC-type data source via the Run:ai UI (a sketch of a labeled PVC appears at the end of this article). +Follow the steps below for each required scope: +### Cluster scope +1. Create the PVC in the Run:ai namespace (runai) +2. To authorize Run:ai to use the PVC, label it: `run.ai/cluster-wide: "true"` + The PVC is now displayed for that scope in the list of existing PVCs. +### Department scope +1. Create the PVC in the Run:ai namespace (runai) +2. To authorize Run:ai to use the PVC, label it: `run.ai/department: "<department-id>"`, where `<department-id>` is the department's ID + The PVC is now displayed for that scope in the list of existing PVCs. +### Project scope +1. Create the PVC in the project's namespace + The PVC is now displayed for that scope in the list of existing PVCs. +## Creating ConfigMaps in advance +Add ConfigMaps in advance to be used when creating a ConfigMap-type data source via the Run:ai UI. +### Cluster scope +1. Create the ConfigMap in the Run:ai namespace (runai) +2. To authorize Run:ai to use the ConfigMap, label it: `run.ai/cluster-wide: "true"` + The ConfigMap is now displayed for that scope in the list of existing ConfigMaps. +### Department scope +1. Create the ConfigMap in the Run:ai namespace (runai) +2. To authorize Run:ai to use the ConfigMap, label it: `run.ai/department: "<department-id>"`, where `<department-id>` is the department's ID + The ConfigMap is now displayed for that scope in the list of existing ConfigMaps. +### Project scope +1. Create the ConfigMap in the project's namespace + The ConfigMap is now displayed for that scope in the list of existing ConfigMaps.
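+For reference, a minimal sketch of a PVC created in advance for the cluster scope. The name, storage class, access mode, and size are illustrative placeholders; only the `runai` namespace and the `run.ai/cluster-wide` label come from the steps above:
+
+```yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: shared-data               # illustrative name
+  namespace: runai                # the Run:ai namespace, required for cluster scope
+  labels:
+    run.ai/cluster-wide: "true"   # authorizes Run:ai to use this PVC
+spec:
+  accessModes:
+    - ReadWriteMany
+  storageClassName: standard      # assumed storage class; use one available in your cluster
+  resources:
+    requests:
+      storage: 10Gi
+```
+
+An existing PVC can also be labeled after the fact, for example with `kubectl label pvc shared-data -n runai run.ai/cluster-wide=true`.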
diff --git a/docs/admin/config/large-clusters.md b/docs/admin/config/large-clusters.md index fc8c75ad7c..497a903057 100644 --- a/docs/admin/config/large-clusters.md +++ b/docs/admin/config/large-clusters.md @@ -112,4 +112,4 @@ queueConfig: This [article](https://last9.io/blog/how-to-scale-prometheus-remote-write/){target=_blank} provides additional details and insight. -Also, note that this configuration enlarges the Prometheus queues and thus increases the required memory. It is hence suggested to reduce the metrics retention period as described [here](../runai-setup/cluster-setup/customize-cluster-install.md#configurations) +Also, note that this configuration enlarges the Prometheus queues and thus increases the required memory. It is therefore suggested to reduce the metrics retention period as described in [Advanced Cluster Configuration](./advanced-cluster-config.md). diff --git a/docs/admin/config/node-roles.md b/docs/admin/config/node-roles.md index 4f5c4891d9..fac57c4ad5 100644 --- a/docs/admin/config/node-roles.md +++ b/docs/admin/config/node-roles.md @@ -31,7 +31,7 @@ runai-adm remove node-role --runai-system-worker !!! Important - To enable this feature, you must set the cluster configuration flag `global.nodeAffinity.restrictScheduling` to `true`. For more information see [customize cluster](../runai-setup/cluster-setup/customize-cluster-install.md#configurations). + To enable this feature, you must set the cluster configuration flag `global.nodeAffinity.restrictScheduling` to `true`. For more information, see [customize cluster](./advanced-cluster-config.md). Separate nodes into those that: diff --git a/docs/admin/config/shared-storage.md b/docs/admin/config/shared-storage.md index 584cbcae64..c9d7cd84af 100644 --- a/docs/admin/config/shared-storage.md +++ b/docs/admin/config/shared-storage.md @@ -18,7 +18,7 @@ Run:ai [Data Sources](../../platform-admin/workloads/assets/datasources.md) supp Storage classes in Kubernetes define how storage is provisioned and managed. This allows you to select storage types optimized for AI workloads. For example, you can choose storage with high IOPS (Input/Output Operations Per Second) for rapid data access during intensive training sessions, or tiered storage options to balance cost and performance based on your organization's requirements. This approach supports dynamic provisioning, enabling storage to be allocated on-demand as required by your applications. -Run:ai data sources such as [Persistent Volume Claims (PVC)](../../platform-admin/workloads/assets/existing-PVC.md) and [Data Volumes](../../platform-admin/workloads/assets/data-volumes.md) leverage storage class to manage and allocate storage efficiently. This ensures that the most suitable storage option is always accessible, contributing to the efficiency and performance of AI workloads. +Run:ai data sources such as [Persistent Volume Claims (PVC)](../../platform-admin/workloads/assets/datasources.md#pvc) and [Data Volumes](../../platform-admin/workloads/assets/data-volumes.md) leverage storage classes to manage and allocate storage efficiently. This ensures that the most suitable storage option is always accessible, contributing to the efficiency and performance of AI workloads. !!! Note Run:ai lists all available storage classes in the Kubernetes cluster, making it easy for users to select the appropriate storage. Additionally, policies can be set to restrict or enforce the use of specific storage classes, to help maintain compliance with organizational standards and optimize resource utilization.
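+For illustration, a storage class tuned for high-IOPS AI workloads might look like the sketch below. The name, provisioner, and parameters are assumptions for an AWS EBS CSI setup and will differ in your environment:
+
+```yaml
+apiVersion: storage.k8s.io/v1
+kind: StorageClass
+metadata:
+  name: fast-ssd                 # illustrative name
+provisioner: ebs.csi.aws.com     # assumed CSI driver; substitute your own
+parameters:
+  type: io2                      # high-IOPS volume type in this example
+  iops: "16000"
+volumeBindingMode: WaitForFirstConsumer
+allowVolumeExpansion: true
+```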
diff --git a/docs/home/whats-new-2-13.md b/docs/home/whats-new-2-13.md index e7f746e333..67e24d9247 100644 --- a/docs/home/whats-new-2-13.md +++ b/docs/home/whats-new-2-13.md @@ -105,7 +105,7 @@ The association between workspaces and node pools is done using *Compute resourc **PVC data sources** -* Added support for PVC block storage in the *New data source* form. In the *New data source* form for a new PVC data source, in the *Volume mode* field, select from *Filesystem* or *Block*. For more information, see [Create a PVC data source](../Researcher/workloads/assets/datasources.md#create-a-pvc-data-source). +* Added support for PVC block storage in the *New data source* form. In the *New data source* form for a new PVC data source, use the *Volume mode* field to select either *Filesystem* or *Block*. For more information, see [Create a PVC data source](../Researcher/workloads/assets/datasources.md#pvc). **Credentials** diff --git a/docs/home/whats-new-2-17.md b/docs/home/whats-new-2-17.md index bb51c72d3e..c7286a14a5 100644 --- a/docs/home/whats-new-2-17.md +++ b/docs/home/whats-new-2-17.md @@ -41,7 +41,7 @@ date: 2024-Apr-14 #### Assets -* Added the capability to use a ConfigMap as a data source. The ability to use a ConfigMap as a data source can be configured in the *Data sources* UI, the CLI, and as part of a policy. For more information, see [Setup a ConfigMap as a data source](../Researcher/workloads/assets/datasources.md#create-a-configmap-data-source), [Setup a ConfigMap as a volume using the CLI](../Researcher/cli-reference/runai-submit.md#-configmap-volume-namepath). +* Added the capability to use a ConfigMap as a data source. This can be configured in the *Data sources* UI, the CLI, and as part of a policy. For more information, see [Setup a ConfigMap as a data source](../Researcher/workloads/assets/datasources.md#configmap) and [Setup a ConfigMap as a volume using the CLI](../Researcher/cli-reference/runai-submit.md#-configmap-volume-namepath). * Added a *Status* column to the *Credentials* table and the *Data sources* table. The *Status* column displays the state of the resource and provides troubleshooting information about that asset. For more information, see the [Credentials table](../platform-admin/workloads/assets/credentials.md#credentials-table) and the [Data sources table](../Researcher/workloads/assets/datasources.md#data-sources-table). diff --git a/docs/home/whats-new-2-18.md b/docs/home/whats-new-2-18.md index 8cb75aa47d..61e776626c 100644 --- a/docs/home/whats-new-2-18.md +++ b/docs/home/whats-new-2-18.md @@ -77,7 +77,7 @@ date: 2024-June-14 For more information, see [Data Volumes](../platform-admin/workloads/assets/data-volumes.md). (Requires minimum cluster version v2.18). -* Added new data source of type *Secret*. Run:ai now allows you to configure a *Credential* as a data source. A *Data source* of type *Secret* is best used in workloads so that access to 3rd party interfaces and storage used in containers, keep access credentials hidden. For more information, see [Secrets as a data source](../Researcher/workloads/assets/datasources.md#create-a-secret-as-data-source). +* Added a new data source of type *Secret*. Run:ai now allows you to configure a *Credential* as a data source. A *Data source* of type *Secret* is best used in workloads that access 3rd-party interfaces and storage from containers while keeping the access credentials hidden. For more information, see [Secrets as a data source](../Researcher/workloads/assets/datasources.md#secret). * Updated the logic of the data source initializing state, which keeps the workload in "initializing" status until S3 data is fully mapped. For more information, see the [Sidecar containers documentation](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/).
diff --git a/docs/platform-admin/workloads/assets/credentials.md b/docs/platform-admin/workloads/assets/credentials.md index 8cfec455b8..e3276e72ca 100644 --- a/docs/platform-admin/workloads/assets/credentials.md +++ b/docs/platform-admin/workloads/assets/credentials.md @@ -165,13 +165,13 @@ You can use credentials (secrets) in various ways within the system ### Access private data sources -To access private data sources, attach credentials to data sources of the following types: [Git](./datasources.md#create-a-git-data-source), [S3 Bucket](./datasources.md#create-an-s3-data-source) +To access private data sources, attach credentials to data sources of the following types: [Git](./datasources.md#git) and [S3 Bucket](./datasources.md#s3-bucket). ### Use directly within the container To use the secret directly from within the container, you can choose between the following options: -1. Get the secret mounted to the file system by using the [Generic secret](./datasources.md#create-a-secret-as-data-source) data source +1. Get the secret mounted to the file system by using the [Generic secret](./datasources.md#secret) data source. 2. Get the secret as an environment variable injected into the container. There are two equivalent ways to inject the environment variable. a. By adding it to the Environment asset. b. By adding it ad-hoc as part of the workload.
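+For context, injecting a secret as an environment variable maps to the standard Kubernetes `secretKeyRef` mechanism. A minimal container-spec sketch, with illustrative secret and key names:
+
+```yaml
+containers:
+  - name: trainer
+    image: python:3.11
+    env:
+      - name: AWS_SECRET_ACCESS_KEY   # variable name seen inside the container
+        valueFrom:
+          secretKeyRef:
+            name: my-credentials      # illustrative Kubernetes Secret name
+            key: secretKey            # illustrative key within that Secret
+```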
diff --git a/docs/platform-admin/workloads/assets/datasources.md b/docs/platform-admin/workloads/assets/datasources.md index 6dc510ce28..2d74cb6cdd 100644 --- a/docs/platform-admin/workloads/assets/datasources.md +++ b/docs/platform-admin/workloads/assets/datasources.md @@ -1,134 +1,219 @@ -## Introduction - - -A _data source_ is a location where data sets relevant to the research are stored. Workspaces can be attached to several data sources for reading and writing. The data can be located locally or in the cloud. Run:ai data sources can use a variety of storage technologies such as Git, S3, NFS, PVC, and more. - -The data source is an __optional__ building block for the creation of a workspace. - -![](img/8-ds-types.png "Data source types") - -## Create a new data source - -When you select `New Compute Resource` you will be presented with various data source options described below. - -### Create an NFS data source - -To create an NFS data source, provide: - -* A data source name. -* A Run:ai scope (cluster, department, or project) which is assigned to that item and all its subsidiaries. -* An NFS server. -* The path to the data within the server. -* The path within the container where the data will be mounted (the workload creator is able to override this when submitting the workload). - -The data can be set as read-write or limited to read-only permission regardless of any other user privileges. - -### Create a PVC data source - -To create an PVC data source, provide: - -* A data source name -* A Run:ai scope (cluster, department, or project) which is assigned to that item and all its subsidiaries. -* Select an existing PVC or create a new one by providing: - - * A claim name - * A storage class - * Access mode - * Required storage size - * Volume system mode - -* The path within the container where the data will be mounted (the workload creator is able to override this when submitting the workload). - -You can see the status of the resources created in the [Data sources table](#data-sources-table). - -### Create an S3 data source - -S3 storage saves data in *buckets*. S3 is typically attributed to AWS cloud service but can also be used as a separate service unrelated to Amazon. - -To create an S3 data source, provide - -* A data source name -* A Run:ai scope (cluster, department, or project) which is assigned to that item and all its subsidiaries. -* The relevant S3 service URL server -* The bucket name of the data. -* The path within the container where the data will be mounted (the workload creator is able to override this when submitting the workload). - -An S3 data source can be public or private. For the latter option, please select the relevant credentials associated with the project to allow access to the data. S3 buckets that use credentials will have a status associated with it. For more information, see [Data sources table](#data-sources-table). - -### Create a Git data source - -To create a Git data source, provide: - -* A data source name. -* A Run:ai scope (cluster, department, or project) which is assigned to that item and all its subsidiaries. -* The relevant repository URL. -* The path within the container where the data will be mounted (the workload creator is able to override this when submitting the workload). - -The Git data source can be public or private. To allow access to a private Git data source, you must select the relevant credentials associated with the project. Git data sources that use credentials will have a status associated with it. For more information, see [Data sources table](#data-sources-table). - -### Create a host path data source - -To create a host path data source, provide: - -* A data source name. -* A Run:ai scope (cluster, department, or project) which is assigned to that item and all its subsidiaries. -* The relevant path on the host. -* The path within the container where the data will be mounted (the workload creator is able to override this when submitting the workload). +This article explains what data sources are and how to create and use them. + +Data sources are a type of [workload asset](./overview.md) and represent a location where data is actually stored. They may represent a remote data location, such as NFS, Git, or S3, or a Kubernetes local resource, such as PVC, ConfigMap, HostPath, or Secret. + +Defining a data source simplifies the mapping of the data into the workload's file system and handles the mounting process during workload creation for reading and writing. These data sources are reusable and can be easily integrated and used by AI practitioners while submitting workloads across various scopes. + +## Data sources table + +The data sources table can be found under __Data sources__ in the Run:ai platform. + +The data sources table provides a list of all the data sources defined in the platform and allows you to manage them. + +![](img/data-source-table.png) + +The data sources table comprises the following columns: + +| Column | Description | +| --- | --- | +| Data source | The name of the data source | +| Description | A description of the data source | +| Type | The type of data source connected – e.g., S3 bucket, PVC, or others | +| Status | The different lifecycle phases and representation of the data source condition | +| Scope | The [scope](./overview.md#asset-scope) of the data source within the organizational tree.
Click the scope name to view the organizational tree diagram | +| Workload(s) | The list of existing workloads that use the data source | +| Template(s) | The list of workload templates that use the data source | +| Created by | The user who created the data source | +| Creation time | The timestamp for when the data source was created | +| Cluster | The cluster that the data source is associated with | + +### Data sources status + +The following table describes the data sources' condition and whether they were created successfully for the selected [scope](./overview.md#asset-scope). + +| Status | Description | +| --- | --- | +| No issues found | No issues were found while creating the data source | +| Issues found | Issues were found while propagating the data source credentials | +| Issues found | The cluster could not be accessed | +| Creating… | The data source is being created | +| No status / "-" | When the data source's scope is an account, the current version of the cluster is not up to date, or the asset is not a cluster-syncing entity, the status can't be displayed | + +### Customizing the table view + +* Filter - Click ADD FILTER, select the column to filter by, and enter the filter values +* Search - Click SEARCH and type the value to search by +* Sort - Click each column header to sort by +* Column selection - Click COLUMNS and select the columns to display in the table +* Download table - Click MORE and then click 'Download as CSV' +* Refresh - Click REFRESH to update the table with the latest data + +## Adding a new data source + +To create a new data source: + +1. Click __+NEW DATA SOURCE__ +2. Select the data source type from the list. Follow the step-by-step guide for each data source type: + +### NFS + +A Network File System ([NFS](https://kubernetes.io/docs/concepts/storage/volumes/#nfs){target=_blank}) is a Kubernetes concept used for sharing storage in the cluster among different pods. Like a PVC, the NFS volume's content remains preserved, even outside the lifecycle of a single pod. However, unlike PVCs, which abstract storage management, NFS provides a method for network-based file sharing. The NFS volume can be pre-populated with data and can be mounted by multiple pod writers simultaneously. At Run:ai, an NFS-type data source is an abstraction that is mapped directly to a Kubernetes NFS volume. This integration allows multiple workloads under various scopes to mount and present the NFS data source (see the sketch following the steps below). + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Set the data origin + * Enter the __NFS server__ (host name or host IP) + * Enter the __NFS path__ +6. Set the data target location + * __Container path__ +7. Optional: Restrictions + * __Prevent data modification__ - When enabled, the data will be mounted with read-only permissions. +8. Click __CREATE DATA SOURCE__
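+To illustrate the mapping, a sketch of the Kubernetes NFS volume that such a data source roughly corresponds to in a workload's pod spec; the server, path, and mount values are placeholders for the fields above:
+
+```yaml
+volumes:
+  - name: nfs-data
+    nfs:
+      server: nfs.example.com   # the "NFS server" field
+      path: /exports/datasets   # the "NFS path" field
+      readOnly: true            # "Prevent data modification"
+containers:
+  - name: workload
+    image: python:3.11
+    volumeMounts:
+      - name: nfs-data
+        mountPath: /data        # the "Container path" field
+```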
### PVC + +A Persistent Volume Claim ([PVC](https://kubernetes.io/docs/concepts/storage/persistent-volumes/){target=_blank}) is a Kubernetes concept used for managing storage in the cluster, which can be provisioned by an administrator or dynamically by Kubernetes using a StorageClass. PVCs allow users to request specific sizes and access modes (read/write once, read-only many). At Run:ai, a PVC-type data source is an abstraction that is mapped directly to a Kubernetes PVC. By leveraging PVCs as data sources, Run:ai enables access to persistent storage for workloads, ensuring that data remains consistent and accessible across various scopes and workloads, beyond the lifecycle of individual pods. This ensures that data generated by AI workloads is not lost when pods are rescheduled or updated, providing a seamless and efficient storage solution that can handle the large datasets typically associated with AI projects. + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Select PVC: + * __Existing PVC__ - This option is relevant when the purpose is to create a PVC-type data source based on an existing PVC in the cluster. + * Select a PVC from the list (the list is empty if no existing PVCs were [created in advance](../../../admin/config/create-k8s-assets-in-advance.md#creating-pvcs-in-advance)) + * __New PVC__ - creates a new PVC in the cluster. New PVCs are not added to the Existing PVCs list. When creating a PVC-type data source and selecting the 'New PVC' option, the PVC is immediately created in the cluster (even if no workload has requested this PVC). +6. Select the __storage class__ + * __None__ - Proceed without defining a storage class + * __Custom storage class__ - This option applies when selecting a storage class based on existing storage classes. To add new storage classes to the storage class list, and for additional information, see [Kubernetes storage classes](../../../admin/config/shared-storage.md#kubernetes-storage-classes) +7. Select the __access mode(s)__ (multiple modes can be selected) + * __Read-write by one node__ - The volume can be mounted as read-write by a single node. + * __Read-only by many nodes__ - The volume can be mounted as read-only by many nodes. + * __Read-write by many nodes__ - The volume can be mounted as read-write by many nodes. +8. Set the __claim size__ and its __units__ +9. Select the __volume mode__ +10. Set the data target location + * __Container path__ +11. Optional: __Prevent data modification__ - When enabled, the data will be mounted with read-only permission. +12. Click __CREATE DATA SOURCE__ + +After the data source is created, check its status to monitor its proper creation across the selected scope.
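+For reference, a sketch of the kind of Kubernetes PVC these form fields describe; the claim name, storage class, and size below are illustrative:
+
+```yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: dataset-claim          # illustrative claim name
+spec:
+  storageClassName: fast-ssd   # the "storage class" selection (illustrative)
+  accessModes:
+    - ReadWriteMany            # "Read-write by many nodes"
+  volumeMode: Filesystem       # the "volume mode" selection (Filesystem or Block)
+  resources:
+    requests:
+      storage: 50Gi            # the "claim size" and its "units"
+```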
### S3 Bucket + +The [S3 bucket](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-s3-bucket.html){target=_blank} data source enables the mapping of a remote S3 bucket into the workload's file system. Similar to a PVC, this mapping remains accessible across different workload executions, extending beyond the lifecycle of individual pods. However, unlike PVCs, data stored in an S3 bucket resides remotely, which may lead to decreased performance during the execution of heavy machine learning workloads. As part of the Run:ai connection to the S3 bucket, you can create [credentials](./credentials.md) in order to access and map private buckets. + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Set the data origin + * Set the __S3 service URL__ + * Select the __credentials__ + * __None__ - for public buckets + * __Credential names__ - This option is relevant for private buckets based on existing credentials that were created for the scope. To add new credentials to the credentials list, and for additional information, check the [Credentials](./credentials.md) article. + * Enter the __bucket name__ +6. Set the data target location + * __Container path__ +7. Click __CREATE DATA SOURCE__ + +After a private data source is created, check its status to monitor its proper creation across the selected scope. + +### Git + +A Git-type data source is a Run:ai integration that enables code to be copied from a Git branch into a dedicated folder in the container. It is mainly used to provide the workload with the latest code repository. As part of the integration with Git, in order to access private repositories, you can add predefined credentials to the data source mapping. + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Set the data origin + * Set the __Repository URL__ + * Set the __Revision__ (branch, tag, or hash) - If left empty, the 'HEAD' (latest) revision is used + * Select the __credentials__ + * __None__ - for public repositories + * __Credential names__ - This option applies to private repositories based on existing credentials that were created for the scope. To add new credentials to the credentials list, and for additional information, check the [Credentials](./credentials.md) article. +6. Set the data target location + * __Container path__ +7. Click __CREATE DATA SOURCE__ + +After a private data source is created, check its status to monitor its proper creation across the selected scope. + +### Host path + +A [Host path](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath){target=_blank} volume is a Kubernetes concept that enables mounting a host path file or a directory on the workload's file system. Like a PVC, the host path volume's data persists across workloads under various scopes. It also enables data serving from the hosting node. + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Set the data origin + * __Host path__ +6. Set the data target location + * __Container path__ +7. Optional: __Prevent data modification__ - When enabled, the data will be mounted with read-only permissions. +8. Click __CREATE DATA SOURCE__ + +### ConfigMap + +A [ConfigMap](https://kubernetes.io/docs/concepts/configuration/configmap/){target=_blank} data source is a Run:ai abstraction for the Kubernetes ConfigMap concept. The ConfigMap is used mainly for storage that can be mounted on the workload container for non-confidential data. It is usually represented in key-value pairs (e.g., environment variables, command-line arguments, etc.). It allows you to decouple environment-specific system configurations from your container images, so that your applications are easily portable. ConfigMaps must be created on the cluster prior to being used within the Run:ai system. + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Set the data origin + * Select the __ConfigMap name__ (The list is empty if no existing ConfigMaps were [created in advance](../../../admin/config/create-k8s-assets-in-advance.md#creating-configmaps-in-advance)). +6. Set the data target location + * __Container path__ +7. Click __CREATE DATA SOURCE__
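+As an illustration, a minimal ConfigMap created in advance that such a data source could select; the name, namespace, and keys are placeholders (for scope labeling, see [Creating Kubernetes Assets in Advance](../../../admin/config/create-k8s-assets-in-advance.md)):
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: training-config   # illustrative name, shown in the ConfigMap list
+  namespace: runai        # or the project's namespace, depending on the scope
+data:
+  LOG_LEVEL: "info"       # plain key-value configuration entries
+  batch_size: "64"
+```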
### Secret + +A secret-type data source enables the mapping of a credential into the workload's file system. [Credentials](./credentials.md) are a workload asset that simplify the complexities of Kubernetes [Secrets](https://kubernetes.io/docs/concepts/configuration/secret/){target=_blank}. The credentials mask sensitive access information, such as passwords, tokens, and access keys, which are necessary for gaining access to various resources. + +1. Select the __cluster__ under which to create this data source +2. Select a [scope](./overview.md#asset-scope) +3. Enter a __name__ for the data source. The name must be unique. +4. Optional: Provide a __description__ of the data source +5. Set the data origin + * Select the __credentials__ + To add new credentials, and for additional information, check the [Credentials](./credentials.md) article. +6. Set the data target location + * __Container path__ +7. Click __CREATE DATA SOURCE__ + +After the data source is created, check its status to monitor its proper creation across the selected scope.
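+Under the hood, this corresponds roughly to mounting a Kubernetes Secret as a volume. A minimal pod-spec sketch, with illustrative names:
+
+```yaml
+volumes:
+  - name: credentials
+    secret:
+      secretName: my-access-keys   # illustrative Kubernetes Secret behind the credential
+containers:
+  - name: workload
+    image: python:3.11
+    volumeMounts:
+      - name: credentials
+        mountPath: /etc/creds      # the "Container path" field
+        readOnly: true
+```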
+!!! Note + It is also possible to add data sources directly when creating a specific workspace, training, or inference workload. + +## Editing a data source + +To edit a data source: + +1. Select the data source from the table +2. Click __Rename__ to provide it with a new name +3. Click __Copy & Edit__ to make any changes to the data source + +## Deleting a data source + +To delete a data source: + +1. Select the data source you want to delete +2. Click __DELETE__ +3. Confirm you want to delete the data source !!! Note - The data can be limited to read-only permission regardless of any other user privileges. - -### Create a ConfigMap data source - -* A Run:ai project scope which is assigned to that item and all its subsidiaries. - -ConfigMaps must be created on the cluster before being used within Run:ai. When created, the ConfigMap must have a label of `run.ai/resource: `. The resource name specified must be unique to that created resource. - -* A data source name. -* A data mount consisting of: - - * A ConfigMap name—select from the drop down. - * A target location—the path to the container (the workload creator is able to override this when submitting the workload). - -### Create a Secret as data source - -* A Run:ai project scope which is assigned to that item and all its subsidiaries. -* A *Credentials*. To create a new *Credentials*, see [Configuring Credentials](credentials.md) - -* A data source name and description. -* A data mount consisting of: - - * A *Credentials*—select from the drop down. - * A target location—the path to the container (the workload creator is able to override this when submitting the workload). - -### Data sources table - -The *Data sources* table contains a column for the status of the data source. The following statuses are supported: - -| Status | Description | -| -- | -- | -| **No issues found** | No issues were found when propagating the data source to the *PROJECTS*. | -| **Issues found** | Failed to create the data source for some or all of the *PROJECTS*. | -| **Issues found** | Failed to access the cluster. | -| **Deleting** | The data source is being removed. | - -!!! Note - - * The *Status* column in the table shows statuses based on your level of permissions. For example, a user that has create permissions for the scope, will see statuses that are calculated from the entire scope, while users who have only view and use permissions, will only be able to see statuses from a subset of the scope (assets that they have permissions to). - * The status of "-" indicates that there is no status because this asset is not cluster-syncing. -You can download the Data Sources table to a CSV file. Downloading a CSV can provide a snapshot history of your Data Sources over the course of time, and help with compliance tracking. All the columns that are selected (displayed) in the table will be downloaded to the file. -Use the *Cluster* filter at the top of the table to see data sources that are assigned to specific clusters. !!! Note - The cluster filter will be in the top bar when there are clusters that are installed with version 2.16 or lower. -Use the *Add filter* to add additional filters to the table. + It is not possible to delete a data source being used by an existing workload or template. -To download the Data Sources table to a CSV: +## Using API -1. Open *Data Sources*. -2. From the *Columns* icon, select the columns you would like to have displayed in the table. -3. Click on the ellipsis labeled *More*, and download the CSV. +To view the available actions, go to the [Data sources](https://app.run.ai/api/docs#tag/Datasources) API reference. \ No newline at end of file diff --git a/docs/platform-admin/workloads/assets/img/8-ds-types.png b/docs/platform-admin/workloads/assets/img/8-ds-types.png deleted file mode 100644 index c6dbabde6a..0000000000 Binary files a/docs/platform-admin/workloads/assets/img/8-ds-types.png and /dev/null differ diff --git a/docs/platform-admin/workloads/assets/img/data-source-table.png b/docs/platform-admin/workloads/assets/img/data-source-table.png new file mode 100644 index 0000000000..3cd473bddc Binary files /dev/null and b/docs/platform-admin/workloads/assets/img/data-source-table.png differ diff --git a/graveyard/create-compute.md b/graveyard/create-compute.md index 29bb75a6cf..ea2c24d6be 100644 --- a/graveyard/create-compute.md +++ b/graveyard/create-compute.md @@ -20,12 +20,12 @@ GPU resources can be expressed in various ways: 1. Request GPU devices: this option supports whole GPUs (for example, 1 GPU, 2 GPUs, 3 GPUs) or a fraction of GPU (for example, 0.1 GPU, 0.5 GPU, 0.93 GPU, etc.) 2. Request partial memory of a single GPU device: this option allows you to explicitly state the amount of memory needed (for example, 5GB GPU RAM). -3. Request a MIG profile: this option will dynamically provision the requested [MIG profile](../../../scheduling/fractions.md#dynamic-mig) (if the relevant hardware exists). +3. Request a MIG profile: this option will dynamically provision the requested MIG profile (if the relevant hardware exists). !!! Note * Selecting a GPU fraction (for example, 0.5 GPU) in a heterogeneous cluster may result in inconsistent results: for example, half of a V100 16GB GPU's memory is different from half of an A100 with 40GB. In such scenarios, requesting specific GPU memory is a better strategy. - * When selecting partial memory of a single GPU device, if NVIDIA MIG is enabled on a node, then the memory can be provided as a MIG profile.
For more information see [Dynamic MIG](../../../scheduling/fractions.md#dynamic-mig). + * When selecting partial memory of a single GPU device, if NVIDIA MIG is enabled on a node, then the memory can be provided as a MIG profile. * If GPUs are not requested, they will not be allocated even if resources are available. In that case, the project's GPU quota will not be affected. ## Set CPU resources diff --git a/docs/platform-admin/workloads/assets/existing-PVC.md b/graveyard/existing-PVC.md similarity index 100% rename from docs/platform-admin/workloads/assets/existing-PVC.md rename to graveyard/existing-PVC.md diff --git a/graveyard/submitting-workloads.md b/graveyard/submitting-workloads.md index 3327e55365..9aeb59226f 100644 --- a/graveyard/submitting-workloads.md +++ b/graveyard/submitting-workloads.md @@ -34,7 +34,7 @@ To submit a workload using the UI: 5. Enter the *Container path* for volume target location. 6. Select a *Volume persistency*. - 7. In the *Data sources* pane, select a data source. If you need a new data source, press *add a new data source*. For more information, see [Creating a new data source](../../Researcher/workloads/assets/datasources.md#create-a-new-data-source) When complete press, *Create Data Source*. + 7. In the *Data sources* pane, select a data source. If you need a new data source, press *add a new data source*. For more information, see [Creating a new data source](../../Researcher/workloads/assets/datasources.md#adding-a-new-data-source). When complete, press *Create Data Source*. !!! Note * Data sources that have private credentials, which have the status of *issues found*, will be greyed out. @@ -77,7 +77,7 @@ To submit a workload using the UI: 5. Enter the *Container path* for volume target location. 6. Select a *Volume persistency*. Choose *Persistent* or *Ephemeral*. - 8. (Optional) In the *Data sources* pane, select a data source. If you need a new data source, press *add a new data source*. For more information, see [Creating a new data source](./assets/datasources.md#create-a-new-data-source) When complete press, *Create Data Source*. + 8. (Optional) In the *Data sources* pane, select a data source. If you need a new data source, press *add a new data source*. For more information, see [Creating a new data source](./assets/datasources.md#adding-a-new-data-source). When complete, press *Create Data Source*. !!! Note * Data sources that have private credentials, which have the status of *issues found*, will be greyed out. @@ -108,7 +108,7 @@ To submit a workload using the UI: 5. Enter the *Container path* for volume target location. 6. Select a *Volume persistency*. Choose *Persistent* or *Ephemeral*. - 4. (Optional) In the *Data sources* pane, select a data source. If you need a new data source, press *add a new data source*. For more information, see [Creating a new data source](assets/datasources.md#create-a-new-data-source) When complete press, *Create Data Source*. + 4. (Optional) In the *Data sources* pane, select a data source. If you need a new data source, press *add a new data source*. For more information, see [Creating a new data source](assets/datasources.md#adding-a-new-data-source). When complete, press *Create Data Source*. !!! Note * Data sources that have private credentials, which have the status of *issues found*, will be greyed out.
diff --git a/mkdocs.yml b/mkdocs.yml index 67b757b52c..c030d06272 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -140,7 +140,7 @@ plugins: 'Researcher/user-interface/workspaces/blocks/environments.md' : 'Researcher/workloads/assets/environments.md' 'Researcher/user-interface/workspaces/blocks/compute.md' : 'Researcher/workloads/assets/compute.md' 'Researcher/user-interface/workspaces/blocks/datasources.md' : 'Researcher/workloads/assets/datasources.md' - 'Researcher/user-interface/workspaces/blocks/existing-PVC.md' : 'Researcher/workloads/assets/existing-PVC.md' + 'Researcher/user-interface/workspaces/blocks/existing-PVC.md' : 'Researcher/workloads/assets/datasources.md' 'Researcher/user-interface/workspaces/create/create-env.md' : 'Researcher/workloads/assets/environments.md' 'Researcher/user-interface/workspaces/create/create-compute.md' : 'Researcher/workloads/assets/compute.md' 'Researcher/user-interface/workspaces/create/create-ds.md' : 'Researcher/workloads/assets/datasources.md' @@ -176,6 +176,7 @@ plugins: 'Administrator/integration/ray.md' : 'platform-admin/workloads/integrations.md' 'platform-admin/workloads/assets/secrets.md' : 'Researcher/best-practices/secrets-as-env-var-in-cli.md' 'admin/runai-setup/access-control/rbac.md' : 'admin/authentication/roles.md' + 'platform-admin/workloads/assets/existing-PVC.md' : 'platform-admin/workloads/assets/datasources.md' nav: - Home: - 'Overview': 'home/overview.md' @@ -255,6 +256,8 @@ nav: - 'Group Nodes' : 'admin/config/limit-to-node-group.md' - 'Workload Deletion Protection' : 'admin/config/workload-ownership-protection.md' - 'Advanced Cluster Configuration' : 'admin/config/advanced-cluster-config.md' + - 'Mark Assets for Run:ai' : 'admin/config/create-k8s-assets-in-advance.md' + - 'Maintenance' : - 'Node Maintenance' : 'admin/maintenance/node-downtime.md' - 'System Monitoring' : 'admin/maintenance/alert-monitoring.md' @@ -302,9 +305,7 @@ nav: - 'Overview' : 'platform-admin/workloads/assets/overview.md' - 'Environments' : 'platform-admin/workloads/assets/environments.md' - 'Compute Resources': 'platform-admin/workloads/assets/compute.md' - - 'Data Sources' : - - 'Overview' : 'platform-admin/workloads/assets/datasources.md' - - 'PVC Data Source' : 'platform-admin/workloads/assets/existing-PVC.md' + - 'Data Sources' : 'platform-admin/workloads/assets/datasources.md' - 'Templates': 'platform-admin/workloads/assets/templates.md' - 'Credentials' : 'platform-admin/workloads/assets/credentials.md' - 'Data Volumes': 'platform-admin/workloads/assets/data-volumes.md' @@ -347,9 +348,7 @@ nav: - 'Overview' : 'Researcher/workloads/assets/overview.md' - 'Environments' : 'Researcher/workloads/assets/environments.md' - 'Compute Resources': 'Researcher/workloads/assets/compute.md' - - 'Data Sources' : - - 'Overview' : 'Researcher/workloads/assets/datasources.md' - - 'PVC Data Source' : 'Researcher/workloads/assets/existing-PVC.md' + - 'Data Sources' : 'Researcher/workloads/assets/datasources.md' - 'Templates': 'Researcher/workloads/assets/templates.md' - 'Credentials' : 'Researcher/workloads/assets/credentials.md' - 'Data Volumes': 'Researcher/workloads/assets/data-volumes.md'