Add instructions for manually configuring an Azure Batch pool #325

160 changes: 135 additions & 25 deletions platform_versioned_docs/version-24.2/compute-envs/azure-batch.mdx
To create an access key:
- Add the **Batch account** and **Blob Storage account** names and access keys to the relevant fields.
1. Delete the copied keys from their temporary location after they have been added to a credential in Platform.

#### Entra service principal and managed identity

If using Entra for authentication, you must create a service principal and managed identity. Seqera Platform uses the service principal to authenticate to Azure Batch and Azure Storage. It submits a Nextflow task as the head process to run Nextflow, which authenticates to Azure Batch and storage using the managed identity attached to the node pool.

Therefore, you must create both an Entra service principal and a managed identity. You add the service principal to your Seqera Platform credentials and attach the managed identity to the Azure Batch node pool that will run Nextflow.

:::info
Batch Forge compute environments must use access keys for authentication. Service principals are only supported in manual compute environments.

The use of Entra service principals in manual compute environments requires the use of a [managed identity](#managed-identity).
:::

##### Service principal

See [Create a service principal][az-create-sp] for more details.

To create an Entra service principal:

- Complete the remaining fields: **Batch account name**, **Blob Storage account name**, **Tenant ID** (Application (client) ID in Azure), **Client ID** (Client secret ID in Azure), **Client secret** (Client secret value in Azure).
1. Delete the ID and secret values from their temporary location after they have been added to a credential in Platform.

##### Managed identity

:::info
To use managed identities, Platform requires Nextflow version 24.06.0-edge or later.
:::

Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but must run on Azure infrastructure.

When you use a manually configured compute environment with a managed identity attached to the Azure Batch pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys or an Entra service principal to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication.

1. In Azure, create a user-assigned managed identity. See [Manage user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities) for detailed steps. After creation, record the client ID of the managed identity.
1. The user-assigned managed identity must have the necessary access roles for Nextflow. See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information.
1. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your Batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information.
1. When you set up the Platform compute environment, select the Azure Batch pool by name and enter the managed identity client ID in the field provided.

When you submit a pipeline to this compute environment, Nextflow will authenticate using the managed identity associated with the Azure Batch node it runs on, rather than relying on access keys.
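For reference, a minimal `nextflow.config` fragment for managed identity authentication might look like the following sketch. The placeholder values are assumptions you must replace with your own, and the option names should be verified against the Nextflow Azure documentation for your Nextflow version; Platform normally sets these for you when you complete the compute environment form.

```groovy
// Illustrative sketch only -- verify option names against the Nextflow
// Azure documentation; placeholder values are assumptions.
azure {
    managedIdentity {
        // Client ID of the user-assigned managed identity attached to the pool
        clientId = '<managed-identity-client-id>'
    }
    batch {
        accountName = '<batch-account-name>'
        location    = '<region>'
    }
    storage {
        accountName = '<storage-account-name>'
    }
}
```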

## Add Platform compute environment

There are two ways to create an Azure Batch compute environment in Seqera Platform:

- [**Batch Forge**](#tower-forge): Automatically creates Azure Batch resources.
- [**Manual**](#manual): For using existing Azure Batch resources.

### VM size considerations

Azure Batch requires you to select an appropriate VM size for your compute environment. There are a number of considerations when making this choice; see the Azure documentation on [virtual machine sizes][az-vm-sizes] for more information.

1. **Family**: The first letter of the VM size name indicates the family of the machine. For example, `Standard_E16d_v5` is a member of the E family.
- *A*: Economical, low-power machines.
- *B*: Burstable machines that use credits for cost allocation.
- *D*: General-purpose machines suitable for most applications.
- *DC*: D machines with additional confidential compute capabilities.
- *E*: The same as D but with more memory. These are generally the best machines for bioinformatics workloads.
- *EC*: The same as E but with additional confidential compute capabilities.
- *F*: Compute-optimized machines which come with a faster CPU compared to D-series machines.
- *M*: Memory-optimized machines which come with extremely large and fast memory layers, typically more than is needed for bioinformatics workloads.
- *L*: Storage-optimized machines which come with large locally attached NVMe storage drives. Note that these need to be configured before use with Azure Batch.
- *N*: Accelerated-computing machines which come with FPGAs, GPUs, or custom ASICs.
- *H*: High-performance machines which come with the fastest processors and memory.

In general, we recommend using the E family of machines for bioinformatics workloads since these are cost-effective, widely available, and sufficiently fast.

1. **vCPUs**: The number of vCPUs the machine has. This is the main factor in determining the speed of the machine.

1. **Features**: The additional features the machine has. For example, some machines come with a local SSD.

- *d*: The machine has a local storage disk. Azure Batch is able to use this disk automatically instead of the operating system disk.
- *s*: The VM size supports a [premium storage account][az-premium-storage].
- *a*: Uses AMD chips instead of Intel.
- *p*: Uses ARM-based chips, such as the Azure Cobalt chips.
- *l*: Reduced memory with a large cost reduction.
1. **Version**: The version of the VM size. This is the generation of the machine. Typically, more recent is better, but availability can vary between regions.
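As an illustration of the naming scheme above, the following short Python helper (hypothetical, not part of Platform or Nextflow) splits a VM size name into its family, vCPU count, feature letters, and version:

```python
import re

def parse_vm_size(name: str) -> dict:
    """Split an Azure VM size name like 'Standard_E16ds_v5' into
    family, vCPU count, feature letters, and version."""
    m = re.match(r"Standard_([A-Z]+)(\d+)([a-z]*)(?:_(v\d+))?$", name)
    if not m:
        raise ValueError(f"Unrecognised VM size name: {name}")
    family, vcpus, features, version = m.groups()
    return {
        "family": family,            # e.g. 'E' -> memory-optimized
        "vcpus": int(vcpus),         # main speed factor
        "features": list(features),  # e.g. ['d', 's'] -> local disk, premium storage
        "version": version or "v1",  # generation of the VM size
    }

print(parse_vm_size("Standard_E16ds_v5"))
# → {'family': 'E', 'vcpus': 16, 'features': ['d', 's'], 'version': 'v5'}
```

Note this sketch does not handle every Azure naming wrinkle (for example, constrained-vCPU sizes such as `Standard_E16-8ds_v5`); it only makes the family/vCPU/feature/version breakdown concrete.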

In the Azure Portal on the page for your Azure Batch account, be sure to request appropriate quota for your desired VM size. See the [Azure Batch service quotas and limits][az-batch-quotas] documentation for more details.

### Batch Forge

:::caution
Your Seqera compute environment uses resources that you may be charged for in your Azure account. See [Cloud costs](../monitoring/cloud-costs.mdx) for guidelines to manage cloud resources effectively and prevent unexpected costs.
:::

Create a Batch Forge Azure Batch compute environment:

:::info
See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.
:::

### Manual

It is possible to set up Seqera Platform to use a pre-existing Azure Batch pool. This allows the use of more advanced Azure Batch features, such as custom VM images and private networking.

:::caution
Your Seqera compute environment uses resources that you may be charged for in your Azure account. See [Cloud costs](../monitoring/cloud-costs.mdx) for guidelines to manage cloud resources effectively and prevent unexpected costs.
:::

**Create a Nextflow-compatible Azure Batch pool**

Use the default settings unless otherwise specified below.

1. **Account**: You must have an existing Azure Batch account. Ideally, you should already have verified that you can run an Azure Batch task within this account. Any type of account is compatible.
1. **Quota**: You must check you have sufficient quota for the number of pools, jobs, and vCPUs per series. See [Azure Batch service quotas and limits][az-batch-quotas] for more information.
1. On the Azure Batch page of the Azure Portal, select **Pools** and then **+ Add**.
1. **Name**: Enter a pool ID and display name. The ID is the name you will refer to in Seqera Platform and/or Nextflow.
1. **Identity**: Select **User assigned** to use a managed identity for the pool. Select **Add** for the user-assigned managed identity and select the managed identity with the correct permissions for the storage account and Batch account.
1. **Operating System**: You can use any Linux-based image here; however, we recommend a Microsoft Azure Batch provided image. Note that there are two generations of Azure Virtual Machine images, and certain VM series are only available in one generation. See [Azure Virtual Machine series][az-vm-gen] for more information. For default settings, select the following:
- **Publisher**: `microsoft-azure-batch`
- **Offer**: `ubuntu-server-container`
- **Sku**: `20.04 LTS`
- **Security type**: `standard`
1. **OS disk storage account type**: Certain VM series only support a specific storage account type. See [Azure managed disk types][az-disk-type] and [Azure Virtual Machine series][az-vm-gen] for more information. In general, a VM series with the suffix *s* will support *Premium LRS* storage account type, e.g. a `standard_e16ds_v5` will support `Premium_LRS` but a `standard_e16d_v5` will not. Premium LRS will offer the best performance.
1. **OS disk size**: The size of the OS disk in GB. This needs to be sufficient to hold every Docker container the VM will run plus any logging or further files. If you are not using a machine with attached storage, you will need to increase this for task files (see VM type below). Assuming you are using a machine with attached storage, this can be left at the OS default size.
1. **Container configuration**: Container configuration must be turned on. Do this by switching it from **None** to **Custom**. The type is **Docker compatible**, which should be the only available option. This enables the VM to use Docker images and is sufficient on its own, but you can add further options:
- Under **Container image names**, add a list of fully qualified Docker URIs for containers the VM should pull at startup, e.g. `quay.io/seqeralabs/nf-launcher:j17-23.04.2`.
- Under **Container registries**, add any container registries that require additional authentication. Select **Container registries**, then **Add**. Here, you can add a registry username, password, and registry server. If you attached the managed identity earlier, you can select it as an authentication method so you don't have to enter a username and password.
1. **VM size**: This is the size of the VM. See [the section on Azure VM sizes][az-vm-sizes] for more information.
1. **Scale**: Azure node pools can be fixed in size or autoscale based on a formula. We recommend autoscaling so your resources can scale down to zero when not in use. Select **Auto scale** and change the **AutoScale evaluation interval** to 5 minutes; this is the minimum period between evaluations of the autoscale formula. For **Formula**, you can use any valid formula; see the [Azure Batch autoscale documentation][az-batch-autoscale] for more information. The following is the default autoscaling formula, with a maximum of 8 VMs:

```
// Interval between evaluations; assumed to match the 5-minute evaluation interval above.
$interval = TimeInterval_Minute * 5;

// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max( $PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, 8));

// Scale the pool up or down based on pending tasks, capped at 8 nodes.
$TargetDedicatedNodes = targetPoolSize;
$NodeDeallocationOption = taskcompletion;
```
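To make the formula's behaviour concrete, here is a rough Python sketch of the same decision logic (illustrative only; the real formula is evaluated by the Azure Batch service against its own sample data):

```python
def target_nodes(pending_samples, sample_percent, current_dedicated, max_vms=8):
    """Mirror the autoscale formula: scale to the pending-task count when
    enough samples are available, otherwise halve the pool, capped at max_vms."""
    latest = max(0, pending_samples[-1]) if pending_samples else 0
    if sample_percent < 70:
        # Too few samples collected: trust only the most recent reading
        tasks = latest
    else:
        tasks = max(latest, sum(pending_samples) / len(pending_samples))
    target = tasks if tasks > 0 else max(0, current_dedicated // 2)
    return max(0, min(int(target), max_vms))

# 12 pending tasks but a cap of 8 VMs -> pool scales to 8
print(target_nodes([3, 5, 12], sample_percent=100, current_dedicated=2))  # → 8
# No pending tasks -> pool halves from 6 toward zero
print(target_nodes([], sample_percent=0, current_dedicated=6))            # → 3
```

The key property to notice is that an idle pool shrinks by half on each evaluation rather than dropping to zero immediately, which avoids tearing down nodes that may be needed again moments later.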

1. **Start task**: This is the task that will run on each VM when it joins the pool. This can be used to install additional software on the VM. When using Batch Forge, this is used to install `azcopy` for staging files onto and off the node. Select **Enabled** and add the following command line to install `azcopy`:

```shell
bash -c "chmod +x azcopy && mkdir $AZ_BATCH_NODE_SHARED_DIR/bin/ && cp azcopy $AZ_BATCH_NODE_SHARED_DIR/bin/"
```

Select **Resource files**, then select **Http url**. For the **URL**, add `https://nf-xpack.seqera.io/azcopy/linux_amd64_10.8.0/azcopy` and for **File path** enter `azcopy`. Every other setting can be left at its default.

:::note
When not using Fusion, every node **must** have `azcopy` installed.
:::

1. **Task Slots**: Set task slots to the number of vCPUs the machine has, e.g. select `4` for a `Standard_D4_v3` VM size.
1. **Task scheduling policy**: This can be set to `Pack` or `Spread`. `Pack` will attempt to schedule tasks from the same job on the same VM, while `Spread` will attempt to distribute tasks evenly across VMs.
1. **Virtual Network**: If you are using a virtual network, you can select it here. Be sure to select the correct virtual network and subnet. The VMs require:
- Access to container registries (e.g. quay.io, docker.io) to pull containers.
- Access to Azure Storage to copy data using `azcopy`.
- Access to any remote files required by the pipeline, e.g. AWS S3.
- Communication with the head node (running Nextflow) and Seqera Platform to relay logs and information.

Note that overly restrictive networking may prevent pipelines from running successfully.
1. **Mount configuration**: Nextflow *only* supports Azure file shares. Select `Azure Files Share`, then add:
- **Source**: URL in the format `https://${accountName}.file.core.windows.net/${fileShareName}`.
- **Relative mount path**: Path where the file share will be mounted on the VM.
- **Storage account name** and **Storage account key** (managed identity is not supported).
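For reference, the corresponding Nextflow configuration for mounting a file share might look like the following sketch. The share name and paths are hypothetical placeholders, and the option names should be verified against the Nextflow Azure documentation:

```groovy
// Illustrative sketch only -- placeholder values are assumptions.
azure {
    storage {
        accountName = '<storage-account-name>'
        accountKey  = '<storage-account-key>'   // managed identity is not supported here
        fileShares {
            // 'refdata' is a hypothetical share name
            refdata {
                mountPath = '/mnt/refdata'
            }
        }
    }
}
```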

Allow the node pool to start and create a single Azure VM. Monitor the VM to ensure it starts correctly. If any errors occur, check and correct them; you may need to create a new Azure node pool if issues persist.

The following settings can be modified after creating a pool:

- Autoscale formula
- Start task

- Application packages
- Node communication
- Metadata

**Create a manual Seqera Azure Batch compute environment**

1. In a workspace, select **Compute Environments > New Environment**.
1. Enter a descriptive name for this environment, such as _Azure Batch (east-us)_.
Expand Down Expand Up @@ -311,7 +433,7 @@ Create a manual Seqera Azure Batch compute environment:
Configuration settings in this field override the same values in the pipeline repository `nextflow.config` file. See [Nextflow config file](../launch/advanced.mdx#nextflow-config-file) for more information on configuration priority.
:::
:::info
To use managed identities, Platform requires Nextflow version 24.06.0-edge or later.
:::
1. Define custom **Environment Variables** for the **Head Job** and/or **Compute Jobs**.
1. Configure any necessary advanced options:
See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.
:::

### Managed identity

:::info
To use managed identities, Platform requires Nextflow version 24.06.0-edge or later. Add `export NXF_VER=24.06.0-edge` to the **Global Nextflow config** field in advanced options for your compute environment to use this Nextflow version by default (see manual instructions above).
:::

Nextflow can authenticate to Azure services using a managed identity. This method offers enhanced security compared to access keys, but it requires Nextflow to run on Azure infrastructure.

When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys or an Entra service principal to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication.

1. In Azure, create a user-assigned managed identity. See [Manage user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities) for detailed steps. After creation, record the Client ID of the managed identity.
1. The user-assigned managed identity must have the necessary access roles for Nextflow. See [Required role assignments](https://www.nextflow.io/docs/latest/azure.html#required-role-assignments) for more information.
1. Associate the user-assigned managed identity with the Azure Batch Pool. See [Set up managed identity in your Batch pool](https://learn.microsoft.com/en-us/troubleshoot/azure/hpc/batch/use-managed-identities-azure-batch-account-pool#set-up-managed-identity-in-your-batch-pool) for more information.
1. When you set up the Platform compute environment, select the Azure Batch pool by name and enter the managed identity client ID in the specified field as instructed above.

When you submit a pipeline to this compute environment, Nextflow will authenticate using the managed identity associated with the Azure Batch node it runs on, rather than relying on access keys.
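Creating the identity and recording its client ID can also be done with the Azure CLI. This is a sketch; the resource group and identity names are placeholders, these commands require an authenticated Azure CLI session, and role assignments should follow the Nextflow documentation linked above:

```shell
# Sketch: create a user-assigned managed identity (placeholder names).
az identity create --resource-group my-rg --name seqera-pool-identity

# Retrieve the client ID, which is later entered in the Platform
# compute environment's managed identity field.
az identity show --resource-group my-rg --name seqera-pool-identity \
  --query clientId --output tsv
```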

[az-data-residency]: https://azure.microsoft.com/en-gb/explore/global-infrastructure/data-residency/#select-geography
[az-batch-quotas]: https://docs.microsoft.com/en-us/azure/batch/batch-quota-limit#view-batch-quotas
[az-vm-sizes]: https://learn.microsoft.com/en-us/azure/virtual-machines/sizes
Expand All @@ -351,7 +456,12 @@ When you submit a pipeline to this compute environment, Nextflow will authentica
[az-learn-jobs]: https://learn.microsoft.com/en-us/azure/batch/jobs-and-tasks
[az-create-rg]: https://portal.azure.com/#create/Microsoft.ResourceGroup
[az-create-storage]: https://portal.azure.com/#create/Microsoft.StorageAccount-ARM
[az-create-sp]: https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal-portal
[az-premium-storage]: https://learn.microsoft.com/en-us/azure/virtual-machines/premium-storage-performance
[az-vm-gen]: https://learn.microsoft.com/en-us/azure/virtual-machines/generation-2
[az-disk-type]: https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types
[az-batch-autoscale]: https://learn.microsoft.com/en-us/azure/batch/batch-automatic-scaling
[az-file-shares]: https://www.nextflow.io/docs/latest/azure.html#azure-file-shares

[wave-docs]: https://docs.seqera.io/wave
[fusion-docs]: https://docs.seqera.io/fusion