Add instructions for manually configuring an Azure Batch pool #325
@adamrtalbot These changes are great. Nothing from my notes is missing here, except for the IPs used by Seqera Platform. I had to add those IPs to the Storage Account's firewall.
While reading, I found an explanation for every setting, and I really appreciate that.
Important: during our call, when setting up the compute environment in Seqera, you were able to tell me "if that field is not autocompleted, it means that something is off in X or Y". Do you think there is space in this documentation for this knowledge? Or maybe in the setup page itself?
For example, if the container names do not pop up, it means that Seqera has no (network) access to the Storage Account. Or that the correct roles are not set for the provided service principal and managed identity. It'd be great if the page informed the user.
Same goes for the pools. It is expected that the names pop up. Correct?
#### Entra service principal
#### Entra service principal and managed identity

If using Entra for authentication, you must also create a service principal and managed identity. Seqera Platform uses the Service Principal to authenticate to Azure Batch and storage. It submits a Nextflow task as the head process to run Nextflow, which authenticates to Azure Batch and storage using the Managed Identity attached to the node pool.
If using Entra for authentication, you must also create a service principal and managed identity. Seqera Platform uses the Service Principal to authenticate to Azure Batch and storage. It submits a Nextflow task as the head process to run Nextflow, which authenticates to Azure Batch and storage using the Managed Identity attached to the node pool.
If using Entra for authentication, you must create a service principal and a managed identity. Seqera Platform uses the Service Principal to authenticate to Azure Batch and storage. It submits a Nextflow task as the head process to run Nextflow, which authenticates to Azure Batch and storage using the Managed Identity attached to the node pool.
If "storage" refers to "azure storage account", it might be worth replacing all occurrences with "Azure Storage Account".
Some thoughts and observations on a quick read through. Will make time to actually try the steps later.
#### Entra service principal
#### Entra service principal and managed identity

If using Entra for authentication, you must create a service principal and managed identity. Seqera Platform uses the Service Principal to authenticate to Azure Batch and Azure Storage. It submits a Nextflow task as the head process to run Nextflow, which authenticates to Azure Batch and storage using the Managed Identity attached to the node pool.
❓ Should there be links out to MSFT-managed docs explaining SPs? Oh I see, it's described in a section above. Point still holds true (just for the earlier description). Nevermind, I see you have it below.
⛏️ I'd probably frontload the explanations, but this works too.
❓ SP type qualification necessary? (i.e. user-managed). Nevermind, I see this is handled below.
❓ Do our docs adequately cover how to attach an SP to an Azure Batch pool (I haven't checked)? If yes, does this cover both Manual and Forge flows (assuming both are applicable)?
If using Entra for authentication, you must create a service principal and managed identity. Seqera Platform uses the Service Principal to authenticate to Azure Batch and Azure Storage. It submits a Nextflow task as the head process to run Nextflow, which authenticates to Azure Batch and storage using the Managed Identity attached to the node pool.

Therefore, you must create both an Entra service principal and a managed identity. You add the service principal to your Seqera Platform credentials and attach the managed identity to your Azure Batch node pool which will run Nextflow.
If I understand this correctly, two different MSFT identities need to be used:
- Service Principal -- which can accommodate the fact that Tower lives outside the Azure network and is likely calling across the public internet.
- Managed Identity -- which can be used since the pool lives inside the account and thus has a greater assurance level.
Correct?
Yes, explained below:
When you use a manually configured compute environment with a managed identity attached to the Azure Batch Pool, Nextflow can use this managed identity for authentication. However, Platform still needs to use access keys or an Entra service principal to submit the initial task to Azure Batch to run Nextflow, which will then proceed with the managed identity for subsequent authentication.
In general, we recommend using the E family of machines for bioinformatics workloads since these are cost effective, widely available and sufficiently fast.

1. **vCPUs**: The number of vCPUs the machine has. This is the main factor in determining the speed of the machine.
🤔 Personally I like this level of detail, but there is a risk of it getting out of date. Docs probably will need to set themselves a recurring task to go check accuracy every so often.
Because the user must decide on the specific machine type, stronger guidance is required than for something like AWS.
I've tried to keep it generic (families, features) so that it shouldn't change much, but we will need to add new categories as they are introduced.
This section is for users with a pre-configured Azure Batch pool. This requires an existing Azure Batch account with an existing pool.
It is possible to set up Seqera Platform to use a pre-existing Azure Batch pool. This allows the use of more advanced Azure Batch features, such as custom VM images and private networking.
Should we link out to MSFT pages that explain the benefits of why you might want these features?
platform_versioned_docs/version-24.2/compute-envs/azure-batch.mdx
@@ -250,15 +304,83 @@ Create a Batch Forge Azure Batch compute environment:
See [Launch pipelines](../launch/launchpad.mdx) to start executing workflows in your Azure Batch compute environment.
:::

## Manual
### Manual
❓ I don't see Forge here. Is that an omission by accident, or can the SP / Managed Identity config only occur via Manual creation for now?
Only manual; the API for adding a managed identity to a compute pool doesn't work.
Signed-off-by: Adam Talbot <[email protected]>
Signed-off-by: Adam Talbot <[email protected]>
```
// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
```
AutoScalingFormulaEvaluationError: The specified auto-scaling formula has evaluation error
Message: Line 3, Col 43: Undefined symbol: interval
Result: $TargetDedicatedNodes=0;$TargetLowPriorityNodes=0;$NodeDeallocationOption=requeue
PropertyName: formula
PropertyPath: properties.scaleSettings.autoScale.formula
Something missing?
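The `Undefined symbol: interval` message points at a plain variable that is read before any statement assigns it. As a rough illustration only, the Python sketch below scans a formula for that pattern. The helper `undefined_symbols` and the `KNOWN_NAMES` list are hypothetical and cover just the names used in this thread; Azure Batch performs the authoritative evaluation when the pool is created.

```python
import re

# Names assumed to be supplied by the autoscale formula language itself.
# Partial list (an assumption): it covers only what the formulas in this thread use.
KNOWN_NAMES = {
    "max", "min", "avg", "time",
    "TimeInterval_Minute", "taskcompletion", "requeue",
}

# A plain name: not preceded by "$" (formula variable), "." (method call),
# or another word character.
PLAIN_NAME = re.compile(r"(?<![\w$.])[A-Za-z_]\w*")
ASSIGNMENT = re.compile(r"\s*(\$?[A-Za-z_]\w*)\s*=(?!=)(.*)", re.S)


def undefined_symbols(formula):
    """Return plain names that are read before any statement assigns them."""
    code = re.sub(r"//[^\n]*", "", formula)   # drop // comments
    code = re.sub(r'"[^"]*"', '""', code)     # blank out string literals

    defined = set(KNOWN_NAMES)
    missing = []
    for statement in code.split(";"):
        match = ASSIGNMENT.match(statement)
        target, expression = match.groups() if match else (None, statement)
        for name in PLAIN_NAME.findall(expression):
            if name not in defined and name not in missing:
                missing.append(name)
        if target is not None:
            defined.add(target)
    return missing


BROKEN = """
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max($PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
"""
FIXED = "interval = TimeInterval_Minute * 5;\n" + BROKEN

print(undefined_symbols(BROKEN))  # ['interval']
print(undefined_symbols(FIXED))   # []
```

The check is deliberately narrow: it ignores `$`-prefixed variables and method names, so it only catches the kind of omission reported here.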
// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max( $PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, 8));

// For first interval deploy 1 node, for other intervals scale up/down as per tasks.
$TargetDedicatedNodes = targetPoolSize;
$NodeDeallocationOption = taskcompletion;
Hmmm I tried to simplify it. Try this:
// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max( $PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, 8));
// For first interval deploy 1 node, for other intervals scale up/down as per tasks.
$TargetDedicatedNodes = targetPoolSize;
$NodeDeallocationOption = taskcompletion;
// Get pool lifetime since creation.
lifespan = time() - time("2024-10-30T00:00:00.880011Z");
interval = TimeInterval_Minute * 5;
// Compute the target nodes based on pending tasks.
// $PendingTasks == The sum of $ActiveTasks and $RunningTasks
$samples = $PendingTasks.GetSamplePercent(interval);
$tasks = $samples < 70 ? max(0, $PendingTasks.GetSample(1)) : max( $PendingTasks.GetSample(1), avg($PendingTasks.GetSample(interval)));
$targetVMs = $tasks > 0 ? $tasks : max(0, $TargetDedicatedNodes/2);
targetPoolSize = max(0, min($targetVMs, 8));
// For first interval deploy 1 node, for other intervals scale up/down as per tasks.
$TargetLowPriorityNodes = lifespan < interval ? 1 : targetPoolSize;
$NodeDeallocationOption = taskcompletion;
With this I was able to create the pool.
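For readers who want to reason about what the working formula asks for, here is a minimal Python model of its arithmetic. It is a sketch under assumptions: the function name and parameters are hypothetical, the sampling calls (`GetSamplePercent`, `GetSample`, `avg`) are reduced to plain numbers passed in by the caller, and Azure Batch remains the only place the real formula is evaluated.

```python
def target_low_priority_nodes(sample_percent, latest_pending, average_pending,
                              current_dedicated_target, lifespan_minutes,
                              interval_minutes=5, max_nodes=8):
    """Mirror the formula's steps with plain numbers (illustrative only)."""
    # $tasks: with fewer than 70% of samples available, use only the latest
    # sample; otherwise take the larger of the latest sample and the average.
    if sample_percent < 70:
        tasks = max(0, latest_pending)
    else:
        tasks = max(latest_pending, average_pending)
    # $targetVMs: follow pending tasks, otherwise halve the current target.
    target_vms = tasks if tasks > 0 else max(0, current_dedicated_target / 2)
    # targetPoolSize: never negative, capped at max_nodes (8 in the formula).
    pool_size = max(0, min(target_vms, max_nodes))
    # During the first interval after pool creation, request exactly one node.
    return 1 if lifespan_minutes < interval_minutes else pool_size


print(target_low_priority_nodes(100, 20, 12.0, 0, 60))  # 8
print(target_low_priority_nodes(100, 0, 0.0, 4, 60))    # 2.0
print(target_low_priority_nodes(0, 5, 0.0, 0, 2))       # 1
```

The three calls cover a busy pool hitting the cap, an idle pool scaling down by half, and the first interval after creation.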
Documentation for adding an Azure Batch pool manually so it is compatible with Seqera and Nextflow.