Parallelization of apply groups #1408

vs-odie · 2024-08-28T19:20:55Z

The current implementation of the grouping of reconciler apply groups has a major disadvantage:

Resources from different dependency trees are currently mixed together in the same apply groups, and an apply group is only complete once its slowest resource has been provisioned, so everything in the following groups waits on that one resource.

Let's assume you apply the following applications in a push to the synced Git repository:

Application 1 depends on an S3 bucket and a namespace.
Application 2 depends on a namespace.

This results in the following apply groups:

Group 1: Namespaces for both applications
Group 2: S3 bucket for Application 1 + Deployment of Application 2
Group 3: Deployment of Application 1

If creating the S3 bucket in Group 2 takes 30 seconds but Application 2 takes 5 minutes to start, then Application 2 delays the start of Application 1 by 4 minutes and 30 seconds, because Group 2 completes only when its last resource (here, the Deployment of Application 2) has finished.

If Application 1 then takes, say, 3 minutes to start, the entire reconciliation takes around 8 minutes, even though it could finish in about 5 minutes if the apply groups of the two applications ran in parallel.
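To make the arithmetic explicit, here is a small sketch in Python. The resource names are made up for illustration, and the namespace durations are an assumption (I treat them as instantaneous); the other durations are the ones from the example.

```python
# Durations in seconds. Namespace durations are an assumption (treated as
# instantaneous); the other values are taken from the example above.
DURATION = {
    "ns-app1": 0,
    "ns-app2": 0,
    "s3-app1": 30,
    "deploy-app1": 180,
    "deploy-app2": 300,
}

def sequential_groups_time(groups):
    """Groups run one after another; each group ends with its slowest member."""
    return sum(max(DURATION[r] for r in group) for group in groups)

def parallel_chains_time(chains):
    """Chains run concurrently; inside a chain the groups still run in order."""
    return max(sequential_groups_time(chain) for chain in chains)

# Grouping as it happens today: resources of both applications share groups.
current = [["ns-app1", "ns-app2"], ["s3-app1", "deploy-app2"], ["deploy-app1"]]

# One independent chain of groups per application.
per_app_chains = [
    [["ns-app1"], ["s3-app1"], ["deploy-app1"]],
    [["ns-app2"], ["deploy-app2"]],
]

current_total = sequential_groups_time(current)        # 0 + 300 + 180
parallel_total = parallel_chains_time(per_app_chains)  # max(210, 300)
print(current_total / 60, parallel_total / 60)         # 8.0 5.0
```

The 3 minutes saved are exactly the time Application 1 spends waiting for a resource it does not depend on.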

Ideally, there would be two apply group strands:

Group 1 Application 1: Namespace for Application 1
Group 2 Application 1: S3 bucket for Application 1
Group 3 Application 1: Deployment of Application 1

Group 1 Application 2: Namespace for Application 2
Group 2 Application 2: Deployment of Application 2
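One way such strands could be derived automatically is to split the dependency graph into connected components and then layer each component on its own. The following Python sketch only illustrates the idea and is not how the reconciler is implemented; the resource names and the `DEPENDS_ON` mapping are hypothetical, and it assumes the graph is acyclic and that every dependency is itself a key in the mapping.

```python
from collections import defaultdict

# Hypothetical resources; each one maps to the resources it depends on.
DEPENDS_ON = {
    "ns-app1": [],
    "s3-app1": ["ns-app1"],
    "deploy-app1": ["s3-app1"],
    "ns-app2": [],
    "deploy-app2": ["ns-app2"],
}

def find_strands(depends_on):
    """Split resources into connected components of the dependency graph."""
    neighbors = defaultdict(set)
    for res, deps in depends_on.items():
        neighbors[res]  # make sure resources without edges are present
        for dep in deps:
            neighbors[res].add(dep)
            neighbors[dep].add(res)
    seen, strands = set(), []
    for start in depends_on:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # iterative depth-first search over undirected edges
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        strands.append(component)
    return strands

def apply_groups(resources, depends_on):
    """Layer one strand: group index = length of the longest dependency chain."""
    depth = {}

    def level(res):
        if res not in depth:
            depth[res] = 1 + max((level(d) for d in depends_on[res]), default=-1)
        return depth[res]

    groups = defaultdict(list)
    for res in resources:
        groups[level(res)].append(res)
    return [sorted(groups[i]) for i in sorted(groups)]

strand_groups = [apply_groups(s, DEPENDS_ON) for s in find_strands(DEPENDS_ON)]
for groups in strand_groups:
    print(groups)
```

Each inner list is one apply group, and each outer list is a strand that could be applied independently of the others. Note that two applications that share a resource (for example a common namespace) would end up in the same strand with this approach.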

Alternatively, if a fully automatic analysis of the dependency trees is too complex, you could introduce an annotation that assigns resources to a strand. That way, all resources belonging to Application 1 or Application 2 could be grouped manually.
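As a rough sketch of the annotation idea in Python: the annotation key `example.com/apply-strand` is purely hypothetical (no such annotation exists today), and the manifests are reduced to plain dictionaries for illustration.

```python
from collections import defaultdict

# Hypothetical annotation key, invented for this sketch.
STRAND_ANNOTATION = "example.com/apply-strand"

def group_by_strand(manifests, default="default"):
    """Bucket resource names by the value of the strand annotation."""
    strands = defaultdict(list)
    for manifest in manifests:
        meta = manifest.get("metadata", {})
        annotations = meta.get("annotations") or {}
        strand = annotations.get(STRAND_ANNOTATION, default)
        strands[strand].append(meta.get("name", "<unnamed>"))
    return dict(strands)

manifests = [
    {"kind": "Namespace", "metadata": {"name": "app1",
        "annotations": {STRAND_ANNOTATION: "application-1"}}},
    {"kind": "Deployment", "metadata": {"name": "app1-web",
        "annotations": {STRAND_ANNOTATION: "application-1"}}},
    {"kind": "Namespace", "metadata": {"name": "app2",
        "annotations": {STRAND_ANNOTATION: "application-2"}}},
    {"kind": "ConfigMap", "metadata": {"name": "shared"}},  # no annotation
]

by_strand = group_by_strand(manifests)
print(by_strand)
```

Resources without the annotation fall into a single default strand in this sketch, which would keep the current behavior for anything that has not been annotated.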

In our environment this is a significant limitation, because we deploy entire stages, each consisting of multiple applications, at once. That adds up to more than 150 resources, divided into 7 to 8 apply groups. The resulting unnecessary waits within the apply groups add several minutes to the total execution time.

I hope I was able to clearly explain our current problem. What do you think of the proposed change? Is it realistic to make a change in this direction and parallelize the apply-groups?

br,
roland
