
Backport to 2.7.7 #1270

Merged: 7 commits merged into kubev2v:release-2.7 on Dec 13, 2024
Conversation

Issue:
This issue surfaces when two problems combine.
The first problem is a step/phase that takes a long time to finish and
halts the process; MTV-1775 is an example of such an issue.
The second problem is that VM migration startup is slow, because we do
not start all available VMs at once but add them one by one, one per
reconcile cycle.
In a large-scale migration this can take a long time. For example, with
200 VMs and a 3s reconcile interval, starting all VMs would take
10 minutes even in the best case.

Fix:
Start all available VMs from the scheduler at once.
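
As an illustration, a minimal sketch of the behavior change, assuming a scheduler interface that hands out one ready VM at a time (the `Scheduler`/`VM` names below are hypothetical, not the Forklift API): instead of starting a single VM per reconcile, the loop drains everything the scheduler currently allows.

```go
// Minimal sketch, not the Forklift scheduler code: start every VM the
// scheduler reports as ready in a single pass instead of one per reconcile.
package sketch

import "fmt"

// VM is a hypothetical stand-in for the scheduled migration item.
type VM struct{ Name string }

// Scheduler is a hypothetical stand-in for the component that decides
// which VMs may start.
type Scheduler interface {
	Next() (vm *VM, hasNext bool, err error)
}

// startAvailable drains the scheduler in one reconcile; previously only a
// single Next() result would have been started per 3s cycle.
func startAvailable(s Scheduler, start func(*VM) error) error {
	for {
		vm, hasNext, err := s.Next()
		if err != nil || !hasNext {
			return err
		}
		if err := start(vm); err != nil {
			return fmt.Errorf("start %s: %w", vm.Name, err)
		}
	}
}
```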

Ref: https://issues.redhat.com/browse/MTV-1774

Signed-off-by: Martin Necas <[email protected]>
Issue:
When Forklift creates a snapshot, we need to wait for the task to
finish. Right now we use task.WaitForResult, which makes the whole
process wait for the snapshot creation and blocks other VM migrations.
Snapshot removal has the same problem: if the ESXi host is busy when we
start the removal, it can take longer than the reconcile cycle (3s) and
the migration can fail because of it.

Fix:
Instead of using task.WaitForResult, Forklift queries the latest tasks
per VM (10 tasks by default). This querying is done in a separate phase
from the creation/deletion, so each of the object manipulations gets its
own WaitFor phase.
We find the specific task for the creation/deletion and check its
status. This has the advantage that we get not only the status of the
task but also its result, so we can propagate it to the user if the
creation/deletion fails.
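
For illustration, a minimal govmomi-based sketch of this kind of non-blocking check (not the Forklift code itself): it reads the VM's recentTask list with a property collector and looks for a matching task by descriptionId. The descriptionId value and the fallback behavior are assumptions.

```go
// Minimal sketch, not the Forklift implementation: poll a VM's recent tasks
// with a govmomi property collector instead of blocking on task.WaitForResult.
package sketch

import (
	"context"

	"github.com/vmware/govmomi/object"
	"github.com/vmware/govmomi/property"
	"github.com/vmware/govmomi/vim25"
	"github.com/vmware/govmomi/vim25/mo"
	"github.com/vmware/govmomi/vim25/types"
)

// snapshotTaskState looks for the snapshot create/remove task among the VM's
// recent tasks and returns its state and error (if any) without waiting.
// descriptionID (e.g. "VirtualMachine.createSnapshot") is an assumption here.
func snapshotTaskState(ctx context.Context, c *vim25.Client, vm *object.VirtualMachine, descriptionID string) (types.TaskInfoState, *types.LocalizedMethodFault, error) {
	pc := property.DefaultCollector(c)

	// The recentTask property holds references to the VM's latest tasks.
	var vmMo mo.VirtualMachine
	if err := pc.RetrieveOne(ctx, vm.Reference(), []string{"recentTask"}, &vmMo); err != nil {
		return "", nil, err
	}
	if len(vmMo.RecentTask) == 0 {
		return types.TaskInfoStateQueued, nil, nil
	}

	// Fetch TaskInfo for those references and match on the operation type.
	var tasks []mo.Task
	if err := pc.Retrieve(ctx, vmMo.RecentTask, []string{"info"}, &tasks); err != nil {
		return "", nil, err
	}
	for _, t := range tasks {
		if t.Info.DescriptionId == descriptionID {
			return t.Info.State, t.Info.Error, nil
		}
	}
	return types.TaskInfoStateQueued, nil, nil
}
```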

Ref:
- https://issues.redhat.com/browse/MTV-1753
- https://issues.redhat.com/browse/MTV-1775

Signed-off-by: Martin Necas <[email protected]>
Issue: When creating the VM from vSphere on KubeVirt, MTV always
defaulted secure boot to false.

Fix: Add secure boot to the inventory and to the main controller so it
is passed on to KubeVirt.
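
A minimal sketch of how the flag can end up in the KubeVirt VM firmware spec (not the actual Forklift mapping code; the secureBoot input is assumed to come from the vSphere inventory):

```go
// Minimal sketch, not the Forklift mapping code: carry the source VM's
// secure-boot setting into the KubeVirt firmware spec instead of leaving it
// at the default (false).
package sketch

import (
	cnv "kubevirt.io/api/core/v1"
)

// applyFirmware sets EFI secure boot on the target VM spec.
// Note: KubeVirt also requires the SMM feature when secure boot is enabled.
func applyFirmware(spec *cnv.VirtualMachineInstanceSpec, secureBoot bool) {
	if spec.Domain.Firmware == nil {
		spec.Domain.Firmware = &cnv.Firmware{}
	}
	spec.Domain.Firmware.Bootloader = &cnv.Bootloader{
		EFI: &cnv.EFI{SecureBoot: &secureBoot},
	}
}
```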

Ref: https://issues.redhat.com/browse/MTV-1632

Signed-off-by: Martin Necas <[email protected]>
Issue:
The main problem in MTV-1753 and MTV-1775 is that we either do not wait
for the VMware task to finish or, if we do wait, we halt the whole
controller process. This causes either performance issues or even
migration failures, so we need a mechanism to wait for the tasks without
halting the whole process.

Fix:
My first attempt was in PR kubev2v#1262, which used the event manager. On
the surface this was an easy approach that did not require any additional
changes to the CR. The problem was that some of the tasks were not
reported to the taskManager; these tasks had the prefix haTask. After some
investigation, I found out that these tasks live directly on the ESXi host
and are not sent to vSphere, so we can't use the taskManager.

This PR adds the taskIds to the status CR so additional wait phases can
monitor the tasks. The main controller gets the ESXi client and creates
a property collector to request the specific task from the host.
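
As an illustration, a minimal sketch of that lookup (not the controller code itself): given a task ID stored in the status CR, a property collector on the ESXi client retrieves its TaskInfo. The task ID format shown is an assumption.

```go
// Minimal sketch, not the Forklift controller: resolve a task recorded in the
// status CR via a property collector on the ESXi host client.
package sketch

import (
	"context"

	"github.com/vmware/govmomi/property"
	"github.com/vmware/govmomi/vim25"
	"github.com/vmware/govmomi/vim25/mo"
	"github.com/vmware/govmomi/vim25/types"
)

// taskInfoByID retrieves the TaskInfo for a stored task ID (e.g. "haTask-...";
// the exact format is an assumption) without blocking on the task itself.
func taskInfoByID(ctx context.Context, esxi *vim25.Client, taskID string) (*types.TaskInfo, error) {
	ref := types.ManagedObjectReference{Type: "Task", Value: taskID}

	var task mo.Task
	pc := property.DefaultCollector(esxi)
	if err := pc.RetrieveOne(ctx, ref, []string{"info"}, &task); err != nil {
		return nil, err
	}
	return &task.Info, nil
}
```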

Ref:
- https://issues.redhat.com/browse/MTV-1753
- https://issues.redhat.com/browse/MTV-1775
- kubev2v#1262
- kubev2v#1265

Signed-off-by: Martin Necas <[email protected]>
Issue:
If the user sets a ClusterResourceQuota, Forklift starts failing because
the pods created by the Forklift controller do not have limits or
requests set.

Fix:
Add new parameters to the Forklift operator so the limits and requests
can be configured depending on the user's environment.
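
A minimal sketch of the effect on the created pods (the parameter names below are hypothetical, not the actual operator options): the controller fills in resource requests and limits so the pods are admitted under a ClusterResourceQuota.

```go
// Minimal sketch, not the operator code: apply operator-configured limits and
// requests to a container of a pod created by the controller. The parameter
// names (cpuLimit, memLimit, ...) are hypothetical.
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// withResources returns the container with limits/requests set so the pod is
// admitted when a ClusterResourceQuota is in place.
func withResources(c corev1.Container, cpuLimit, memLimit, cpuRequest, memRequest string) corev1.Container {
	c.Resources = corev1.ResourceRequirements{
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse(cpuLimit),
			corev1.ResourceMemory: resource.MustParse(memLimit),
		},
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse(cpuRequest),
			corev1.ResourceMemory: resource.MustParse(memRequest),
		},
	}
	return c
}
```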

Ref: https://issues.redhat.com/browse/MTV-1493

Signed-off-by: Martin Necas <[email protected]>
@mnecas mnecas added this to the 2.7.7 milestone Dec 13, 2024
@mnecas mnecas requested a review from yaacov as a code owner December 13, 2024 16:59
@mnecas mnecas merged commit 81959d5 into kubev2v:release-2.7 Dec 13, 2024
9 of 10 checks passed