Add implementation for NodeUpgradeController #7061
Conversation
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #7061      +/-   ##
==========================================
+ Coverage   71.34%   71.56%   +0.21%
==========================================
  Files         544      545       +1
  Lines       41963    42318     +355
==========================================
+ Hits        29940    30283     +343
+ Misses      10345    10343       -2
- Partials     1678     1692      +14

☔ View full report in Codecov by Sentry.
Force-pushed from 20e37a0 to 9da966e
Force-pushed from dc93569 to 1a24c09
Force-pushed from b4a58d6 to 874a81e
func getInitContainerStatus(pod *corev1.Pod, containerName string) (*corev1.ContainerStatus, error) {
	for _, status := range pod.Status.InitContainerStatuses {
		if status.Name == containerName {
We may want a mapping in the future to make it more readable.
yeah I can add that in a follow up
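For illustration only, a minimal sketch of the kind of mapping being suggested, assuming hypothetical container and condition names rather than the ones defined in this PR:

```go
package upgrade

// conditionNameForContainer is a hypothetical mapping from upgrader init
// container names to readable names reported on the NodeUpgrade status.
// Both the keys and the values here are illustrative assumptions.
var conditionNameForContainer = map[string]string{
	"components-copier":   "ComponentsCopied",
	"containerd-upgrader": "ContainerdUpgraded",
	"kubeadm-upgrader":    "KubeadmUpgraded",
	"kubelet-upgrader":    "KubeletUpgraded",
}

// readableNameFor returns the mapped name for a container, falling back to
// the raw container name when no mapping entry exists.
func readableNameFor(containerName string) string {
	if name, ok := conditionNameForContainer[containerName]; ok {
		return name
	}
	return containerName
}
```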
// UpgradeWorkerPod returns an upgrader pod that can be deployed on worker nodes.
func UpgradeWorkerPod(nodeName, image string) *corev1.Pod {
These feel like they should be member methods
Hmm, let's discuss this offline and see if this is something we can address in a follow-up.
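As a rough sketch of the member-method shape being suggested, a hypothetical builder type could carry the shared inputs; the type name and the pod spec below are assumptions, not code from this PR:

```go
package upgrader

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// PodBuilder is a hypothetical type that carries the inputs shared by the pod
// constructors so they can become methods instead of free functions.
type PodBuilder struct {
	NodeName string
	Image    string
}

// UpgradeWorkerPod returns an upgrader pod for a worker node. The pod spec is
// a placeholder; the real spec lives in this PR.
func (b PodBuilder) UpgradeWorkerPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name: b.NodeName + "-node-upgrader",
		},
		Spec: corev1.PodSpec{
			NodeName: b.NodeName,
			Containers: []corev1.Container{
				{Name: "upgrader", Image: b.Image},
			},
		},
	}
}
```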
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavmpandey08

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
/hold
/lgtm
/lgtm
/unhold
- apiGroups:
  - ""
  resources:
  - pods
We need to move this to a Role instead of a ClusterRole.
This gives way too many privileges to the controller.
I don't think this is needed anymore since the controller is building a remote client from the kubeconfig stored on the cluster, which has admin on the workload clusters. I'll do some testing and remove it.
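For context, a hedged sketch of how a remote client can be built from a kubeconfig stored on the management cluster using client-go and controller-runtime; the `<cluster>-kubeconfig` secret name and the `value` key follow the CAPI convention and are assumptions here, not code from this PR:

```go
package clients

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/tools/clientcmd"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// remoteClientFor builds a client for a workload cluster from the kubeconfig
// secret stored on the management cluster. It assumes the CAPI convention of a
// "<cluster>-kubeconfig" secret with the kubeconfig under the "value" key.
func remoteClientFor(ctx context.Context, mgmt client.Client, namespace, clusterName string) (client.Client, error) {
	secret := &corev1.Secret{}
	key := types.NamespacedName{Namespace: namespace, Name: clusterName + "-kubeconfig"}
	if err := mgmt.Get(ctx, key, secret); err != nil {
		return nil, fmt.Errorf("getting kubeconfig secret: %w", err)
	}

	restConfig, err := clientcmd.RESTConfigFromKubeConfig(secret.Data["value"])
	if err != nil {
		return nil, fmt.Errorf("building rest config from kubeconfig: %w", err)
	}

	return client.New(restConfig, client.Options{})
}
```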
NodeUpgradeKind = "NodeUpgrade"
// UpgraderPodCreated reports whether the upgrader pod has been created for the node upgrade.
UpgraderPodCreated ConditionType = "UpgraderPodCreated"
I don't know if it's wise making the phases of an upgrade tied in this way to the API. Your phases might change more frequently than you want your API to change. Also, what if some phases become optional? How are clients going to know that?
> I don't know if it's wise making the phases of an upgrade tied in this way to the API. Your phases might change more frequently than you want your API to change.
Can you elaborate a little on this? This is just adding container statuses on the node upgrade object.
The controller will check the container statuses at the end of every reconcile loop and update the node upgrade status based on that. And once all the containers have finished, it will mark the upgrade as completed and end the reconcile loop.
I extrapolated this logic from the EKS-A controller, which does something similar: it monitors the KCP/MD/CAPI cluster conditions and uses them to update the EKS-A Cluster conditions, and when all conditions are met, the reconcile is marked complete.
> Also, what if some phases become optional? How are clients going to know that?
This depends on how we decide to implement it. We were thinking we won't make the phases optional from NodeUpgradeController's perspective. Rather, the pod will check whether the step it's performing needs to be done or not. If it doesn't need to be performed, the upgrader will just not do it and return success.
For example, if the containerd version on the node is already the latest one, then the upgrader can just skip copying over the containerd binary and return without errors.
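A minimal sketch of the status-driven flow described above, assuming hypothetical step names and a caller-supplied condition callback; it only illustrates deriving per-step progress from the upgrader pod's init container statuses:

```go
package controllers

import (
	corev1 "k8s.io/api/core/v1"
)

// upgradeComplete derives per-step progress from the upgrader pod's init
// container statuses and reports whether every expected step has finished.
// The steps slice and the markCondition callback are illustrative assumptions.
func upgradeComplete(pod *corev1.Pod, steps []string, markCondition func(step string, done bool)) bool {
	done := map[string]bool{}
	for _, status := range pod.Status.InitContainerStatuses {
		// A step counts as finished when its init container terminated with exit code 0.
		finished := status.State.Terminated != nil && status.State.Terminated.ExitCode == 0
		done[status.Name] = finished
		markCondition(status.Name, finished)
	}

	for _, step := range steps {
		if !done[step] {
			return false
		}
	}
	return true
}
```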
Description of changes:
The implementation is based on this design #6893
The goal of this PR is to add the implementation for the NodeUpgradeController.
The purpose of this controller is to handle in-place upgrades of nodes by provisioning upgrader pods on the individual nodes, which then upgrade the components on the node to the desired version.
These components include:
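For orientation, a minimal sketch of the ensure-the-upgrader-pod step of that reconcile flow; the NodeUpgrade field names, the buildPod parameter, and the requeue behavior are illustrative assumptions rather than the PR's actual code:

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// nodeUpgradeSpec is a stand-in for the NodeUpgrade API type added in this PR;
// the field names here are assumptions for illustration.
type nodeUpgradeSpec struct {
	NodeName string
	Image    string
}

// reconcileUpgraderPod sketches the core flow: make sure an upgrader pod
// exists on the target node, then let later reconciles read its status back
// into the NodeUpgrade conditions. buildPod stands in for a constructor such
// as UpgradeWorkerPod.
func reconcileUpgraderPod(ctx context.Context, remote client.Client, spec nodeUpgradeSpec, buildPod func(nodeName, image string) *corev1.Pod) (ctrl.Result, error) {
	pod := buildPod(spec.NodeName, spec.Image)

	existing := &corev1.Pod{}
	err := remote.Get(ctx, client.ObjectKeyFromObject(pod), existing)
	switch {
	case apierrors.IsNotFound(err):
		// No upgrader pod yet: create it and requeue to track its progress.
		if err := remote.Create(ctx, pod); err != nil {
			return ctrl.Result{}, err
		}
		return ctrl.Result{Requeue: true}, nil
	case err != nil:
		return ctrl.Result{}, err
	}

	// The pod exists; later passes translate its container statuses into
	// NodeUpgrade conditions and eventually mark the upgrade complete.
	return ctrl.Result{}, nil
}
```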
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.