-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add volumeattach/volumedetach API #4359
Conversation
9452cc3
to
aeff486
Compare
I assume there will be a follow-up PR to call this APIs. Is that correct? @andrewd-zededa can you review? |
That's correct. We are trying to split PRs so that they are independent and can compile, easy to review too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM mostly, I think we need a some comments around the volumeattachment logging to ensure they are not downgraded to Function/Trace in the future.
// Get the Kubernetes clientset | ||
clientset, err := GetClientSet() | ||
if err != nil { | ||
err = fmt.Errorf("failed to get clientset: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new error it's being created here, why you just don't propagate the original error (from the Understood, I was looking to line 355, now I think you should use the same approach used here there... I will mark with a comment....GetClientSet()
call)?
aeff486
to
303af8e
Compare
Can someone bless this one, if there are no further concerns. |
@andrewd-zededa, @zedi-pramodh was this comment addressed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I don't understand what Andrew meant actually :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe @andrewd-zededa it's OK, as I cannot review it properly myself. Hence, I am approving. Nevertheless, I left several questions to better understand the code.
return "", err | ||
} | ||
// Get the PersistentVolumeClaim (PVC) | ||
pvc, err := clientset.CoreV1().PersistentVolumeClaims(EVEKubeNameSpace).Get(context.TODO(), pvcName, metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just curious what is the context.TODO()
in this case and will it be changed later...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'm a little bit worried about the amount of dereferencing in one line. Could be hard to debug if one of them fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The err should contain the actual error if something fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The err
variable will catch only the error returned by the final .Get
method call, not by each method in the chain (clientset.CoreV1()
, PersistentVolumeClaims(EVEKubeNameSpace)
). If there's a subtle issue in one of those intermediate steps, like clientset
or EVEKubeNameSpace
, you might not catch it right away.
But it's a minor for the PR.
return "", "", err | ||
} | ||
// List all VolumeAttachments and find the one corresponding to the PV | ||
volumeAttachments, err := clientset.StorageV1().VolumeAttachments().List(context.TODO(), metav1.ListOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a synchronous call to the API? Are we sure we will not have issues with the watchdog in this case if the requests are handled longer than expected? This is a generic question for all API calls here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its is synchronous call, but I do not anticipate it takes even a second to get the result. Number of volumes we have in our deployments are in single digits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, and what about the other API calls in the file? Like .Delete()
... So we are pretty sure all of them run fast enough, yep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far we have not seen any issues with watchdog for the all kubernetes calls. Also I thought domainmgr/volumemgr executes the jobs in asynchronous worker threads and should not block anything to trigger watchdog. Is that not right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also Kubernetes API calls execution are asynchronous by default. So only sync call is to issue the command and return back after that any action will happen in background.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any more questions ? Or can this be merged ?
This needs to be rebased on master - the SPDX check workflow has issues with misdetecting issues. |
Longhorn volumes are attached to a node where the app is running. Since the volumes we create in eve are RWO, ie only one node can attach at a time, during failover we need to make sure the node that failed is detached from the volume. The node that is taking over will automatically attach to the volume if no other node is attached. Also, generally its very safe to detach and then attach to avoid any potential corruption. Signed-off-by: Pramodh Pallapothu <[email protected]>
303af8e
to
883e3d0
Compare
Rebased to master and amended the commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-run tests
The same Eden issue =(((( |
Those should be completely unrelated. This code path is not even executed yet. |
Longhorn volumes are attached to a node where the app is running. Since the volumes we create in eve are RWO, ie only one node can attach at a time, during failover we need to make sure the node that failed is detached from the volume.
The node that is taking over will automatically attach to the volume if no other node is attached. Also, generally its very safe to detach and then attach to avoid any potential corruption.