When testing combinations of resources that depend on each other through Usages, the delete step fails even though the cleanup eventually succeeds. The test contained the following resources:
XAWSLBController
XEKS
XNetwork
XAWSLBController contains a helm chart and a Usage of XEKS by a helm Release.
Running uptest against such a configuration leads to the following situation:
We delete XAWSLBController first, which succeeds immediately.
Afterwards we delete XEKS and get an error: the background cleanup of XAWSLBController has not finished yet, so the Usage is still around.
This is the error we're seeing:
"nousages.apiextensions.crossplane.io" denied the request: This resource is in-use by 1 Usage(s), including the Usage "configuration-aws-lb-controller-7d62z" by resource Release/configuration-aws-lb-controller-xmlng.
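For illustration, here is a minimal sketch of how a test script could recognize this specific denial in the stderr of a failed delete. The pattern is derived only from the single message quoted above, so treat the exact wording as an assumption; `parse_usage_denial` is a hypothetical helper, not part of uptest or Crossplane.

```python
import re

# Pattern built from the one error message quoted in this issue (assumption:
# other Crossplane versions may word the denial differently).
DENIAL = re.compile(
    r'"nousages\.apiextensions\.crossplane\.io" denied the request: '
    r"This resource is in-use by (?P<count>\d+) Usage\(s\), "
    r'including the Usage "(?P<usage>[^"]+)" by resource (?P<by>\S+?)\.?$'
)


def parse_usage_denial(stderr):
    """Return (count, usage_name, using_resource), or None if no line matches."""
    for line in stderr.splitlines():
        match = DENIAL.search(line.strip())
        if match:
            return int(match.group("count")), match.group("usage"), match.group("by")
    return None
```

Returning `None` for anything that does not match keeps unrelated failures distinguishable from the Usage denial.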
A pre-delete or post-delete hook works around this today, but we feel it's not a great approach: it adds hurdles for people trying uptest, and it leaks orchestration details that are already handled inside the core of Crossplane.
Things we tried and discussed:
Omitting "wait: true" from the delete statement. This has the same effect: the delete step still fails.
Running the delete steps with || true. This is possible and cleanup will succeed, but it swallows all errors, whether or not they are related to Usages.
Running the delete in a loop, catching the exit code and stderr, and on failure comparing stderr against "nousages.apiextensions.crossplane.io" denied the request. This can work, but it can cause trouble with additional pre-delete and similar hooks, which would then need to be idempotent, or we would need to catch additional errors.
Finding all Usages connected to the XR being deleted via owner references and issuing kubectl wait until they are cleaned up before proceeding. This can also work, but it reaches deep into internals.
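To make the loop variant concrete, here is a rough sketch of what we have in mind. It retries only when stderr contains the Usage denial and fails immediately on any other error, which avoids the swallow-everything problem of || true. The function names, the default attempt count and delay are all hypothetical and not taken from uptest or chainsaw.

```python
import subprocess
import time

# Substring taken from the error message quoted in this issue.
USAGE_DENIAL = '"nousages.apiextensions.crossplane.io" denied the request'


def run_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


def delete_with_retry(cmd, attempts=30, delay=10.0, run=run_command, sleep=time.sleep):
    """Retry cmd while it is blocked by a Usage; return the attempt that succeeded.

    Any failure that is not the Usage denial is raised right away, so
    unrelated errors are never hidden.
    """
    for attempt in range(1, attempts + 1):
        code, stderr = run(cmd)
        if code == 0:
            return attempt
        if USAGE_DENIAL not in stderr:
            raise RuntimeError(f"delete failed: {stderr.strip()}")
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"still blocked by a Usage after {attempts} attempts")


# Example call (the manifest file name is a placeholder):
#   delete_with_retry(["kubectl", "delete", "-f", "xeks.yaml"])
```

The `run` and `sleep` parameters exist only so the loop can be exercised without a cluster. The idempotency concern for pre-delete and similar hooks still applies if they are executed inside the retried command.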
Another thought: we might be trying to work around a behavior that is intentionally not supported in chainsaw, which would explain why the proposed solutions look kinda ugly. Maybe it's worthwhile creating a PR on chainsaw that introduces something like a retry parameter for a script step. At the very least we'd get feedback from the maintainers on how they envision such a flow. Trying to delete an object that is temporarily protected by an admission webhook might not be a standard use case, but I can imagine other setups hitting the same wall without being specific to Crossplane.
Sidenote: This only occurs when the objects in question are NOT part of the same composition. When they are, deletion errors are not visible to the outside and Crossplane handles the cleanup flawlessly.
How can we reproduce it?
Running uptest on this changeset without including the post-delete hook: upbound/configuration-aws-lb-controller#1
What environment did it happen in?