Support exposing the status of individual resources applied on cluster #1412

Open
varshaprasad96 opened this issue Nov 27, 2023 · 3 comments
Labels
carvel-accepted This issue should be considered for future work and that the triage process has been completed enhancement This issue is a feature request

varshaprasad96 commented Nov 27, 2023

Describe the problem/challenge you have

Once the package contents fetched from the source are applied to the cluster through an App CR, there is no way to know the health of the individual resources that were applied. It would be helpful to expose the status of each applied resource in the "status" section of the App CR. This would be useful for SRE/ops teams managing hundreds or thousands of clusters, where this information can be scraped for monitoring.

Describe the solution you'd like

The controller that manages App can also watch the individual resources being applied through informers. For core types, like Deployments and Pods, where health information is already available, that information can be used and stamped on the App CR's status.
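As a rough illustration of what "health information is already available" could mean for a Deployment, here is a minimal sketch. It uses only the standard library, and the `DeploymentState` and `ResourceHealth` types and the `deploymentHealth` function are hypothetical stand-ins, not kapp-controller or client-go types. The rules are an assumption loosely in the spirit of kstatus, not a copy of it.

```go
package main

// DeploymentState is a hypothetical, trimmed-down view of the fields a
// controller could read from a Deployment delivered by an informer.
type DeploymentState struct {
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // status.observedGeneration
	DesiredReplicas    int32 // spec.replicas
	UpdatedReplicas    int32 // status.updatedReplicas
	AvailableReplicas  int32 // status.availableReplicas
}

// ResourceHealth is a hypothetical per-resource entry that could be
// stamped on the App CR's status.
type ResourceHealth struct {
	Kind      string
	Namespace string
	Name      string
	Healthy   bool
	Reason    string
}

// deploymentHealth derives a health verdict from the Deployment's own
// status fields. The checks are ordered from "stale status" to "fully
// rolled out"; the first failing check determines the reason.
func deploymentHealth(namespace, name string, d DeploymentState) ResourceHealth {
	h := ResourceHealth{Kind: "Deployment", Namespace: namespace, Name: name}
	switch {
	case d.ObservedGeneration < d.Generation:
		h.Reason = "status does not yet reflect the latest spec"
	case d.UpdatedReplicas < d.DesiredReplicas:
		h.Reason = "rollout in progress"
	case d.AvailableReplicas < d.DesiredReplicas:
		h.Reason = "not all replicas are available"
	default:
		h.Healthy = true
		h.Reason = "all replicas updated and available"
	}
	return h
}
```

A controller could call something like `deploymentHealth("default", "web", state)` from an informer event handler and write the result into the App status.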

Anything else you would like to add:

A similar feature is available in RukPak, a component of OLM v1. kubernetes-sigs/cli-utils (https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus) provides a set of helpers for collecting status from core resource types. More details on the implementation can be found here: https://github.com/operator-framework/rukpak/blob/main/internal/healthchecks/builtin.go#L16-L33

Open questions:

  • Should we trigger a reconcile when any of the resources is unhealthy?
  • Would watching resources through informers increase the cache size, thereby affecting performance?

Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible"
👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help work on this issue.

@varshaprasad96 varshaprasad96 added carvel-triage This issue has not yet been reviewed for validity enhancement This issue is a feature request labels Nov 27, 2023
@joelanford

Another open question: how would we want to expose the status of non-builtin APIs? For example, consider a MongoDB CR that is managed by an App. Do we need a way for package authors to define how to scrape health from custom objects?

I know RukPak does not support this; it just treats unknown types as permanently healthy. The point being, I don't think we necessarily have to include custom types in the scope of this, but it is something we might want to keep in mind in the design.
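To make the "unknown types are permanently healthy" behavior concrete, here is a minimal sketch of a per-kind check registry with a default-healthy fallback. It is not RukPak's code: `HealthCheck`, `checkerFor`, and `podRunning` are hypothetical names, objects are modeled as plain maps rather than real Kubernetes types, and the Pod check is deliberately simplistic.

```go
package main

// HealthCheck reports whether an object, modeled here as a decoded
// JSON-like map, is considered healthy.
type HealthCheck func(obj map[string]any) bool

// checkerFor returns the registered check for a kind. Kinds without a
// registered check fall back to "always healthy", which mirrors the
// behavior described above for unknown types.
func checkerFor(registry map[string]HealthCheck, kind string) HealthCheck {
	if check, ok := registry[kind]; ok {
		return check
	}
	return func(map[string]any) bool { return true }
}

// podRunning is an illustrative built-in check: a Pod is treated as
// healthy only when status.phase is "Running". A real check would
// also look at container readiness.
func podRunning(obj map[string]any) bool {
	status, _ := obj["status"].(map[string]any)
	phase, _ := status["phase"].(string)
	return phase == "Running"
}
```

With a registry such as `map[string]HealthCheck{"Pod": podRunning}`, a MongoDB CR would be reported healthy regardless of its status, which is exactly the gap this comment points out.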

@varshaprasad96
Author

@joelanford The other option (inspired by package-operator), as discussed in this thread, was to allow users to pass in CEL expressions, which the controller would evaluate to decide whether the resource is healthy.

@praveenrewar praveenrewar added carvel-accepted This issue should be considered for future work and that the triage process has been completed and removed carvel-triage This issue has not yet been reviewed for validity labels Dec 14, 2023
@100mik
Contributor

100mik commented Jan 10, 2024

> Do we need a way for package authors to define how to scrape health from custom objects?

Our go-to approach has been to include config, consumed by kapp, that specifies when a resource counts as ready based on its status (we call these waitRules).

This would mean that kapp-controller would only mark the Package as reconciled if these conditions are met.
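The idea of deciding readiness from a resource's status conditions can be sketched as follows. This is only loosely modeled on the waitRules concept: the `Condition`, `ConditionMatcher`, and `Outcome` types and the `evaluate` function are hypothetical, and letting a failure match win over a success match is an assumption of this sketch, not a statement about how kapp resolves conflicts.

```go
package main

// Condition is a simplified status condition found on a resource.
type Condition struct {
	Type   string
	Status string
}

// ConditionMatcher is a hypothetical rule: when a condition with this
// Type and Status is present, the resource is treated as succeeded or
// failed, depending on which flag is set.
type ConditionMatcher struct {
	Type    string
	Status  string
	Success bool
	Failure bool
}

// Outcome is the result of evaluating the matchers against a resource.
type Outcome string

const (
	Waiting   Outcome = "waiting"
	Succeeded Outcome = "succeeded"
	Failed    Outcome = "failed"
)

// evaluate checks every matcher against every condition. Any matching
// failure rule ends the evaluation with Failed; otherwise a matching
// success rule yields Succeeded; with no match the caller keeps waiting.
func evaluate(conditions []Condition, matchers []ConditionMatcher) Outcome {
	succeeded := false
	for _, m := range matchers {
		for _, c := range conditions {
			if c.Type != m.Type || c.Status != m.Status {
				continue
			}
			if m.Failure {
				return Failed
			}
			if m.Success {
				succeeded = true
			}
		}
	}
	if succeeded {
		return Succeeded
	}
	return Waiting
}
```

For a custom resource such as the MongoDB CR mentioned earlier, a package author could supply matchers like `{Type: "Ready", Status: "True", Success: true}` and the reconcile would only be reported as complete once `evaluate` returns `Succeeded`.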

However, surfacing resource-specific conditions, which seems to be the goal here, is something we might need to think about. This would require a new API (probably within the same config) specifying how the statuses of certain resources are converted into additional conditions on the PackageInstall itself.
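One possible shape for that conversion is sketched below, purely as an illustration. The `ResourceStatus` and `Condition` types, the `summarize` function, and the `ResourcesHealthy` condition type are all made up for this example; no such API exists in kapp-controller today.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ResourceStatus is a hypothetical per-resource health record.
type ResourceStatus struct {
	Ref     string // for example "Deployment/default/web"
	Healthy bool
}

// Condition is a simplified condition that could be added to the
// status of a PackageInstall.
type Condition struct {
	Type    string
	Status  string
	Message string
}

// summarize folds per-resource health into a single condition. The
// message lists unhealthy resources in sorted order so that repeated
// reconciles produce a stable status.
func summarize(resources []ResourceStatus) Condition {
	var unhealthy []string
	for _, r := range resources {
		if !r.Healthy {
			unhealthy = append(unhealthy, r.Ref)
		}
	}
	if len(unhealthy) == 0 {
		return Condition{
			Type:    "ResourcesHealthy",
			Status:  "True",
			Message: fmt.Sprintf("%d resources healthy", len(resources)),
		}
	}
	sort.Strings(unhealthy)
	return Condition{
		Type:    "ResourcesHealthy",
		Status:  "False",
		Message: "unhealthy: " + strings.Join(unhealthy, ", "),
	}
}
```

A single aggregated condition keeps the status small on clusters with many resources, at the cost of per-resource detail; a design that needs that detail would emit one condition or status entry per resource instead.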

4 participants