Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't sync namespaces that have no subscriptions #3054

Conversation

awgreene
Copy link
Member

@awgreene awgreene commented Oct 9, 2023

Problem: The catalog operator is logging many errors regarding missing operator groups in namespaces that have no operators installed.

Solution: If a namespace has no subscriptions in it, do not check if an operator group is present in the namespace.

Example Logs:

2023-10-03T18:40:22.531790646Z E1003 18:40:22.531725       1 queueinformer_operator.go:319] sync {"update" "openshift-infra"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:22.732892241Z E1003 18:40:22.732772       1 queueinformer_operator.go:319] sync {"update" "openshift-cluster-node-tuning-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:22.932286038Z E1003 18:40:22.932185       1 queueinformer_operator.go:319] sync {"update" "openshift-etcd"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:23.132380585Z E1003 18:40:23.132315       1 queueinformer_operator.go:319] sync {"update" "openshift-console"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:23.331364253Z E1003 18:40:23.331310       1 queueinformer_operator.go:319] sync {"update" "openshift-apiserver-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:23.531841128Z E1003 18:40:23.531783       1 queueinformer_operator.go:319] sync {"update" "openshift-config"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:23.731587861Z E1003 18:40:23.731523       1 queueinformer_operator.go:319] sync {"update" "openshift-dns"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:23.935364693Z E1003 18:40:23.935308       1 queueinformer_operator.go:319] sync {"update" "openshift-kube-controller-manager-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:24.131623729Z E1003 18:40:24.131432       1 queueinformer_operator.go:319] sync {"update" "openshift-kube-scheduler"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:24.332286341Z E1003 18:40:24.332188       1 queueinformer_operator.go:319] sync {"update" "openshift-ovirt-infra"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:24.532904649Z E1003 18:40:24.532854       1 queueinformer_operator.go:319] sync {"update" "openshift-ingress-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:24.735185533Z E1003 18:40:24.735124       1 queueinformer_operator.go:319] sync {"update" "openshift-kube-apiserver-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:24.931656181Z E1003 18:40:24.931571       1 queueinformer_operator.go:319] sync {"update" "openshift-kube-scheduler-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:25.133418536Z E1003 18:40:25.133310       1 queueinformer_operator.go:319] sync {"update" "openshift-ingress-canary"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:25.331763068Z E1003 18:40:25.331667       1 queueinformer_operator.go:319] sync {"update" "openshift-console-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:25.533077389Z E1003 18:40:25.532998       1 queueinformer_operator.go:319] sync {"update" "openshift-openstack-infra"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:25.733136965Z E1003 18:40:25.733072       1 queueinformer_operator.go:319] sync {"update" "kube-public"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:25.933430183Z E1003 18:40:25.933360       1 queueinformer_operator.go:319] sync {"update" "openshift-cluster-machine-approver"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:26.132186772Z E1003 18:40:26.132125       1 queueinformer_operator.go:319] sync {"update" "openshift-etcd-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:26.333601507Z E1003 18:40:26.333433       1 queueinformer_operator.go:319] sync {"update" "openshift"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:26.532490813Z E1003 18:40:26.532426       1 queueinformer_operator.go:319] sync {"update" "openshift-authentication-operator"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:26.732186081Z E1003 18:40:26.732130       1 queueinformer_operator.go:319] sync {"update" "openshift-controller-manager"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:26.933010662Z E1003 18:40:26.932946       1 queueinformer_operator.go:319] sync {"update" "openshift-insights"} failed: found 0 operatorGroups, expected 1
2023-10-03T18:40:27.133304680Z E1003 18:40:27.132522       1 queueinformer_operator.go:319] sync {"update" "openshift-kni-infra"} failed: found 0 operatorGroups, expected 1

Problem: The catalog operator is logging many errors regarding missing
operator groups in namespaces that have no operators installed.

Solution: If a namespace has no subscriptions in it, do not check if an
operator group is present in the namespace.

Signed-off-by: Alexander Greene <[email protected]>
@openshift-ci openshift-ci bot requested review from asmacdo and joelanford October 9, 2023 13:28
@openshift-ci
Copy link

openshift-ci bot commented Oct 9, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awgreene

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2023
@joelanford
Copy link
Member

This seems reasonable, but one question: What happens with a subscription shows up? When we're reconciling that subscription, are we assuming that we've already synced the OperatorGroups? If so, do we then put an error on the subscription along the lines of "there's no operator group in this namespace"? And if we do that, is that a terminal state, or will we pretty quickly sync operator groups and then re-sync the subscription?

@awgreene
Copy link
Member Author

awgreene commented Oct 9, 2023

This seems reasonable, but one question: What happens with a subscription shows up? When we're reconciling that subscription, are we assuming that we've already synced the OperatorGroups? If so, do we then put an error on the subscription along the lines of "there's no operator group in this namespace"?

@joe currently there is no subscription status conveying that an operatorGroup is missing, but the catalog operator would log the error.

And if we do that, is that a terminal state, or will we pretty quickly sync operator groups and then re-sync the subscription?

When the Catalog Operator reconciles a subscription it will add the namespace to the queue which is reconciled using the logic in the syncNamespace function. Likewise, if an operatorGroup is created it will queue namespaces as well. So yes, it will recover quickly.

@awgreene
Copy link
Member Author

awgreene commented Oct 9, 2023

@joelanford we had a downstream bug that called out that it'd be nice if the sub exposed this information, but we closed it as we are directing efforts to working on OLM V1. If we feel strongly that OLM V0 should expose this outside of the logs, we can reopen that issue.

@stevekuznetsov
Copy link
Member

Since the queueing logic was happening before by iterating over subs, this seems like it should not change any behavior, just stop logging errors.

@stevekuznetsov stevekuznetsov added this pull request to the merge queue Oct 10, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 10, 2023
@stevekuznetsov
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 10, 2023
@stevekuznetsov stevekuznetsov added this pull request to the merge queue Oct 10, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 10, 2023
@stevekuznetsov stevekuznetsov added this pull request to the merge queue Oct 10, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 10, 2023
@stevekuznetsov stevekuznetsov added this pull request to the merge queue Oct 11, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 11, 2023
@stevekuznetsov stevekuznetsov added this pull request to the merge queue Oct 11, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 11, 2023
@stevekuznetsov
Copy link
Member

Flaked on #3062 and #3056

@stevekuznetsov stevekuznetsov added this pull request to the merge queue Oct 11, 2023
Merged via the queue into operator-framework:master with commit 2625ded Oct 11, 2023
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants