Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for KSM #1056

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

paramite
Copy link
Contributor

@paramite paramite commented Sep 10, 2024

This patch add default image and TLS support for kube-state-metrics, a new component of Telemetry/Ceilometer resource.

Related: OBSDA-574
Related: OSPRH-1052
Closes: OSPRH-10462

Copy link

This change depends on a change that failed to merge.

Change openstack-k8s-operators/telemetry-operator#337 is needed.

Copy link

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/telemetry-operator for 337,7a5096b1cfcbc4cfc34a8a28fc3efca226fb3d5f

@paramite
Copy link
Contributor Author

recheck

@paramite
Copy link
Contributor Author

/test openstack-operator-build-deploy-kuttl

Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f3a8edbee473408fa4dc6a00933a3393

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 00m 20s
podified-multinode-edpm-deployment-crc FAILURE in 1h 03m 25s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 05m 18s
adoption-standalone-to-crc-ceph-provider RETRY_LIMIT in 1h 09m 51s
openstack-operator-tempest-multinode FAILURE in 1h 09m 01s

@paramite
Copy link
Contributor Author

/test openstack-operator-build-deploy-kuttl

Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/a6512f9c32cc4c669e6eb80f44efa09b

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 27m 18s
podified-multinode-edpm-deployment-crc FAILURE in 1h 06m 22s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 10m 55s
adoption-standalone-to-crc-ceph-provider RETRY_LIMIT in 1h 11m 33s
openstack-operator-tempest-multinode FAILURE in 1h 12m 04s

Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ed3cbe5bb1164d639ec157cf429fd6ff

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 27m 14s
podified-multinode-edpm-deployment-crc FAILURE in 1h 06m 14s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 15m 32s
adoption-standalone-to-crc-ceph-provider RETRY_LIMIT in 1h 13m 08s
openstack-operator-tempest-multinode FAILURE in 1h 09m 07s

Copy link

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/openstack-operator for 1056,6b48a538a9025e8cc451339acc1c11e9a10e9c63

@paramite
Copy link
Contributor Author

/retest

Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/af93f350b0244336bb58530b9512776c

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 56m 37s
podified-multinode-edpm-deployment-crc POST_FAILURE in 1h 02m 36s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 16m 16s
adoption-standalone-to-crc-ceph-provider RETRY_LIMIT in 1h 12m 00s
openstack-operator-tempest-multinode FAILURE in 1h 07m 34s

@dprince
Copy link
Contributor

dprince commented Oct 18, 2024

looks good to me

Copy link
Contributor

@abays abays left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link
Contributor

openshift-ci bot commented Oct 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abays, paramite

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dprince
Copy link
Contributor

dprince commented Oct 18, 2024

So we are adding a new image version to the struct. But it wouldn't exist for the old OpenStackVersion.Status Yaml I think. I would be unset there. I'm not sure I have tested this case where version is set to the old version, image is nil but the service is enabled. May have to wait till an upgrade to a specific service level exists to enable this service.

@abays
Copy link
Contributor

abays commented Oct 18, 2024

So we are adding a new image version to the struct. But it wouldn't exist for the old OpenStackVersion.Status Yaml I think. I would be unset there. I'm not sure I have tested this case where version is set to the old version, image is nil but the service is enabled. May have to wait till an upgrade to a specific service level exists to enable this service.

/hold

@dprince
Copy link
Contributor

dprince commented Oct 21, 2024

To clarify the concerns here: old struct has value set to nil (as it is net new it wouldn't have been initialized). Customer has the old target version set and they upgrade to the new operator codebase which upon reconciling will attempt to install this new service but will fail as there will be no container image. I think we need to either set the default (temporarily) to disabled or somehow guarantee they are at a specific version until the image is set.

Copy link
Contributor

openshift-ci bot commented Nov 14, 2024

New changes are detected. LGTM label has been removed.

@paramite
Copy link
Contributor Author

@dprince Is the last commit acceptable?

func ensureKsmImageDefault(version *corev1beta1.OpenStackVersion) *string {
// NOTE(mmagr): for case upgrade case where KSM image is not part of OpenStackVersion
// yet we temporarily need to ensure image default is passed
var ksmImageTempDefault = "registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll want to avoid using hard coded images as it would need to be set in RELATED_IMAGES as well to ensure it gets downloaded for offline/airgapped.

@dprince
Copy link
Contributor

dprince commented Nov 18, 2024

@paramite I pushed this: https://github.com/openstack-k8s-operators/telemetry-operator/pull/534 which will allow the OLM upgrade to proceed. The comments I made in this PR still stand and relate to the new image being available in the .Status of the OpenStackVersion resource until we have ksmImage set there (once the minor update process runs)

Copy link
Contributor

openshift-ci bot commented Nov 20, 2024

@paramite: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/openstack-operator-build-deploy-kuttl a0a6c3c link true /test openstack-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f12795b5bfc24b1881465addb4e92058

✔️ openstack-k8s-operators-content-provider SUCCESS in 4h 55m 06s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 20m 56s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 31m 43s
✔️ adoption-standalone-to-crc-ceph-provider SUCCESS in 3h 24m 28s
openstack-operator-tempest-multinode RETRY_LIMIT in 2m 59s

@vyzigold vyzigold mentioned this pull request Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants