Zuul job to validate watcher deployment #9

Merged
merged 1 commit into from
Nov 26, 2024


raukadah
Contributor

@raukadah raukadah commented Nov 19, 2024

This PR:

  • Adds the watcher-operator-base job, which inherits from
    podified-multinode-edpm-deployment-crc-2comp. The job deploys a
    two-node EDPM environment and then deploys the watcher operator using
    make targets from the watcher-operator repo.
  • Adds a ci-framework hook that deploys the watcher service.

Test Results: #9 (comment)

Depends-On: #11
Depends-On: openstack-k8s-operators/ci-framework#2569

openshift-ci bot commented Nov 19, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Zuul encountered a syntax error while parsing its
configuration in the repo openstack-k8s-operators/watcher-operator on branch main. The
problem was:

Invalid Ansible variable name '_watcher_repo' for dictionary value @ data['vars']

The problem appears in the "watcher-operator-base" job stanza:

job:
  name: watcher-operator-base
  parent: podified-multinode-edpm-deployment-crc-2comp
  dependencies: ["openstack-meta-content-provider"]
  description: |
    A multinode EDPM Zuul job which has one ansible controller, one
    extracted crc and two computes. It will be used for testing watcher-operator.
  vars:
    _watcher_repo: "{{ ansible_user_dir }}/src/github.com/openstack-k8s-operators/watcher-operator"
...

in "openstack-k8s-operators/watcher-operator/.zuul.yaml@main", line 12
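
The error above suggests that Zuul rejects job variable names with a leading underscore. As a rough sketch, the check below flags such names before a config is pushed. The pattern is an assumption inferred from the error message (a letter first, then letters, digits, or underscores), not copied from Zuul's source, and `watcher_repo` is a hypothetical replacement name used only for illustration.

```python
import re

# Assumption: a job variable name must start with a letter and continue with
# letters, digits, or underscores. Inferred from the Zuul error message above.
VALID_VAR_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def invalid_var_names(job_vars):
    """Return the top-level variable names that the assumed rule rejects."""
    return [name for name in job_vars if not VALID_VAR_NAME.fullmatch(name)]


repo_path = "{{ ansible_user_dir }}/src/github.com/openstack-k8s-operators/watcher-operator"
job_vars = {
    "_watcher_repo": repo_path,  # name from the failing job stanza
    "watcher_repo": repo_path,   # hypothetical rename without the underscore
}

print(invalid_var_names(job_vars))  # ['_watcher_repo']
```

Under this assumed rule, dropping the leading underscore would be enough to get past the configuration parser.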

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/85b54bceb3f043e19e6c75d023bd98b6

✔️ openstack-meta-content-provider SUCCESS in 37m 31s
watcher-operator-validation FAILURE in 19m 14s

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/a9fd2b765db446a7aa8c3291f47248fc

✔️ openstack-meta-content-provider SUCCESS in 35m 57s
watcher-operator-validation FAILURE in 20m 14s

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b8f0e6a28c514f4f8d0beb04f6d15f41

✔️ openstack-meta-content-provider SUCCESS in 1h 25m 25s
watcher-operator-validation POST_FAILURE in 1h 13m 51s

This change depends on a change that failed to merge.

Change #11 is needed.

@raukadah
Contributor Author

https://logserver.rdoproject.org/9/9/b8858ecba31c26914ae4831406c6ed0cf380d053/github-check/watcher-operator-validation/e56536c/job-output.txt hit POST_FAILURE due to the following error:

TASK [env_op_images : Get all the pods in openstack-operator namespace kind=Pod, namespace={{
2024-11-19 21:33:14.423860 | controller |   ((csv_items | first).metadata.namespace)
2024-11-19 21:33:14.423874 | controller |   if csv_items | length > 0 else omit
2024-11-19 21:33:14.423888 | controller | }}, kubeconfig={{ cifmw_openshift_kubeconfig }}, api_key={{ cifmw_openshift_token | default(omit)}}, context={{ cifmw_openshift_context | default(omit)}}, field_selectors=['status.phase=Running']] ***
2024-11-19 21:33:14.423909 | controller | Tuesday 19 November 2024  21:33:13 -0500 (0:00:00.979)       0:06:26.851 ******
2024-11-19 21:33:14.423936 | controller | ok: [localhost]
2024-11-19 21:33:14.541829 | controller |
2024-11-19 21:33:14.541895 | controller | TASK [env_op_images : Retrieve openstack-operator-index pod cifmw_install_yamls_vars_content={'OPENSTACK_IMG': '{{ selected_pod.status.containerStatuses[0].imageID }}'}] ***
2024-11-19 21:33:14.541904 | controller | Tuesday 19 November 2024  21:33:14 -0500 (0:00:01.392)       0:06:28.244 ******
2024-11-19 21:33:14.541922 | controller | ok: [localhost]
2024-11-19 21:33:15.059069 | controller |
2024-11-19 21:33:15.059158 | controller | TASK [env_op_images : Get operator images and pods cifmw_openstack_operator_images_content={'RABBITMQ_OP_IMG': '{{ selected_pod.status.containerStatuses[0].imageID }}'}, selected_pods={{ pod_list.resources | rejectattr('metadata.generateName', 'contains', 'openstack-operator-index-') | rejectattr('metadata.generateName', 'contains', 'rabbitmq-cluster-operator-') }}] ***
2024-11-19 21:33:15.059176 | controller | Tuesday 19 November 2024  21:33:14 -0500 (0:00:00.118)       0:06:28.363 ******
2024-11-19 21:33:15.059203 | controller | ok: [localhost]
2024-11-20 02:33:15.875150 | controller | ERROR
2024-11-20 02:33:15.875334 | controller | {
2024-11-20 02:33:15.875371 | controller |   "delta": "0:06:30.193468",
2024-11-20 02:33:15.875397 | controller |   "end": "2024-11-19 21:33:15.839534",
2024-11-20 02:33:15.875418 | controller |   "msg": "non-zero return code",
2024-11-20 02:33:15.875442 | controller |   "rc": 2,
2024-11-20 02:33:15.875463 | controller |   "start": "2024-11-19 21:26:45.646066"
2024-11-20 02:33:15.875484 | controller | }
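
The timestamps in this excerpt can be cross-checked with a few lines of Python. This is only a sanity check on the values quoted above: it confirms that `delta` equals `end - start`, and that the console prefix is about five hours ahead of the task timestamps. That gap matches the `-0500` offset shown in the task lines, assuming the console prefix is in UTC.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# "start" and "end" values copied from the error payload above.
start = datetime.strptime("2024-11-19 21:26:45.646066", FMT)
end = datetime.strptime("2024-11-19 21:33:15.839534", FMT)
print(end - start)  # 0:06:30.193468, matches the reported "delta"

# Console prefix of the ERROR line (assumed to be UTC).
console = datetime.strptime("2024-11-20 02:33:15.875150", FMT)
offset_hours = round((console - end).total_seconds() / 3600)
print(offset_hours)  # 5
```

So the failing command ran for about six and a half minutes before exiting with return code 2.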

@raukadah
Contributor Author

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c50bf6c9f2564aff8a76d7abd7f5feac

✔️ openstack-meta-content-provider SUCCESS in 1h 21m 12s
watcher-operator-validation POST_FAILURE in 1h 10m 59s

@raukadah
Contributor Author

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/6cecd21e4fb64c9d91d7278bae91a3e6

✔️ openstack-meta-content-provider SUCCESS in 1h 20m 51s
watcher-operator-validation POST_FAILURE in 1h 10m 00s

@raukadah
Contributor Author

raukadah commented Nov 25, 2024

Still failing for the watcher-operator-index pod: https://softwarefactory-project.io/zuul/t/rdoproject.org/build/71f4b6ecc21d401396c976c63c671e29/console#10/0/6/controller. Here is the list of the other pods:

TASK [env_op_images : Add operator images to the dictionary cifmw_openstack_operator_images_content={{
  cifmw_openstack_operator_images_content |
  combine(
    {
      item.metadata.labels['openstack.org/operator-name'] | upper ~ '_OP_IMG': item.status.containerStatuses[1].imageID
    }
  )
}}] ***
Monday 25 November 2024  04:30:28 -0500 (0:00:00.505)       0:06:09.265 ******* 
ok: [localhost] => (item=barbican-operator-controller-manager-f85c5b57d-dvp7j)
ok: [localhost] => (item=cinder-operator-controller-manager-749bcd5b5c-dfg2s)
ok: [localhost] => (item=designate-operator-controller-manager-748dd946ff-gv8z2)
ok: [localhost] => (item=glance-operator-controller-manager-68f67495dd-t26qw)
ok: [localhost] => (item=heat-operator-controller-manager-5574b6fbb8-96ffv)
ok: [localhost] => (item=horizon-operator-controller-manager-5dfcf764-9zl79)
ok: [localhost] => (item=infra-operator-controller-manager-85984f64f4-mzkrr)
ok: [localhost] => (item=ironic-operator-controller-manager-6f98bdc7fc-vbw6p)
ok: [localhost] => (item=keystone-operator-controller-manager-54c4469879-rrxx8)
ok: [localhost] => (item=manila-operator-controller-manager-8669b45855-p8q74)
ok: [localhost] => (item=mariadb-operator-controller-manager-5d47dff9fb-bl5nb)
ok: [localhost] => (item=neutron-operator-controller-manager-59798496bb-9xs9j)
ok: [localhost] => (item=nova-operator-controller-manager-b48d786f6-cmp6n)
ok: [localhost] => (item=octavia-operator-controller-manager-7657bb976c-tqkg8)
ok: [localhost] => (item=openstack-baremetal-operator-controller-manager-7899c74d5bwjv8g)
ok: [localhost] => (item=openstack-operator-controller-manager-675674d46c-m62p5)
ok: [localhost] => (item=ovn-operator-controller-manager-5d7c9dbcbf-r7w45)
ok: [localhost] => (item=placement-operator-controller-manager-6b4d546564-k74k2)
ok: [localhost] => (item=swift-operator-controller-manager-f779d8f95-hh45j)
ok: [localhost] => (item=telemetry-operator-controller-manager-8d54b4fb-rh7s8)
ok: [localhost] => (item=watcher-operator-controller-manager-5fc97cc4d-wmxsp)
fatal: [localhost]: FAILED! => 
  msg: |-
    The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'openstack.org/operator-name'. 'dict object' has no attribute 'openstack.org/operator-name'
  
    The error appears to be in '/home/zuul/src/github.com/openstack-k8s-operators/ci-framework/roles/env_op_images/tasks/main.yml': line 125, column 7, but may
    be elsewhere in the file depending on the exact syntax problem.
  
    The offending line appears to be:
  
  
        - name: Add operator images to the dictionary
          ^ here
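
The failure can be reproduced outside Ansible with a small Python sketch. It mirrors the task's logic (label lookup, upper-casing, `containerStatuses[1].imageID`) on plain dictionaries. The label value `watcher`, the `generateName` values, and the image IDs are hypothetical placeholders, and skipping pods by `generateName` prefix is an assumption modelled on the `rejectattr` filters visible in the earlier log excerpt.

```python
LABEL = "openstack.org/operator-name"


def build_operator_images(pods, excluded_prefixes=()):
    """Map '<OPERATOR>_OP_IMG' to the image ID of each pod's second container."""
    images = {}
    for pod in pods:
        generate_name = pod["metadata"].get("generateName", "")
        # Skip pods such as index pods that do not carry the operator label.
        if any(prefix in generate_name for prefix in excluded_prefixes):
            continue
        operator = pod["metadata"]["labels"][LABEL]  # KeyError if label is missing
        image_id = pod["status"]["containerStatuses"][1]["imageID"]
        images[operator.upper() + "_OP_IMG"] = image_id
    return images


# Hypothetical pod data: one controller-manager pod and one index pod.
pods = [
    {
        "metadata": {
            "generateName": "watcher-operator-controller-manager-5fc97cc4d-",
            "labels": {LABEL: "watcher"},
        },
        "status": {"containerStatuses": [
            {"imageID": "example.invalid/sidecar@sha256:aaa"},
            {"imageID": "example.invalid/watcher-operator@sha256:bbb"},
        ]},
    },
    {
        "metadata": {"generateName": "watcher-operator-index-", "labels": {}},
        "status": {"containerStatuses": [
            {"imageID": "example.invalid/watcher-operator-index@sha256:ccc"},
        ]},
    },
]

print(build_operator_images(pods, excluded_prefixes=("watcher-operator-index-",)))
```

Calling `build_operator_images(pods)` without the exclusion raises `KeyError` on the index pod, which is the Python analogue of the "no attribute 'openstack.org/operator-name'" error above.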

raukadah added a commit to openstack-k8s-operators/ci-framework that referenced this pull request Nov 25, 2024
watcher-operator is going to be shipped as a standalone operator.
It is going to be installed via OLM separately from the index image in the openstack-operators
namespace.

When env_op_images creates the operator_images dictionary, it goes over all
the pods listed under the openstack-operators namespace and reads the label openstack.org/operator-name,
which does not exist on the watcher-operator-index- pod. It fails with the
following error[1]:
```
The task includes an option with an undefined variable.
The error was: 'dict object' has no attribute 'openstack.org/operator-name'. 'dict object' has no attribute 'openstack.org/operator-name'
```

This PR excludes the watcher-operator-index- pod to fix the issue.

Links:
[1]. openstack-k8s-operators/watcher-operator#9 (comment)

Signed-off-by: Chandan Kumar <[email protected]>
@raukadah
Contributor Author

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/12358ef8df374f29aa8629913ca63fef

✔️ openstack-meta-content-provider SUCCESS in 1h 39m 36s
watcher-operator-validation POST_FAILURE in 1h 17m 12s

raukadah added a commit to openstack-k8s-operators/ci-framework that referenced this pull request Nov 25, 2024
@raukadah
Contributor Author

recheck

@raukadah
Contributor Author

Below are the test results from the job:

Logs of watcher pod deployment

pod/watcher-operator-controller-manager-75575d9968-bqkwp              2/2     Running     0               5m27s
pod/watcher-operator-index-7mgbc                                      1/1     Running     0               7m5s

And operator image used from content provider

Image:         38.102.83.166:5001/openstack-k8s-operators/watcher-operator:f1d23442dc3002eeee95656312b2f0de8cb71e76
    Image ID:      38.102.83.166:5001/openstack-k8s-operators/watcher-operator@sha256:da1b7b30f88311ae47d24001a9487340e45eeb714f06907026662b27e7b26b0d
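
A listing like the one above can also be checked mechanically. The sketch below is a hypothetical helper, not part of the job: it parses `kubectl get pods`-style lines and reports any pod that is not `Running` with all containers ready.

```python
def not_ready(listing):
    """Return names of pods that are not Running with every container ready."""
    failing = []
    for line in listing.strip().splitlines():
        # Columns: NAME, READY (ready/total), STATUS, RESTARTS, AGE
        name, ready, status = line.split()[:3]
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            failing.append(name)
    return failing


listing = """
pod/watcher-operator-controller-manager-75575d9968-bqkwp              2/2     Running     0               5m27s
pod/watcher-operator-index-7mgbc                                      1/1     Running     0               7m5s
"""

print(not_ready(listing))  # []
```

An empty list means both watcher pods were up, which is what the listing shows. This confirms only that the pods started, not that the watcher service itself works.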

@raukadah raukadah marked this pull request as ready for review November 26, 2024 05:36
@openshift-ci openshift-ci bot requested review from abays and SeanMooney November 26, 2024 05:36
This PR:
- Adds the watcher-operator-base job, which inherits from
  podified-multinode-edpm-deployment-crc-2comp. The job deploys a
  two-node EDPM environment and then deploys the watcher operator using
  make targets from the watcher-operator repo.
- Adds a ci-framework hook that deploys the watcher service.

Depends-On: openstack-k8s-operators#8

Signed-off-by: Chandan Kumar <[email protected]>
@raukadah raukadah requested a review from amoralej November 26, 2024 06:40
openshift-merge-bot bot pushed a commit to openstack-k8s-operators/ci-framework that referenced this pull request Nov 26, 2024
@marios marios left a comment

lgtm

minor inline comment for consideration and question about actually validating (but for now we should just get this running and add tempest later?)

- watcher-operator-validation

- job:
    name: watcher-operator-base
If it should never run on its own, you could also mark it abstract, but that is not worth blocking for.

cifmw.general.ci_script:
  output_dir: "{{ cifmw_basedir }}/artifacts"
  chdir: "{{ ansible_user_dir }}/src/github.com/openstack-k8s-operators/watcher-operator"
  script: make watcher_deploy
so with this we install but don't actually validate the services yet. so I guess ultimately the plan is to make run_tempest: true ?

Contributor Author

Yes, correct! We will extend this job to run basic watcher tempest plugin api tests.

openshift-ci bot commented Nov 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marios


@openshift-merge-bot openshift-merge-bot bot merged commit 7ec0729 into openstack-k8s-operators:main Nov 26, 2024
6 checks passed
@amoralej
Contributor

Below are the test results from the job:

* [Deploy Watcher Service hook log](https://logserver.rdoproject.org/9/9/f1d23442dc3002eeee95656312b2f0de8cb71e76/github-check/watcher-operator-validation/c6264a1/controller/ci-framework-data/logs/ci_script_010_run_deploy_watcher.log) which calls `make watcher` and `make watcher_deploy`.
  
  * [logs of `make watcher`](https://logserver.rdoproject.org/9/9/f1d23442dc3002eeee95656312b2f0de8cb71e76/github-check/watcher-operator-validation/c6264a1/controller/ci-framework-data/logs/ci_script_011_install_watcher.log)
  * [logs of `make watcher_deploy`](https://logserver.rdoproject.org/9/9/f1d23442dc3002eeee95656312b2f0de8cb71e76/github-check/watcher-operator-validation/c6264a1/controller/ci-framework-data/logs/ci_script_012_deploy_watcher.log)

Logs of watcher pod deployment

pod/watcher-operator-controller-manager-75575d9968-bqkwp              2/2     Running     0               5m27s
pod/watcher-operator-index-7mgbc                                      1/1     Running     0               7m5s

And operator image used from content provider

Image:         38.102.83.166:5001/openstack-k8s-operators/watcher-operator:f1d23442dc3002eeee95656312b2f0de8cb71e76
    Image ID:      38.102.83.166:5001/openstack-k8s-operators/watcher-operator@sha256:da1b7b30f88311ae47d24001a9487340e45eeb714f06907026662b27e7b26b0d

Thanks for the links and the clear explanation!
