Nova adoption ffu (no extra cell) #192
d591e29 to 453c6ad
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/b03024da30604a15b1a498c1491770fc ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 1h 36m 39s
1dcded2 to d6baec4
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/473fd3e628504e53ba1dff4e0408cf1f ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 01m 27s
3ba8a12 to 3b256aa
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/375347d7b9ce40e48c615ac4c6d53b02 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 02m 23s
626671d to 01af10f
This has been tested on my env; let's fix CI...
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/a6888a92072c4e3595c31b7cb9da9a14 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 21m 12s
01af10f to eb31157
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/02bd8c038bb94cefa57fee73bcd432a9 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 03m 56s
eb31157 to 180fb86
180fb86 to 67fec82
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/836fbc4237bd4aadbaa4a202b402eb41 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 33m 34s
The adopted compute did not report a status, as the compute version is still wallaby.
I don't know where the nova-compute logs are in CI during the adoption testing. :/ So I only have indirect checks. The execution log of the nova edpm role also looks clean: https://logserver.rdoproject.org/92/192/67fec82340574efbfb56d234e7bf3a680888939f/github-check/data-plane-adoption-github-rdo-centos-9-extracted-crc/df8d241/controller/ci-framework-data/logs/quay-io-openstack-k8s-operators-openstack-must-gather-sha256-0812f031e363406238d47a3ba3cfb33412e9a2d143d2b4b9365c9796b80bb8aa/namespaces/openstack/pods/nova.compute.extraconfig-openstack-tgdjr/logs/openstackansibleee.log Without the compute logs it is hard to tell why the compute does not come up. :/
recheck
@gibizer this is what we have for the logs so far:
In the case of a greenfield job, ci-framework collects a bunch of logs from the computes. It would be good to reuse that logic somehow here in the adoption jobs.
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/a67f04155be7487d83ab6966f60165db ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 38m 16s
recheck PS2
recheck
recheck PS2
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/19bff67041a54aec92db93b6edba241f ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 40m 51s
recheck ps3
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/1f6eba7722c64b7d938462bd36b211b9 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 41m 21s
recheck ps5
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/c5afcfcf9d6f4683883dc742d95000c2 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 16m 59s
8083b4a to 93526b8
The post-CI-run log extraction looks broken here: https://github.com/openstack-k8s-operators/ci-framework/blob/main/roles/artifacts/tasks/edpm.yml#L44 @cjeanner fyi
This change depends on a change that failed to merge. Change https://review.rdoproject.org/r/c/rdo-jobs/+/50917 is needed.
recheck
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/47c5b94f603744e782dc849df3b27702 ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 05m 01s
recheck
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/ad32854f5dac47dda1d1c1f197a2485a ❌ data-plane-adoption-github-rdo-centos-9-extracted-crc FAILURE in 2h 26m 58s
Failed due to known issues with the isolnet configs (no rabbitmq connectivity): https://logserver.rdoproject.org/92/192/93526b893a7944fe8af2e158446cae8477dac1c6/github-check/data-plane-adoption-github-rdo-centos-9-extracted-crc/db7b697/standalone/containers/nova/nova-compute.log
recheck
We saw https://review.rdoproject.org/r/c/rdo-jobs/+/50917 make this PR pass CI.
Update EDPM adoption docs and tests to execute Nova compute post-FFU. For that, deploy an additional nova-compute-ffu EDPM service and patch openstack control plane CR for nova services. Because of the different lifecycle management tooling used for both actions, orchestrate FFU w/o a lock-step between nova compute EDPM and podified control plane services. Signed-off-by: Bohdan Dobrelia <[email protected]>
93526b8 to 1ff183a
This change depends on a change that failed to merge. Change https://review.rdoproject.org/r/c/rdo-jobs/+/50917 is needed.
recheck
rebased the dependency
{{ mariadb_copy_shell_vars }}
oc rsh mariadb-openstack-cell1 mysql --user=root --password=${PODIFIED_DB_ROOT_PASSWORD} \
-e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute';"
register: records_check_results
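To make the intent of the query above easier to see, here is a minimal sketch that runs the same self-join against an in-memory SQLite table instead of the MariaDB cell database. The table layout, host names, and version integers are made up for illustration; only the query text (minus the `nova_cell1.` schema prefix) comes from the task. The query returns a row whenever a nova-compute record's version differs from the version of some other record in the services table, so an empty result means no nova-compute record disagrees with any other service record.

```python
import sqlite3

# Query text from the task above, with the nova_cell1. schema prefix dropped
# so that it can run against a throwaway SQLite table.
QUERY = (
    "select a.version from services a join services b "
    "where a.version!=b.version and a.binary='nova-compute';"
)


def mismatched_versions(rows):
    """Return the versions the query reports for the given service rows.

    rows: iterable of (host, binary, version) tuples. This three-column
    layout is a simplified, hypothetical stand-in for the real services table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'create table services (host text, "binary" text, version integer)'
    )
    conn.executemany("insert into services values (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical data: one compute still on an older service version.
    rows = [
        ("compute-0", "nova-compute", 54),
        ("conductor-0", "nova-conductor", 66),
    ]
    print(mismatched_versions(rows))
```

Note that the join has no filter on `b.binary`, so a compute is compared against every service record in the cell, not only against other computes.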
2023-12-04 10:32:38.433 1 DEBUG oslo_db.sqlalchemy.engines [None req-76342adf-c98e-4a74-91f9-71e7e86e87f2 - - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION _check_effective_sql_mode /usr/lib/python3.9/site-packages/oslo_db/sqlalchemy/engines.py:335
2023-12-04 10:32:38.434 1 ERROR nova.context [None req-76342adf-c98e-4a74-91f9-71e7e86e87f2 - - - - - -] Error gathering result from cell 292fe7d7-f10c-4546-876d-753875e67b77: sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1045, "Access denied for user 'nova_cell1'@'192.168.122.100' (using password: YES)")
The conductor log shows that the issue occurs earlier than the compute adoption. So the new k8s control plane has a wrong or incomplete DB setup, as the conductor cannot talk to its DB. I wonder how the db sync on that same DB ran successfully.
The Nova CR status is Ready, so there was a successful db sync run.
I checked the logs and dumps, but I don't see why the cell1 conductor cannot connect to the DB. Unfortunately, all the passwords are masked in must-gather, so I cannot check those. @marios if you have a held node with this issue, then I can check the creds there.
@gibizer thanks for having a look. Sorry, we were trying to get to a solution and missed your comments. In the end it was a DNS issue, resolved with https://github.com/openstack-k8s-operators/data-plane-adoption/pull/218/files
There is a green run there if you want to poke at the logs: https://review.rdoproject.org/zuul/build/87df5976f8814ea9a319eea1caececb2/artifacts
The nova-compute logs from the green run look good to me!
Extracted the Nova services FFU step from #176.
Update EDPM adoption docs and tests to execute Nova compute post-FFU.
For that, deploy an additional nova-compute-ffu EDPM service and
patch openstack control plane CR for nova services.
Because of the different lifecycle management tooling used for both
actions, orchestrate FFU w/o a lock-step between nova compute EDPM
and podified control plane services.
Jira: OSPRH-338
Depends-on: https://review.rdoproject.org/r/c/rdo-jobs/+/50917