Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 1.6.0 #328

Merged
merged 4 commits into from
Sep 2, 2024
Merged

Release 1.6.0 #328

merged 4 commits into from
Sep 2, 2024

Conversation

justinc1
Copy link
Collaborator

@justinc1 justinc1 commented Aug 26, 2024

@justinc1 justinc1 requested a review from shoriminimoe August 26, 2024 07:31
@shoriminimoe
Copy link
Collaborator

I reviewed the integration tests for this PR.

  • The integ-run / ansible-test (time_server) test failed for all clusters due to a 400 error:

    fatal: [testhost]: FAILED! => {"changed": false, "msg": "Unexpected response - 400 b'{\"error\":\"10.5.11.5 is not a valid NTP server\"}'"}
    

    This appears to be a test setup problem. Maybe the target NTP server is having problems?

  • The integ-run / ansible-test (inventory) test failed for all clusters due to this assertion error:

    fatal: [localhost]: FAILED! => {
      "assertion": "{{ 'ci-inventory-vm0' in hostvars }}",
      "changed": false,
      "evaluated_to": false,
      "msg": "Assertion failed"
    }
    

    That appears to be a feature problem. Has the inventory feature been fully implemented?

  • The integ-run / ansible-test (git_issue_41) test failed for all clusters due to what appears to be a templating issue:

    Warning: : conditional statements should not include jinja2 templating
    delimiters such as {{ }} or {% %}. Found: "{{
    vm_created.record.0.disks.2.iso_name }}" == "cloud-init-{{
    (vm_created.record.0.uuid | split("-")) [0] }}.iso"
    fatal: [testhost]: FAILED! => {"msg": "The conditional check '\"{{ vm_created.record.0.disks.2.iso_name }}\" == \"cloud-init-{{ (vm_created.record.0.uuid | split(\"-\")) [0] }}.iso\"' failed. The error was: Conditional is marked as unsafe, and cannot be evaluated."}
    
  • The integ-seq-run / ansible-test (utils_login) failed for all clusters due to an auth failure:

    TASK [utils_login : Get HC3 ping - OIDC user, from environ] ********************
    fatal: [testhost]: FAILED! => {"changed": false, "msg": "Failed to authenticate with the instance: 401 Unauthorized"}
    

    This looks like a bug to me. Could be in either the collection or HyperCore itself. It could also be a test problem with the OIDC server and/or the test environment.

  • I'm not sure what is causing the failure for OIDC config on 10.5.11.201. The run looks like it recovered after retries, but ultimately resulted in a failure.

  • There was a parsing error in the workflow file for the run. Despite the error, the tests still ran. I filed 🐞 CI Bug: Parsing error in integ-test workflow #329 for this.

@justinc1 can you confirm these findings?

@shoriminimoe
Copy link
Collaborator

@justinc1 I replaced the 9.4.0 VSNS with the new 9.4.17 VSNS in da6c864. That commit is built on top of this PR. I believe it captures all of the places that need it. I also have an integration test for that version: https://github.com/ScaleComputing/HyperCoreAnsibleCollection/actions/runs/10582507411.

Do you want to cherry-pick it here or should we submit it in a separate PR?

@justinc1
Copy link
Collaborator Author

justinc1 commented Aug 27, 2024

10.5.11.5 is not a valid NTP server

This was IP of github CI runner VM; we reuse it also to run Ci NTP server. The runner had problems, during debugging I rebooted it, and it got new DHCP IP. The problem runner had - it was running out of disk "space" (but actually all inodes were used, not space). For now, I will just change the runner VM to static IP address 10.5.11.5. Maybe I need to ask Alex to reserve this (or some other IP).

Fix #331

integ-run / ansible-test (inventory)

This worked before. I did update docker image we use for testing after 1.5.0, so we test with latest ansible-core, ansible-lint etc. Looks like we need to update our code (or, ansible-core might have regression).

Fix - wanted to do this in #334, but I'm not able to reproduce. I even run ansible-test integration from quay.io/justinc1_github/scale_ci_integ:10 docker container, to use exactly the same binaries. But I got only jinja2 warnings. Maybe some other test removed VMs that inventory test expects?

integ-run / ansible-test (git_issue_41)

'\"{{ vm_created.record.0.disks.2.iso_name }}\" == \"cloud-init-{{ (vm_created.record.0.uuid | split(\"-\")) [0] }}.iso\"' needs to be something like vm_created.record.0.disks.2.iso_name == "cloud-init-" + (vm_created.record.0.uuid | split("-") | first) + ".iso"

Fix #332

integ-seq-run / ansible-test (utils_login)

Maybe password expired; needs to be changed in Azure.

Fix #331, we only needed to update password for test user.

I'm not sure what is causing the failure for OIDC config on 10.5.11.201

During OIDC reconfiguration HC3 API (frequently) returns 500 error, collection has extra code to retry in this case - this warning is show https://github.com/ScaleComputing/HyperCoreAnsibleCollection/actions/runs/10554972693/job/29264590244#step:7:93. To make sure this works, the integ test runs 10 reconfigurations in loop, to check if there is some other error.

This time it failed as API returned HTTP text instead of json - https://github.com/ScaleComputing/HyperCoreAnsibleCollection/actions/runs/10554972693/job/29264590244#step:7:105. I think the line updated_oidc = Oidc.get(rest_client) needs to be moved into try/except.

Fix #333

There was a parsing error in the workflow file for the run - #329

Yes, this is my fault :/ After some CI workflow refactoring I noticed to late this error/warning. I will try something replace test_name: ${{ fromJson(inputs.test_names) }} with test_name: ${{ fromJson(inputs.test_names or "[]") }}.

Fix TODO

replaced the 9.4.0 VSNS with the new 9.4.17 VSNS in da6c864. Do you want to cherry-pick it here or should we submit it in a separate PR?

Fix by Sam, #330

A separate PR. I will also have a few small PRs, for the errors above. And thank you @shoriminimoe .

@shoriminimoe
Copy link
Collaborator

During OIDC reconfiguration HC3 API (frequently) returns 500 error

I will make sure we have a tracking item for this internally. The API should be behaving that poorly.


I created #330 to merge the CI changes. Unless it needs changes, feel free to merge it if I'm offline.

Signed-off-by: Justin Cinkelj <[email protected]>
Signed-off-by: Justin Cinkelj <[email protected]>
@justinc1 justinc1 merged commit 9004abb into main Sep 2, 2024
4 checks passed
@justinc1 justinc1 deleted the release-1.6.0 branch September 2, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants