
Add role to deploy 17.1 env for adoption #2297

Merged — 2 commits merged into main on Oct 14, 2024

Conversation

@cescgina (Contributor) commented Sep 3, 2024

Add a new role that deploys a TripleO environment to serve as the
source for adoption. This role is expected to consume the infra created
by [1], and a 17.1 scenario definition from the data-plane-adoption
repo, introduced by [2].

It also introduces a small fix to deploy-ocp.yml so the resulting OCP
cluster is ready (the nodes needed to be uncordoned).

[1] #2285
[2] openstack-k8s-operators/data-plane-adoption#597

github-actions bot commented Sep 3, 2024

Thanks for the PR! ❤️
I'm marking it as a draft; once you're happy with it and the PR is passing CI, click the "Ready for review" button below.

@github-actions github-actions bot marked this pull request as draft September 3, 2024 13:31
@cjeanner (Collaborator) left a comment:

Would need some more flexibility, but it could match the need with some tweaks/iterations.

Resolved review threads (outdated):
- roles/adoption_osp_deploy/files/undercloud.conf
- roles/adoption_osp_deploy/tasks/deploy_ceph.yml (2)
- roles/adoption_osp_deploy/tasks/overcloud_deploy.yml
- roles/adoption_osp_deploy/tasks/main.yml (2)

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3e0801ce95b34d3184964958b146d59f

✔️ openstack-k8s-operators-content-provider SUCCESS in 10h 11m 04s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 14m 43s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 25m 40s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 23s
cifmw-pod-pre-commit FAILURE in 5m 39s
✔️ build-push-container-cifmw-client SUCCESS in 37m 19s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f96d929a98b34e69a20a3813f6e9f506

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 14m 09s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 12m 43s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 28m 44s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 44s
cifmw-pod-pre-commit FAILURE in 5m 37s
✔️ build-push-container-cifmw-client SUCCESS in 37m 04s

@cescgina cescgina force-pushed the adoption_osp_role branch 2 times, most recently from ae14a9b to cd97aff Compare September 5, 2024 15:57

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/179a53575c8e437a909be2c41711f18c

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 33m 15s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 14m 26s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 23m 39s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 39s
cifmw-pod-pre-commit FAILURE in 5m 40s
✔️ build-push-container-cifmw-client SUCCESS in 37m 40s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/100b262aeb704c85b86c6589151688bf

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 46m 28s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 18m 33s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 26m 06s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 44s
cifmw-pod-pre-commit FAILURE in 5m 25s
✔️ build-push-container-cifmw-client SUCCESS in 37m 00s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/11d5100f2b374d3d875c5cd958cdfef7

✔️ openstack-k8s-operators-content-provider SUCCESS in 34m 34s
podified-multinode-edpm-deployment-crc FAILURE in 17m 05s
cifmw-crc-podified-edpm-baremetal FAILURE in 21m 52s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 33s
cifmw-pod-pre-commit FAILURE in 6m 20s
✔️ build-push-container-cifmw-client SUCCESS in 22m 29s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c8f71eecadfe40639c4e22942a5fb6c5

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 43m 57s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 14m 39s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 31m 01s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 50s
cifmw-pod-pre-commit FAILURE in 5m 37s
✔️ build-push-container-cifmw-client SUCCESS in 20m 46s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/c22a1ed7df85440d83b1bb6fc81d2aa0

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 40m 24s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 16m 26s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 25m 59s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 17s
cifmw-pod-pre-commit FAILURE in 7m 40s
✔️ build-push-container-cifmw-client SUCCESS in 30m 23s

Resolved review threads (outdated):
- deploy-osp-adoption.yml (2)

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2c06710e48894922925227767bc3a40b

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 02m 30s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 17m 43s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 19m 04s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 37s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 27s
build-push-container-cifmw-client TIMED_OUT in 1h 30m 37s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/782d8ffaf82c413cab42bc91c2a5128d

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 50m 00s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 16m 23s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 29m 49s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 39s
cifmw-pod-pre-commit FAILURE in 7m 11s
✔️ build-push-container-cifmw-client SUCCESS in 31m 20s

@cescgina cescgina force-pushed the adoption_osp_role branch 2 times, most recently from e99f3b5 to 74423c1 Compare September 26, 2024 15:58
@lewisdenny lewisdenny requested a review from pablintino October 9, 2024 06:13
@cescgina (Contributor Author) commented Oct 9, 2024

@lewisdenny thanks for the extensive review. I've updated the PR with some changes; please take a look when you have some time.

@pablintino (Collaborator) left a comment:

I did my best to review this one (not finished yet), but at this size it's almost impossible to do a proper review.
Most of my comments are cosmetic, but I'm writing them down mainly because I'm sure that no one will have time to fix them once this PR is merged.

loop_var: "_inventory_file"

- name: Gather ansible_user_id from controller for hooks
ansible.builtin.setup:
Collaborator:

The task name Gather ansible_user_id from controller for hooks is misleading. ansible.builtin.setup targets the playbook host, as it's not delegated, meaning it will collect the facts you ask for from localhost only. If you want to collect the facts from the controller, you need to delegate the task to that host.
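To illustrate the reviewer's point, here is a minimal hedged sketch of delegated fact gathering; the host name controller-0 and the fact filter are hypothetical, not taken from this PR:

```yaml
# Sketch only: "controller-0" is a hypothetical inventory hostname.
# Without delegate_to, setup gathers facts from the play's target host;
# with it, the facts come from the delegated host instead.
- name: Gather ansible_user_id from the controller
  delegate_to: controller-0
  delegate_facts: true          # store the facts under the delegated host
  ansible.builtin.setup:
    filter: ansible_user_id     # limit collection to the fact we need
```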

Contributor Author:

Fixed the task name; it was doing the right thing, but the description was wrong.

@bogdando (Contributor) commented Oct 15, 2024:

I am late to the party, though my question is likely related to the CI framework design, not this implementation, which follows it. Why do we need to make Ansible parse other Ansible/t-h-t objects from YAML files, instead of using direct inputs?

cifmw_adoption_osp_deploy_scenario:
{{ cifmw_adoption_osp_deploy_scenario | to_nice_yaml(indent=2) | indent(width=2) }}
_vm_groups:
{{ _vm_groups | to_nice_yaml(indent=2) | indent(width=2) }}
Collaborator:

You won't need to deal with indentation if the file is written the proper way:

  - name: Save variables for use with hooks
    vars:
      __content:
        cifmw_adoption_osp_deploy_ntp_server: "{{ cifmw_adoption_osp_deploy_ntp_server }}"
        cifmw_adoption_source_scenario_path: "{{ cifmw_adoption_source_scenario_path }}"
        cifmw_adoption_osp_deploy_scenario: "{{ cifmw_adoption_osp_deploy_scenario }}"
        _vm_groups: "{{ _vm_groups }}"
    ansible.builtin.copy:
      dest: "{{ cifmw_basedir }}/artifacts/parameters/adoption_osp.yml"
      content: "{{ __content | to_nice_yaml }}"

Contributor Author:

thanks, that is a great improvement!

_stack.network_data_file
] | path_join
}}
_network_data_file_dest: "{{ ansible_user_dir }}/network_data_{{ _overcloud_name }}.yaml"
Collaborator:

In the previous line we use path_join. I'd prefer to be consistent, especially in code that belongs to the same PR.
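As a hedged illustration of the consistency point (variable names as in the excerpt), the destination on this line could be built with the same path_join filter used on the previous line:

```yaml
# Sketch only: equivalent to
# "{{ ansible_user_dir }}/network_data_{{ _overcloud_name }}.yaml"
- name: Build the destination path with path_join for consistency
  ansible.builtin.set_fact:
    _network_data_file_dest: >-
      {{
        [ansible_user_dir,
         'network_data_' ~ _overcloud_name ~ '.yaml'
        ] | path_join
      }}
```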

Contributor Author:

done, thanks

_path_inventories: "{{ ansible_user_dir }}/ci-framework-data/reproducer-inventory"
ansible.builtin.slurp:
path: "{{ _path_inventories }}/{{ _group_name }}-group.yml"
register: _files
Collaborator:

Meaningless variable name

Contributor Author:

thanks, I made the name more explicit

vars:
_content: "{{ _inventory_file.content | b64decode | from_yaml }}"
_group_name: "{{ _inventory_file.source | basename | regex_replace('-group.yml', '') }}"
_nodes: "{{ _content[_group_name~'s']['hosts'].keys() }}"
Collaborator:

Assuming the groups will always be named _group_name + 's' looks fragile. One small change in the reproducer inventory dump and we would miss that adoption is making this assumption.

Contributor Author:

Indeed, that is error-prone, and it turns out the assumption is not needed; I've changed the code to simply read the group name from the file.
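A hedged sketch of what that fix could look like (variable names follow the excerpt; the single-top-level-group layout is an assumption about the inventory dump, not confirmed by this PR):

```yaml
# Sketch only: read the group name from the slurped inventory content itself
# instead of deriving it from the file name plus an 's' suffix.
- name: Extract nodes without guessing the group name
  vars:
    _content: "{{ _inventory_file.content | b64decode | from_yaml }}"
    # Assumption: the dump has a single top-level group key.
    _group_name: "{{ _content.keys() | first }}"
    _nodes: "{{ _content[_group_name]['hosts'].keys() }}"
  ansible.builtin.set_fact:
    _group_nodes: "{{ _nodes }}"
```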

block:
- name: Copy ceph osd file
delegate_to: "osp-undercloud-0"
ansible.builtin.copy:
Collaborator:

File manipulations without explicitly setting the mode will cause linter issues. IIRC it's one of the rules we disabled, so no problem for now, but at some point someone will need to touch this freshly added code to make it compliant.

Contributor Author:

Good catch, I hadn't thought about this; I think I've added the mode to all the tasks now.
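For reference, a minimal sketch of the fix being discussed (the src/dest file names are hypothetical); an explicit mode satisfies ansible-lint's risky-file-permissions rule:

```yaml
- name: Copy ceph osd file
  delegate_to: "osp-undercloud-0"
  ansible.builtin.copy:
    src: ceph_osd_spec.yaml                       # hypothetical source file
    dest: "{{ ansible_user_dir }}/ceph_osd_spec.yaml"
    mode: "0644"                                  # explicit mode keeps the linter happy
```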

_config_download: >
{{
_original_config_download['content'] |
b64decode | from_yaml
}}
_new_config_download_fields: {}
_ctlplane_net: "{{ cifmw_networking_env_definition.networks.ctlplane }}"
Collaborator:

All the *download* vars are not used until the very end of the block, in one single task. Care to move them there? I was searching all over the file for where they are used...

Contributor Author:

yep, moved them all to the end

Comment on lines 52 to 71
- name: Gather overcloud tripleo nodes
when: group.key is in _hostname_map_translation
ansible.builtin.set_fact:
_tripleo_nodes: >-
{{
_tripleo_nodes | default([]) +
group.value
}}
loop: "{{ _vm_groups | dict2items }}"
loop_control:
loop_var: group
label: "{{ group.key }}"

- name: Generate Hostnamemap field
when: group.key is in _hostname_map_translation
vars:
_tripleo_name: "{{ _hostname_map_translation[group.key] }}"
_group_nodes: >-
{%- set hosts = {} -%}
{%- for node in group.value -%}
{%- set key = _tripleo_name ~ '-' ~ loop.index0 | string -%}
{%- set _ = hosts.update({key: node}) -%}
{%- endfor -%}
{{ hosts }}
ansible.builtin.set_fact:
_hostname_map: >-
{{
_hostname_map | default({}) |
combine(_group_nodes)
}}
loop: "{{ _vm_groups | dict2items }}"
loop_control:
loop_var: group
label: "{{ group.key }}"
Collaborator:

Both loops can perfectly well be joined, and the second one doesn't depend on the computation done in the first one.

Contributor Author:

indeed, thanks, I've merged them
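A hedged sketch of what the merged loop could look like (same variables as the excerpt above): since neither fact depends on the other's result, both set_fact keys can be computed in a single pass over _vm_groups.

```yaml
# Sketch only: merges the two loops from the excerpt into one task.
- name: Gather overcloud tripleo nodes and generate the Hostnamemap field
  when: group.key is in _hostname_map_translation
  vars:
    _tripleo_name: "{{ _hostname_map_translation[group.key] }}"
    _group_nodes: >-
      {%- set hosts = {} -%}
      {%- for node in group.value -%}
      {%- set key = _tripleo_name ~ '-' ~ loop.index0 | string -%}
      {%- set _ = hosts.update({key: node}) -%}
      {%- endfor -%}
      {{ hosts }}
  ansible.builtin.set_fact:
    _tripleo_nodes: "{{ _tripleo_nodes | default([]) + group.value }}"
    _hostname_map: "{{ _hostname_map | default({}) | combine(_group_nodes) }}"
  loop: "{{ _vm_groups | dict2items }}"
  loop_control:
    loop_var: group
    label: "{{ group.key }}"
```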

args:
apply:
delegate_to: "osp-undercloud-0"

Collaborator:

Almost all the following tasks are delegated. Why not use a block to improve readability?

Contributor Author:

done, thanks!
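A hedged sketch of the suggestion: delegate_to set on a block is inherited by every task inside it, so the per-task delegation can be dropped (the task names and file names here are hypothetical):

```yaml
- name: Run the undercloud-side tasks
  delegate_to: "osp-undercloud-0"   # inherited by all tasks in the block
  block:
    - name: Copy ceph osd file
      ansible.builtin.copy:
        src: ceph_osd_spec.yaml                   # hypothetical
        dest: "{{ ansible_user_dir }}/ceph_osd_spec.yaml"
        mode: "0644"
    - name: Check which host the tasks actually run on
      ansible.builtin.command: hostname
      changed_when: false
```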

@jistr (Contributor) commented Oct 11, 2024

There are already detailed reviews, so I just did a higher-level "let me grasp what's going on" review.

I just wondered whether we'll want to support deployment without subscription-manager (with rhos-release), but that's yet another potential enhancement that should probably not be included in this PR anymore :)

Thank you very much for the work on this. LGTM 👍

@lewisdenny lewisdenny self-requested a review October 13, 2024 22:51
@lewisdenny (Collaborator):

I will take another pass after @pablintino review is addressed

Add the following improvements from reviews:
- Allow network_data.yaml to be a template
- Build inventory for ceph migration in tripleo-ansible-inventory
- Generate adoption_vars to use with the adoption_tests
- Allow users to pass an overcloud_deploy command
- Support deploying multiple stacks
- Support not passing cifmw_target_host
- Add validation to scenario input
- Add role var for the 17.1 scenario
@cescgina (Contributor Author):

@pablintino thanks for the review, I think I've addressed all your comments. @lewisdenny I've updated the PR with the changes after testing locally; please take a look when you have some time.

@pablintino (Collaborator):

/approve

openshift-ci bot (Contributor) commented Oct 14, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@frenzyfriday (Contributor):

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Oct 14, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 9d26487 into main Oct 14, 2024
5 checks passed
@openshift-merge-bot openshift-merge-bot bot deleted the adoption_osp_role branch October 14, 2024 14:03
Projects: None yet
10 participants