
Support bootc #830

Draft: bshephar wants to merge 11 commits from support-bootc into bootc

Conversation

bshephar
Contributor

This PR adds a number of changes to roles in order to facilitate the use of image mode RHEL.


openshift-ci bot commented Nov 27, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@@ -36,3 +36,4 @@ edpm_nova_compute_config_dir: /var/lib/config-data/ansible-generated/nova_libvir

 # KSM control
 edpm_kernel_enable_ksm: false
+edpm_use_bootc: false
bshephar (Contributor Author):

We probably need a better way to implement this globally. But at least for testing purposes, this is what I've used to get something that deploys.

Contributor:

This is why I added the edpm_bootc role in #813 so that we had a way to do it consistently across anywhere that needs it.

Contributor:

Also, I created a new bootc branch for edpm-ansible: https://github.com/openstack-k8s-operators/edpm-ansible/tree/bootc

Can you propose this PR to the bootc branch instead?

I'll be reverting #813 from main until we are ready to merge all bootc support into main.

bshephar (Contributor Author):

Branch thing is done. But that role would need to be called from each and every playbook to detect and set the bootc variable, right? I guess we can just add it as an ansibleVar and avoid calling the role every time we start a new service.

bshephar (Contributor Author):

So, we can use a custom fact. For example:

cat /etc/ansible/facts.d/bootc.fact
#!/usr/bin/env bash
# Emit "true" if this host is a bootc system, "false" otherwise.
# Ansible exposes the output as the local fact ansible_local.bootc.

is_bootc() {
  # jq -r strips the quotes from the JSON string value
  BOOTC_STATUS=$(sudo bootc status --json | jq -r .status.type)
  if [[ "$BOOTC_STATUS" == "bootcHost" ]]; then
    BOOTC_SYSTEM="true"
  else
    BOOTC_SYSTEM="false"
  fi
}

is_bootc
echo "${BOOTC_SYSTEM}"
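A fact script like this has to be executable and live under /etc/ansible/facts.d; a minimal sketch of tasks that could install it (the src filename here is an assumption, not from this PR):

```yaml
# Sketch: deploy the custom fact so ansible_local.bootc is populated on
# the next fact-gathering pass. The src name is hypothetical.
- name: Create Ansible local facts directory
  ansible.builtin.file:
    path: /etc/ansible/facts.d
    state: directory
    mode: "0755"

- name: Install bootc detection fact
  ansible.builtin.copy:
    src: bootc.fact
    dest: /etc/ansible/facts.d/bootc.fact
    mode: "0755"
```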

This is good from the perspective of not needing the user to manually define that they are using a bootc system, plus it works for our non-bootc systems:

[m3@osp-df-3 bootc]$ ansible -i inv.yaml all -m setup -a "filter=ansible_local"
edpm-compute-1 | SUCCESS => {
    "ansible_facts": {
        "ansible_local": {
            "bootc": true
        },
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false
}

So you could combine them in the same NodeSet if you wanted to. The downside of this approach is that we need to gather facts for each service. We have so far tried to limit the amount of fact gathering required, so this approach may not be what we want without more granular control over which facts are gathered in each service. At the moment, we just define a variable for gather_facts; if that variable is true, then we gather all facts. That becomes necessary if we want to allow individual task executions that require facts, but when we only want local facts, gathering all of them adds non-trivial time to each service execution.

Offering it as a potential solution that we can debate. The alternative is that we require either bootc or non-bootc nodes in each NodeSet.
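For illustration, the setup module's gather_subset could restrict collection to just the local facts, avoiding the full-fact cost described above; a sketch (the exact placement in each service playbook is an assumption):

```yaml
# Sketch: gather only custom local facts (e.g. ansible_local.bootc)
# without paying for full fact collection.
- name: Gather local facts only
  ansible.builtin.setup:
    gather_subset:
      - "!all"
      - "!min"
      - local
```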

  - name: Push script
    ansible.builtin.copy:
-     dest: /usr/local/sbin/containers-tmpwatch
+     dest: /var/lib/openstack/cron/containers-tmpwatch
bshephar (Contributor Author):

/usr is immutable with bootc deployments, so I've proposed doing this in two different ways. First, we bake the scripts into the Containerfile:
https://github.com/openstack-k8s-operators/install_yamls/pull/950/files#diff-f8fb9af5355b45b9ca8936bf0d721c6f0e37e13b637f5598e2be19995dea23e7R45-R46

And second, this method of writing to /var/lib/openstack. I personally prefer doing it this way if we can agree on a common place for any scripts we want to use. That saves us baking things into images and then trying to keep them in sync. Better, in my opinion, to keep them in edpm-ansible for now.
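A single copy task could branch on the bootc flag so both layouts keep working; a hedged sketch (the edpm_use_bootc guard and src name are illustrative assumptions):

```yaml
# Sketch: choose a writable destination on bootc hosts, where /usr is
# immutable, and keep the traditional path otherwise.
- name: Push containers-tmpwatch script
  ansible.builtin.copy:
    src: containers-tmpwatch.sh
    dest: >-
      {{ '/var/lib/openstack/cron/containers-tmpwatch'
         if edpm_use_bootc | bool
         else '/usr/local/sbin/containers-tmpwatch' }}
    mode: "0755"
```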

Comment on lines -36 to -47
ansible.builtin.include_role:
  name: osp.edpm.edpm_container_standalone
vars:
  edpm_container_standalone_service: ovn_controller
  edpm_container_standalone_container_defs:
    ovn_controller: "{{ lookup('template', 'ovn_controller.yaml.j2') | from_yaml }}"
  edpm_container_standalone_kolla_config_files:
    ovn_controller: "{{ lookup('template', 'kolla_ovn_controller.yaml.j2') | from_yaml }}"
bshephar (Contributor Author):

All of this needs to stay in order to support both deployment methodologies. It can just be conditional like:
https://github.com/openstack-k8s-operators/edpm-ansible/pull/830/files#diff-34e3323585e197e806d463771e3b5132716048c41818b1318fecb2c0d8e36cd6R45
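A sketch of that conditional, guarding the standalone-container path with a when clause (the variable comes from the edpm_use_bootc default added in this PR; exact placement is an assumption):

```yaml
# Sketch: keep the podman/kolla standalone path for non-bootc hosts only.
- name: Deploy ovn_controller via standalone containers
  when: not edpm_use_bootc | bool
  ansible.builtin.include_role:
    name: osp.edpm.edpm_container_standalone
  vars:
    edpm_container_standalone_service: ovn_controller
    edpm_container_standalone_container_defs:
      ovn_controller: "{{ lookup('template', 'ovn_controller.yaml.j2') | from_yaml }}"
```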


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3d020a0278384b7f97de3e8e26403819

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 53m 23s
podified-multinode-edpm-deployment-crc FAILURE in 1h 42m 47s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 38m 46s
edpm-ansible-tempest-multinode FAILURE in 1h 48m 18s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 7m 03s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 11s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 40s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 9m 52s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 7m 58s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 8m 37s
edpm-ansible-molecule-edpm_frr FAILURE in 6m 41s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 14s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 35s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 18s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 09s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 6m 04s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 8m 04s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 04s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 41m 16s


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/87ae0cbb50854f54a18570dc772271c1

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 50m 57s
podified-multinode-edpm-deployment-crc FAILURE in 1h 42m 33s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 46m 20s
edpm-ansible-tempest-multinode POST_FAILURE in 1h 42m 59s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 6m 01s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 20s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 44s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 10m 16s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 10m 10s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 11m 10s
edpm-ansible-molecule-edpm_frr FAILURE in 6m 56s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 32s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 57s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 38s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 21s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 6m 10s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 8m 11s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 35s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 38m 42s

@bshephar bshephar force-pushed the support-bootc branch 12 times, most recently from ffd86e6 to 54f7101 Compare December 2, 2024 01:52

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/deff54e047964013a0c7461f18cfe415

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 58m 14s
podified-multinode-edpm-deployment-crc FAILURE in 1h 42m 46s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 37m 26s
edpm-ansible-tempest-multinode FAILURE in 1h 48m 44s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 5m 44s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 07s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 35s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 9m 55s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 9m 21s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 26s
edpm-ansible-molecule-edpm_frr FAILURE in 6m 40s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 16s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 24s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 31s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 12s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 5m 55s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 7m 34s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 06s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 46m 37s

{{
  ovn_controller_pod_spec | combine({
    'spec': {
      'containers': ovn_controller_pod_spec.spec.containers | zip_longest([], [{'image': edpm_ovn_controller_agent_image}]) | map('combine') | list,
slagle (Contributor) commented Dec 6, 2024:

I believe that if we were to customize the image like this then by definition the container is no longer "logically bound". Instead, it would be considered a "floating" container per https://containers.github.io/bootc/logically-bound-images.html#comparison-with-default-podman-systemd-units

and also:

There is no mechanism to inject arbitrary arguments to the podman pull (or equivalent) invocation used by bootc.

which seems to imply that additional mounts or other options passed to podman pull are not possible.

The dynamically-injected ConfigMaps [1][2] may provide some customization, but that is still not likely for the app container image itself, because once that is changed to some other image, it no longer fits how logically bound images should be managed within the lifecycle of the base bootc image itself.

Point being...if we choose to allow the ability to podman run any arbitrary image at runtime, then these really aren't logically bound images at all, but are considered "floating".

The question becomes: should we adopt logically bound images and require our end users to build new bootc images before deploying EDPM nodes, depending on whether they need to customize any of the container images? We could ship a bootc image that had all the images logically bound, but if a user wanted to run a different one (from a partner, etc.), then they would need to rebuild that image.

I do like the quadlet/systemd design, and I think we can still adopt that either way.

[1] https://containers.github.io/bootc/building/guidance.html?highlight=configmap#configuration
[2] containers/bootc#22

@bshephar bshephar force-pushed the support-bootc branch 2 times, most recently from a0569c4 to b590229 Compare December 9, 2024 05:38

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/e201c6428c384df8a92084c2694ba93e

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 55m 48s
podified-multinode-edpm-deployment-crc FAILURE in 1h 41m 35s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 43m 16s
edpm-ansible-tempest-multinode FAILURE in 1h 48m 37s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 6m 54s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 6m 14s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 49s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 7m 01s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 9m 38s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 45s
edpm-ansible-molecule-edpm_frr FAILURE in 7m 02s
edpm-ansible-molecule-edpm_iscsid FAILURE in 4m 35s
edpm-ansible-molecule-edpm_ovn_bgp_agent FAILURE in 6m 50s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 16s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 06s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 5m 59s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 7m 41s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 14s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 39m 26s

slagle (Contributor) commented Dec 10, 2024:

I created a new bootc branch for edpm-ansible: https://github.com/openstack-k8s-operators/edpm-ansible/tree/bootc

Can you propose this PR to the bootc branch instead?

I reverted #813 from main in #844. I think that was the only other bootc-related PR that had merged.

@bshephar bshephar force-pushed the support-bootc branch 4 times, most recently from 1afffaf to 51a8b34 Compare December 11, 2024 00:34
@bshephar bshephar changed the base branch from main to bootc December 11, 2024 01:34
Signed-off-by: Brendan Shephard <[email protected]>

dnf yum-utils

Signed-off-by: Brendan Shephard <[email protected]>

nvme-package

Signed-off-by: Brendan Shephard <[email protected]>
This change writes systemd files to /etc instead of /usr/share,
along with adding support for Python libraries baked into the bootc image.

Signed-off-by: Brendan Shephard <[email protected]>
This change moves the script we're using for the
logs cronjob into the /var/lib/openstack/cron directory. This facilitates
the bootc immutable filesystem where we can't write to /usr, while also
consolidating scripts relevant to our deployment in a common place.

Signed-off-by: Brendan Shephard <[email protected]>
Signed-off-by: Brendan Shephard <[email protected]>
openshift-ci bot commented Dec 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bshephar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Brendan Shephard <[email protected]>
Signed-off-by: Brendan Shephard <[email protected]>
Signed-off-by: Brendan Shephard <[email protected]>
bshephar (Contributor Author) left a comment:

The entire handling of kernel args needs to be reconsidered for bootc:
https://containers.github.io/bootc/building/kernel-arguments.html

Signed-off-by: Brendan Shephard <[email protected]>
steveb (Contributor) commented Dec 13, 2024:

The entire handling of kernel args needs to be reconsidered for bootc: https://containers.github.io/bootc/building/kernel-arguments.html

That kargs.d mechanism looks interesting. I reckon if we're doing anything for the demo, Ansible should write a /usr/lib/bootc/kargs.d/00-edpm.toml file and reboot. Also, we need to ensure the generated qcow2 has the console= arguments removed.
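A hedged sketch of that suggestion, writing the kargs.d TOML fragment via Ansible and rebooting (the specific kernel arguments below are placeholders, not from this PR):

```yaml
# Sketch: drop a kargs.d fragment; bootc applies kargs.d entries on the
# next boot. The karg value here is illustrative only.
- name: Write EDPM kernel arguments for bootc
  ansible.builtin.copy:
    dest: /usr/lib/bootc/kargs.d/00-edpm.toml
    content: |
      kargs = ["iommu=pt"]
  register: _edpm_kargs

- name: Reboot to apply kernel arguments
  ansible.builtin.reboot:
  when: _edpm_kargs is changed
```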

Signed-off-by: Brendan Shephard <[email protected]>
@bshephar bshephar force-pushed the support-bootc branch 2 times, most recently from 5387f86 to 56bf7e4 Compare December 16, 2024 03:24
Signed-off-by: Brendan Shephard <[email protected]>