feat: export corosync configuration #231

Merged
merged 8 commits into from
Dec 5, 2024

tomjelinek
Member

Enhancement:
Provide an ha_cluster_info module to export the current cluster configuration. This PR implements the first stage, exporting the corosync configuration. Other parts of the configuration will follow in other PRs.

Reason:
This is the first step in implementing an info module which exports the cluster configuration as a variables structure in the same format that the ha_cluster role accepts.

Result:
The ha_cluster_info module exports the corosync configuration, which can be used to recreate the same corosync cluster when passed to the role.

Issue Tracker Tickets (Jira or BZ if any):
https://issues.redhat.com/browse/RHEL-46219

@tomjelinek tomjelinek requested a review from richm as a code owner October 8, 2024 11:12
@tomjelinek tomjelinek changed the title Export corosync feat: export corosync configuration Oct 8, 2024
@tomjelinek
Member Author

[citest]

@tomjelinek
Member Author

tomjelinek commented Oct 8, 2024

I updated the pcs_version vs ubuntu version matrix, as pcs main no longer builds on ubuntu-22.04. And then Python Unit Tests / python (ubuntu-24.04, main) fails when trying to upgrade pip:

+ python -m pip install --upgrade pip
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.
    
    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.
    
    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.
    
    See /usr/share/doc/python3.12/README.venv for more information.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.

I suppose upgrading pip could be removed. But I'm afraid that would solve nothing, as the next command is pip install "git+https://github.com/linux-system-roles/[email protected]" and that would probably fail with the same message. So what's the process of installing tox-lsr on ubuntu-24.04? Should I just add --break-system-packages as the message suggests?
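
One possible answer to the question above, sketched under assumptions: run the test tooling from a virtual environment, as the PEP 668 error message itself suggests. The steps below are hypothetical GitHub Actions workflow steps, not this repository's actual workflow; the step names and the venv location are illustrative, and the sketch assumes the runner can create virtual environments with `python3 -m venv`.

```yaml
# Hypothetical workflow steps (assumption: a GitHub-hosted ubuntu-24.04 runner
# where "python3 -m venv" works out of the box).
- name: Create a virtual environment for the test tooling
  run: |
    python3 -m venv "$RUNNER_TEMP/venv"
    # Prepend the venv to PATH for all later steps in this job.
    echo "$RUNNER_TEMP/venv/bin" >> "$GITHUB_PATH"

- name: Upgrade pip inside the virtual environment
  run: |
    # "python" now resolves to the venv interpreter, so the
    # externally-managed-environment check no longer applies.
    python -m pip install --upgrade pip
    # Install tox-lsr here with the same pip command the workflow already uses.
```

Compared with passing `--break-system-packages`, this leaves the distribution's Python installation untouched, at the cost of one extra setup step.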

@tomjelinek
Member Author

ansible_test fails with this error:

Running sanity test "ansible-doc"
Run command: ansible-doc -t module fedora.linux_system_roles.ha_cluster_info fedora.linux_system_roles.pcs_api_v2 fedora.linux_system_roles.pcs_qdevice_certs
ERROR: Output on stderr from ansible-doc is considered an error.

Command "ansible-doc -t module fedora.linux_system_roles.ha_cluster_info fedora.linux_system_roles.pcs_api_v2 fedora.linux_system_roles.pcs_qdevice_certs" returned exit status 0.
>>> Standard Error
Warning: : Collection fedora.linux_system_roles does not support Ansible
version 2.14.17.post0

Any idea what this means and how to fix it?

@tomjelinek
Member Author

CentOS-Stream-8|ansible-2.9 fails with "Could not detect a supported package manager from the following list: ['pkg', 'apt', 'rpm', 'portage'], or the required Python library is not installed. Check warnings for details." I think we went over this already, and the resolution was that this was an incompatibility between CentOS 8 and Ansible.

I'm not sure why the other CentOS and Fedora tests are marked as failures, when all their logs are success.

@spetrosi
Collaborator

[citest]

@spetrosi
Collaborator

CentOS-Stream-8|ansible-2.9 fails with "Could not detect a supported package manager from the following list: ['pkg', 'apt', 'rpm', 'portage'], or the required Python library is not installed. Check warnings for details." I think we went over this already, and the resolution was that this was an incompatibility between CentOS 8 and Ansible.

Looking into this, I think it used to work, idk what broke it.

I'm not sure why the other CentOS and Fedora tests are marked as failures, when all their logs are success.

Fixed in linux-system-roles/tft-tests#53. The tests passed, but some tasks were still running in the background after the testing phase finished, which caused the test plan to fail. Now it's passing.

@spetrosi
Collaborator

Fixing issue with ansible-2.9 on CS8 in linux-system-roles/tft-tests#54

@spetrosi
Collaborator

[citest]

README.md Outdated
```yaml
- name: Get current cluster configuration
  linux-system-roles.ha_cluster.ha_cluster_info:
  register: ha_cluster_info_result
```

Contributor

System roles, by convention, do not support users using modules directly. In every other role that does something like this, users use the role with either no arguments like https://github.com/linux-system-roles/firewall?tab=readme-ov-file#gathering-firewall-ansible-facts:

- name: Get current cluster configuration
  include_role:
    name: linux-system-roles.ha_cluster

or with some special variable

- name: Get current cluster configuration
  include_role:
    name: linux-system-roles.ha_cluster
  vars:
    ha_cluster_get_info: true

I think the ha_cluster role will have to do something like the latter, since there are numerous public api variables, as opposed to the firewall role which just has the one main firewall variable. The latter also makes it possible for the role to

  • set the state of the cluster and return the cluster configuration
  • return a subset of the information

Contributor

The role would then set a global variable e.g. ha_cluster_info that users would use. This return variable will be declared in the README.md in the section Variables Exported by the Role e.g. https://github.com/linux-system-roles/kernel_settings?tab=readme-ov-file#variables-exported-by-the-role
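
A minimal sketch of how a playbook might consume such a variable. The names `ha_cluster_get_info` and `ha_cluster_info` are the ones proposed at this point in the thread, not necessarily the final ones, and other role variables are omitted.

```yaml
# Sketch only: variable names follow the proposal in this review thread
# and may differ from what the role finally implements.
- name: Get current cluster configuration
  include_role:
    name: linux-system-roles.ha_cluster
  vars:
    ha_cluster_get_info: true

- name: Show the configuration exported by the role
  debug:
    var: ha_cluster_info
```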

Contributor

ping

Collaborator

The bootloader and snapshot roles export info in a <rolename>_facts variable. Let's be consistent with this naming.

Member Author

Sorry, I've been busy with other projects.

I wasn't sure what would be the correct way to expose the export functionality. So I'm glad you pointed me in the right direction. I'm going to implement this change, hopefully in a couple of weeks, once I finish tasks that require my immediate attention.

Member Author

I pushed new commits which implement this requested change.


codecov bot commented Nov 28, 2024

Codecov Report

Attention: Patch coverage is 89.44444% with 19 lines in your changes missing coverage. Please review.

Project coverage is 78.94%. Comparing base (fd00915) to head (4be7b3c).
Report is 13 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| library/ha_cluster_info.py | 64.00% | 18 Missing ⚠️ |
| module_utils/ha_cluster_lsr/info/loader.py | 98.46% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #231       +/-   ##
===========================================
+ Coverage   68.50%   78.94%   +10.43%     
===========================================
  Files           3        6        +3     
  Lines         181      361      +180     
===========================================
+ Hits          124      285      +161     
- Misses         57       76       +19     


@tomjelinek
Member Author

[citest]

@tomjelinek tomjelinek requested review from spetrosi and richm November 28, 2024 10:27
Contributor

@richm richm left a comment

lgtm - @spetrosi ptal

README.md Outdated
@@ -59,6 +64,13 @@ ansible-galaxy collection install -r meta/collection-requirements.yml

### Defined in `defaults/main.yml`

#### `ha_cluster_get_info`
Collaborator

Initially I thought it would make sense to rename this to ha_cluster_facts for consistency with other roles. However, this functionality does a little more than get the current state: it exports the configuration in a format that can be used to run the role.
Maybe rename this to ha_cluster_export_configuration. I think that name is more explicit: exporting the configuration implies that it can be imported with the same role, which is the case here.
@richm WDYT?

Contributor

Ok - sounds good - please rename to ha_cluster_export_configuration and then I think we're good to go.
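
With the agreed name, an export that is kept for later reuse might look like the sketch below. `ha_cluster_export_configuration` is the name settled on above; `ha_cluster_facts` as the exported variable is an assumption based on the <rolename>_facts convention mentioned earlier in this thread, and the output file name is hypothetical. Other role variables are omitted, so check the role's README for what else needs to be set.

```yaml
# Sketch only. ha_cluster_facts is an assumed name for the exported variable;
# verify it against the role's README before relying on it.
- name: Export current cluster configuration
  include_role:
    name: linux-system-roles.ha_cluster
  vars:
    ha_cluster_export_configuration: true

- name: Save the exported configuration on the control node
  copy:
    content: "{{ ha_cluster_facts | to_nice_yaml }}"
    dest: ./ha_cluster_exported.yml  # hypothetical file name
    mode: "0600"
  delegate_to: localhost
  run_once: true
```

Because the export uses the same variable structure that the role accepts, a file saved this way could in principle be fed back to the role, for example through `vars_files` or `ansible-playbook -e @ha_cluster_exported.yml`.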

@tomjelinek
Member Author

[citest]

@tomjelinek
Member Author

CentOS-10 tests_cluster_basic_cloud_packages will be fixed in another pull request. CentOS-9 tests_qdevice_tls_kaptb_options is a somewhat flaky test. Neither of these is related to changes done in this PR.

@richm
Contributor

richm commented Dec 4, 2024

CentOS-10 tests_cluster_basic_cloud_packages will be fixed in another pull request. CentOS-9 tests_qdevice_tls_kaptb_options is a somewhat flaky test. Neither of these is related to changes done in this PR.

ok - @spetrosi please review - then we can merge

Looks like the python unit tests are broken on ubuntu-24.04 - I looked at this briefly - looks like the pip install --prefix argument in that version of pip adds /local to the end - the Makefile then cannot find the binary to install

@tomjelinek
Member Author

Looks like the python unit tests are broken on ubuntu-24.04 - I looked at this briefly - looks like the pip install --prefix argument in that version of pip adds /local to the end - the Makefile then cannot find the binary to install

Yes, this needs to be fixed on the pcs side. It's on the TODO list; hopefully we'll get to resolve it soon.

Collaborator

@spetrosi spetrosi left a comment

lgtm

@tomjelinek tomjelinek merged commit 7804be0 into linux-system-roles:main Dec 5, 2024
23 of 26 checks passed
@tomjelinek tomjelinek deleted the export-corosync branch December 5, 2024 12:29