Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prometheus_alert_rules_files no longer works as expected since 0.20.0 #454

Open
wookietreiber opened this issue Nov 6, 2024 · 5 comments

Comments

@wookietreiber
Copy link
Contributor

Just wanted to update the collection to latest, and I recognized this diff.

# config
prometheus_alert_rules_files: >-
  roles/prometheus-eve/files/rules/*.yml

This is output with older version 0.17.0:

TASK [prometheus.prometheus.prometheus : Copy custom alerting rule files] ***
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/memory.yml)
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/slurm.yml)
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/storage.yml)
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/time.yml)
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/spectrum-scale.yml)
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/network.yml)
changed: [head1] => (item=/home/.../roles/prometheus-eve/files/rules/cpu.yml)

This is with current latest 0.22.0:

TASK [prometheus.prometheus.prometheus : Copy custom alerting rule files] ***
changed: [head1] => (item=test.yml)
changed: [head1] => (item=filer.yml)
changed: [head1] => (item=ANSIBLE_RUNLOG.md)
changed: [head1] => (item=CONTRIBUTING.md)
changed: [head1] => (item=compute.yml)
changed: [head1] => (item=ansible.cfg)
changed: [head1] => (item=services.yml)
changed: [head1] => (item=infrastruktur.yml)
changed: [head1] => (item=README.md)
changed: [head1] => (item=local.yml)
changed: [head1] => (item=bmc.yml)
changed: [head1] => (item=requirements.yml)
changed: [head1] => (item=heads.yml)
changed: [head1] => (item=interactive.yml)
changed: [head1] => (item=site.yml)
changed: [head1] => (item=global.yml)
changed: [head1] => (item=vault-keyring-client.py)
changed: [head1] => (item=Makefile)
changed: [head1] => (item=hosts)

These are the files in my top-level ansible directory from where I run my top-level playbooks:

$ ls
ad-hoc             bmc.yml          filer.yml   heads.yml  infrastruktur.yml  Makefile          roles         test.yml
ansible.cfg        compute.yml      global.yml  host_vars  interactive.yml    README.md         services.yml  vault-keyring-client.py
ANSIBLE_RUNLOG.md  CONTRIBUTING.md  group_vars  hosts      local.yml          requirements.yml  site.yml

$ ansible-playbook infrastruktur.yml -l head1 -t prometheus-main -C

So there seems to be something wrong with the globbing happening with prometheus_alert_rules_files and the Copy custom alerting rule files task. The diff from 0.17.0 to 0.22.0 (git diff 0.17.0..0.22.0 roles/prometheus/tasks/configure.yml):

 - name: Copy custom alerting rule files
   ansible.builtin.copy:
     src: "{{ item }}"
     dest: "{{ prometheus_config_dir }}/rules/"
-    owner: root
+    owner: "{{ prometheus_system_user }}"
     group: "{{ prometheus_system_group }}"
     mode: 0640
     validate: "{{ _prometheus_binary_install_dir }}/promtool check rules %s"
-  with_fileglob: "{{ prometheus_alert_rules_files }}"
+  loop: "{{ prometheus_alert_rules_files | map('ansible.builtin.fileglob') | flatten }}"
   when:
     - not prometheus_agent_mode
   notify:
     - reload prometheus
+  become: true
+  tags:
+    - prometheus
+    - configure
+    - prometheus_configure

This was done in 1e4e4c3 before the 0.20.0 release.

@gardar
Copy link
Member

gardar commented Nov 6, 2024

The prometheus_alert_rules_files variable should be a list.

# config
prometheus_alert_rules_files:
  - roles/prometheus-eve/files/rules/*.yml

Not sure why the argument spec validation isn't catching this, it was also reported in #444

When you run your playbook, do you see a task named Validating arguments against arg spec 'main' being executed?

@wookietreiber
Copy link
Contributor Author

Thanks, changing to:

prometheus_alert_rules_files:
  - roles/prometheus-eve/files/rules/*.yml

... fixed the issue:

TASK [prometheus.prometheus.prometheus : Copy custom alerting rule files] ***
changed: [head1] => (item=roles/prometheus-eve/files/rules/memory.yml)
changed: [head1] => (item=roles/prometheus-eve/files/rules/slurm.yml)
changed: [head1] => (item=roles/prometheus-eve/files/rules/storage.yml)
changed: [head1] => (item=roles/prometheus-eve/files/rules/time.yml)
changed: [head1] => (item=roles/prometheus-eve/files/rules/spectrum-scale.yml)
changed: [head1] => (item=roles/prometheus-eve/files/rules/network.yml)
changed: [head1] => (item=roles/prometheus-eve/files/rules/cpu.yml)

With my previous config, my mistake wasn't caught:

TASK [prometheus.prometheus.prometheus : Validating arguments against arg spec 'main' - Installs and configures prometheus] ***
ok: [head1]

@wookietreiber
Copy link
Contributor Author

I figure, the check/validation might be incomplete, because it's skipping:

TASK [prometheus.prometheus.prometheus : Alert when alert rules files are found in the old .rules format] ***
skipping: [head1]
- name: Alert when alert rules files are found in the old .rules format
  ansible.builtin.debug:
    msg: >
      prometheus_alert_rules_files contains rules in the old .rules file format.
      This format has been deprecated in favour of the new YAML format since Prometheus v2.x.
      You can update your rules using promtool: promtool update rules <filenames>

      If your rules are already in the YAML format but with the .rules extension, you can safely ignore this message,
      or rename the files to <filename>.yaml or <filename>.yml to get rid of it.
  when: prometheus_alert_rules_files | select('search', '.rules$') | list | length > 0

It looks only for matches ending in .rules, not if the variable has a list format. I suggest adding this check:

- name: Validate that prometheus_alert_rules_files is a list
  ansible.builtin.assert:
    that:
      - prometheus_alert_rules_files is iterable
      - prometheus_alert_rules_files is not string
    fail_msg: '`prometheus_alert_rules_files` must be a list'
    success_msg: '`prometheus_alert_rules_files` is a list'
  when:
    - prometheus_alert_rules_files is defined

Can do PR if you want.

@gardar
Copy link
Member

gardar commented Nov 6, 2024

That assert task was introduced in #333 and is to help users migrating from .rules to .yml and is unrelated to this.

While we could definitely add a task to validate each variable I don't think it's the right thing to do as it is the purpose of the Role argument validation which we have already in place roles/prometheus/meta/argument_specs.yml, we just need to investigate why it's not working as expected.

@gardar
Copy link
Member

gardar commented Nov 6, 2024

Looks like this is by design as apparently the argument validation is currently just for ensuring the variable can be converted into the type specified in the argument spec and is not for enforcing the type of the variable.
But there is a open PR over at ansible/ansible#81575 which would enforce the variable type as specified in the argument spec.

In the meantime we could probably just convert the prometheus_alert_rules_files variable to a list at the task level with something like:
loop: "{{ [prometheus_alert_rules_files] | map('ansible.builtin.fileglob') | flatten }}"
or
loop: "{{ prometheus_alert_rules_files | list | map('ansible.builtin.fileglob') | flatten }}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants