Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for zabbix as alternative to nagios #81

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

smoors
Copy link

@smoors smoors commented May 3, 2022

No description provided.

@stdweird
Copy link
Member

stdweird commented May 3, 2022

@smoors can you add a unittest as well?

also tests fail

Try to import each namespace

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/var/lib/jenkins/jobs/hpcugent_github.com/jobs/vsc-utils/branches/PR-81/workspace/.tox/py36/lib/python3.6/site-packages/vsc_install-0.17.23-py3.6.egg/vsc/install/commontest.py", line 291, in test_import_packages

    msg='check_header of %s' % fn)

AssertionError: True is not false : check_header of lib/vsc/__init__.py


======================================================================


FAIL: test_prospector (vsc.install.commontest.CommonTest)

Test prospector failures

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/var/lib/jenkins/jobs/hpcugent_github.com/jobs/vsc-utils/branches/PR-81/workspace/.tox/py36/lib/python3.6/site-packages/vsc_install-0.17.23-py3.6.egg/vsc/install/commontest.py", line 314, in test_prospector

    self.assertFalse(failures, "prospector failures: %s" % pprint.pformat(failures))

AssertionError: [{'source': 'pylint', 'code': 'unused-import', 'location': {'path': '/var/lib/jenkins/jobs/hpcugent_github.com/jobs/vsc-utils/branches/PR-81/workspace/lib/vsc/utils/script_tools.py', 'module': 'vsc.utils.script_tools', 'function': None, 'line': 44, 'character': 0}, 'message': 'Unused SimpleZabbix imported from vsc.utils.nagios'}] is not false : prospector failures: [{'code': 'unused-import',

  'location': {'character': 0,

               'function': None,

               'line': 44,

               'module': 'vsc.utils.script_tools',

               'path': '/var/lib/jenkins/jobs/hpcugent_github.com/jobs/vsc-utils/branches/PR-81/workspace/lib/vsc/utils/script_tools.py'},

  'message': 'Unused SimpleZabbix imported from vsc.utils.nagios',

  'source': 'pylint'}]

ERROR: script_tools (unittest.loader._FailedTest)

----------------------------------------------------------------------

ImportError: Failed to import test module: script_tools

Traceback (most recent call last):

  File "/usr/lib64/python3.6/unittest/loader.py", line 153, in loadTestsFromName

    module = __import__(module_name)

  File "/var/lib/jenkins/jobs/hpcugent_github.com/jobs/vsc-utils/branches/PR-81/workspace/test/script_tools.py", line 40, in <module>

    from vsc.utils.script_tools import ExtendedSimpleOption, DEFAULT_OPTIONS, NrpeCLI, CLI

  File "lib/vsc/utils/script_tools.py", line 44, in <module>

    from vsc.utils.nagios import (

ImportError: cannot import name 'SimpleZabbix'



======================================================================

FAIL: test_import_modules (vsc.install.commontest.CommonTest)

Try to import each module

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/var/lib/jenkins/jobs/hpcugent_github.com/jobs/vsc-utils/branches/PR-81/workspace/.tox/py36/lib/python3.6/site-packages/vsc_install-0.17.23-py3.6.egg/vsc/install/commontest.py", line 274, in _import

    __import__(pkg)

  File "lib/vsc/utils/script_tools.py", line 44, in <module>

    from vsc.utils.nagios import (

ImportError: cannot import name 'SimpleZabbix'


During handling of the above exception, another exception occurred:



@smoors smoors changed the title add support for zabbix as alternative to nagios add support for zabbix as alternative to nagios (WIP) May 3, 2022
@smoors
Copy link
Author

smoors commented May 11, 2022

@smoors can you add a unittest as well?

done in 86afb52

@smoors smoors changed the title add support for zabbix as alternative to nagios (WIP) add support for zabbix as alternative to nagios May 11, 2022
@smoors
Copy link
Author

smoors commented May 11, 2022

@stdweird I think this is ready, can you have another look?

@@ -434,6 +438,11 @@ class SimpleNagios(NagiosResult):

def __init__(self, **kwargs):
"""Initialise message and perfdata"""
self._init(**kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't do this please

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, removed

@@ -434,6 +438,11 @@ class SimpleNagios(NagiosResult):

def __init__(self, **kwargs):
"""Initialise message and perfdata"""
self._init(**kwargs)

def _init(self, reporterclass=NagiosReporter, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use a class constant instead of passing the extra arg?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's indeed a simpler solution
fixed in 507804c

@@ -129,6 +131,7 @@ def __init__(self, options, run_prologue=True, excepthook=None, **kwargs):

self.nagios_reporter = None
self.lockfile = None
self.monitorclass = monitorclass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as with the reporter, why not use a class constant?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 507804c



def print_report_and_exit(self, timestamp, nagios_exit_code, nagios_exit_string, nagios_message):
"""Print the nagios report (if the data is not too old) and exit"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you not need/use the if/else logic? i would have expected that you only want to override the if block?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for zabbix the if/else logic is skipped: we always report, even if the data is too old, and let zabbix handle that.

return json.dumps(processed_dict)


class ExtendedSimpleOptionZabbix(ExtendedSimpleOption):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you use this in code? i see no easy way to pass this via commandline or config file

Copy link
Author

@smoors smoors May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use this class directly like this:

opts = ExtendedSimpleOptionZabbix(options)

this similar to how it's used here: https://github.com/vub-hpc/Lmod-UGent/blob/master/run_lmod_cache.py

or did I miss something?

@smoors
Copy link
Author

smoors commented Jun 20, 2022

@stdweird to clarify, we have a script called run_slurm_cache.py that uses ExtendedSimpleOptionZabbix, and that we can run with like this:

run_slurm_cache.py --nagios-user=zabbix
run_slurm_cache.py --nagios-report

this is already working for us.
I just added a commit that sets 'zabbix' as cache user by default, so --nagios-user=zabbix is no longer needed.

can this be merged, or is there anything else that needs to be done?


def print_report_and_exit(self, timestamp, nagios_exit_code, nagios_exit_string, nagios_message):
"""Print the zabbix report and exit"""
print('[%f, "%s", %s]' % (timestamp, nagios_exit_string, nagios_message))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let the json module do this? So we don't have to care about any special case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's more complicated, because nagios_message is already a json string, so then you have to first load it and then dump it again.


# -*- encoding: utf-8 -*-
#
# Copyright 2012-2022 Ghent University
Copy link
Contributor

@wpoely86 wpoely86 Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Copyright 2012-2022 Ghent University
# Copyright 2022-2022 Vrije Universiteit Brussel

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And run find ./{lib,test} -type f -name '*.py' | REPO_BASE_DIR=$PWD xargs -I '{}' python -m vsc.install.headers '{}' in the repo.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this command reverts the change back to Ghent University

"""Print the zabbix report and exit"""
print('[%f, "%s", %s]' % (timestamp, nagios_exit_string, nagios_message))
self.log.info("Zabbix check cache file %s contents delivered: %s", self.filename, nagios_message)
sys.exit(nagios_exit_code)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for zabbix, we want a normal exit I think? Will the agent not fail if it's non-zero?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from Ward:

Checking the exit code is removed from external checks, user parameters and system.run items. These checks will not to become unsupported if the exit code is different than zero. Exit code checks are now only performed in custom alert scripts, remote commands and user scripts executed on Zabbix server and proxy.

in de release notes van 3.4.3

dus this ok

@smoors smoors mentioned this pull request Sep 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants