2 processes listening on livestatus socket #117

Open
pvdputte opened this issue Jan 16, 2024 · 1 comment

Comments

@pvdputte

I've previously commented on this livestatus issue but probably should have opened a new one here instead. Sorry.

The problem: even on a fresh install with no custom configuration other than the TCP livestatus socket, two processes are listening on that socket after running systemctl reload naemon:

vagrant@bookworm:~$ sudo netstat -tupan | grep -e Recv -e naemon
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:6557          0.0.0.0:*               LISTEN      4067/naemon         
tcp        3      0 0.0.0.0:6557          0.0.0.0:*               LISTEN      4072/naemon   

One of them is not responding (waiting to be reaped? top does not show it as a zombie, though).
As a result, Thruk sometimes behaves erratically, for example reporting that the backend is down.
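
To check for this condition automatically, the netstat output above can be filtered for local addresses that have more than one distinct PID in LISTEN state. The following is a minimal sketch, not part of the original report; the function name duplicate_listeners is hypothetical, and it assumes the column layout shown above (state in field 6, PID/program in field 7).

```shell
# Hypothetical helper: reads `netstat -tupan`-style output on stdin and prints
# every local address that has more than one distinct PID in LISTEN state.
# Assumes the column layout shown above (field 4 = local address,
# field 6 = state, field 7 = PID/program name).
duplicate_listeners() {
    awk '
        $6 == "LISTEN" {
            split($7, owner, "/")          # "4067/naemon" -> owner[1] = 4067
            key = $4 SUBSEP owner[1]
            if (!(key in seen)) {          # count each PID once per address
                seen[key] = 1
                count[$4]++
                pids[$4] = pids[$4] " " owner[1]
            }
        }
        END {
            for (addr in count)
                if (count[addr] > 1)
                    printf "%s:%s\n", addr, pids[addr]
        }
    '
}

# Usage (needs root to see PIDs):
#   sudo netstat -tupan | duplicate_listeners
```

An empty result means every listening address has a single owner. For the output above it would list 0.0.0.0:6557 together with both naemon PIDs.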

This is the config:

vagrant@bookworm:~$ cat /etc/naemon/module-conf.d/livestatus.cfg 
# Naemon config
broker_module=/usr/lib/naemon/naemon-livestatus/livestatus.so inet_addr=0.0.0.0:6557 debug=1
event_broker_options=-1

This can be easily reproduced in vagrant.

$ vagrant init debian/bookworm64
$ vagrant up
$ vagrant ssh

Next, copy these commands into a script and execute it.

vagrant@bookworm:~$ wget -O reproduce https://github.com/naemon/naemon-livestatus/files/13950328/reproduce.txt
vagrant@bookworm:~$ chmod +x reproduce
vagrant@bookworm:~$ ./reproduce

This should result in something like

<installation>


----------
Restarting

tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5408/naemon         

naemon,5408 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5409 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5410 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5411 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5412 --worker /var/lib/naemon/naemon.qh
  └─naemon,5413 --daemon /etc/naemon/naemon.cfg

systemd,5367 --user
  └─(sd-pam),5368


----------------------
Reloading until broken

Success.
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5408/naemon         
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5413/naemon         

naemon,5408 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5413 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5434 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5435 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5436 --worker /var/lib/naemon/naemon.qh
  └─naemon,5437 --worker /var/lib/naemon/naemon.qh

systemd,5367 --user
  └─(sd-pam),5368

---------------------------------
Running "GET status" every second, response size 0 is not good:
2024-01-16T13:08:05+00:00 976
2024-01-16T13:08:06+00:00 976
2024-01-16T13:08:07+00:00 0
2024-01-16T13:08:10+00:00 977
2024-01-16T13:08:11+00:00 0
^C

Notice that process 5413 already exists when naemon is first started, but it only starts listening on that socket after the reload.
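
The once-per-second "GET status" check shown in the output above could be approximated as follows. This is a sketch under assumptions, not the code from the reproduce script: the function names are hypothetical, nc must support -w <seconds>, and date -Iseconds requires GNU date.

```shell
# Sketch of a livestatus probe. Assumes an `nc` that supports `-w <seconds>`.
# Prints the number of bytes returned for "GET status"; 0 means no answer.
probe_livestatus() {
    host=$1
    port=$2
    # A livestatus query is terminated by an empty line.
    printf 'GET status\n\n' | nc -w 2 "$host" "$port" | wc -c | tr -d ' '
}

# Hypothetical wrapper: prints one timestamped line per probe, pausing one
# second between probes, until interrupted with Ctrl-C.
# `date -Iseconds` is a GNU date option.
watch_livestatus() {
    while :; do
        printf '%s %s\n' "$(date -Iseconds)" "$(probe_livestatus "$1" "$2")"
        sleep 1
    done
}

# Usage:
#   watch_livestatus 127.0.0.1 6557
```

A response size of 0 indicates that the connection was accepted by the kernel but never answered, or that the connection failed.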

My current workaround is to restart instead of reload after each config change, but with a rather large config this takes a lot longer than reloading. Alternatively, I could go back to xinetd.

@pvdputte
Author

FYI, I went back to xinetd & unixcat /var/cache/naemon/live. systemctl reload works fine again as xinetd is now handling the TCP socket. So I have my 17 second reload back, instead of a 30 second restart. 🙂
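
For readers who want to try the same workaround, an xinetd service definition along these lines is one way to do it. This is a sketch modeled on the commonly documented livestatus xinetd setup, not the configuration actually used here: the file name, the user, and the location of unixcat are assumptions. Only the port and the socket path come from this thread. It also assumes the livestatus module is configured to listen on the UNIX socket /var/cache/naemon/live instead of inet_addr.

```
# /etc/xinetd.d/livestatus  (file name is an assumption)
# Assumptions: the naemon user may read the socket, and unixcat is
# installed at /usr/bin/unixcat; adjust both to the local system.
service livestatus
{
    type        = UNLISTED
    port        = 6557
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = naemon
    server      = /usr/bin/unixcat
    server_args = /var/cache/naemon/live
    disable     = no
}
```

With this in place xinetd owns the TCP listener, so a naemon reload no longer touches port 6557.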

Perhaps someday I could look into systemd socket activation for this.

theseal added a commit to SUNET/puppet-sunet that referenced this issue Oct 18, 2024
naemon/naemon-livestatus#117

As close to a reload we can get. At least the webgui and SSO session is
preserved in this way.