
Puppet attempts to restart running varnish service #145

Open
kwisatz opened this issue Aug 8, 2017 · 9 comments

@kwisatz
Contributor

kwisatz commented Aug 8, 2017

OS: Debian 8.9
Puppet: 3.8.7

After an upgrade of various puppet modules, puppet is now trying to restart the varnish service although it is already running. This only happens for the varnish service, so I'm pretty sure it's related to this module, not puppet in general:

Aug 08 16:49:20 loadbalancer puppet-agent[16858]: Could not start Service[varnish]: Execution of '/etc/init.d/varnish start' returned 1: Starting HTTP accelerator: varnishd failed!
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: bind(): Address already in use
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: bind(): Address already in use
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: Error: Failed to open (any) accept sockets.
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: Wrapped exception:
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: Execution of '/etc/init.d/varnish start' returned 1: Starting HTTP accelerator: varnishd failed!
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: bind(): Address already in use
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: bind(): Address already in use
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: Error: Failed to open (any) accept sockets.
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: (/Stage[main]/Varnish::Service/Service[varnish]/ensure) change from stopped to running failed: Could not start Service[varnish]: Execution of '/etc/init.d/varnish start' returned 1: Starting HTTP accelerator: varnishd failed!
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: (/Stage[main]/Varnish::Service/Service[varnish]/ensure) bind(): Address already in use
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: (/Stage[main]/Varnish::Service/Service[varnish]/ensure) bind(): Address already in use
Aug 08 16:49:20 loadbalancer puppet-agent[16858]: (/Stage[main]/Varnish::Service/Service[varnish]/ensure) Error: Failed to open (any) accept sockets.

It incorrectly detects varnish as not running. Looking at the service output, we can see that it is running:

root@loadbalancer:~# service varnish status
● varnish.service - Varnish Cache, a high-performance HTTP accelerator
   Loaded: loaded (/etc/systemd/system/varnish.service; enabled)
   Active: active (running) since Tue 2017-08-08 13:08:16 CEST; 4h 9min ago
 Main PID: 30600 (varnishd)
   CGroup: /system.slice/varnish.service
           ├─30600 /usr/sbin/varnishd -a :6081 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m -p ht...
           └─30601 /usr/sbin/varnishd -a :6081 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m -p ht...

The most confusing part about this is that the error only occurs when puppet is run by the service, not when I manually run the agent.

Edit: I modified the DAEMON_OPTS to make puppet log output more verbose, and I discovered that it is indeed calling the status action on the init.d script:

Aug 08 17:48:12 loadbalancer puppet-agent[28735]: Executing '/etc/init.d/nginx status'
Aug 08 17:48:12 loadbalancer puppet-agent[28735]: Executing '/etc/init.d/varnish status'
Aug 08 17:48:12 loadbalancer puppet-agent[28735]: Executing '/etc/init.d/varnish start'
Aug 08 17:48:12 loadbalancer puppet-agent[28735]: Could not start Service[varnish]: Execution of '/etc/init.d/varnish start' returned 1: Starting HTTP accelerator: varnishd failed!

So it must be misinterpreting the result it sees from the status call…

I also made sure that the status call actually exits with code 0, which it does; according to the documentation of the service resource, that exit code is what puppet looks for.
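For what it's worth, one way to make that check explicit (a minimal sketch of a possible workaround on my end, not something this module ships) would be to set the service resource's status parameters directly:

service { 'varnish':
  ensure    => running,
  hasstatus => true,                          # use the init script's status action
  status    => '/etc/init.d/varnish status',  # explicit status command; exit code 0 means running
}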

@cyberscribe

Have you checked the size of /var/lib/varnish? The module default of 128M was (legitimately) crashing varnish for me repeatedly. If yours is fine and running, then this is indeed just a signalling issue.

@kwisatz
Contributor Author

kwisatz commented Aug 10, 2017

@cyberscribe I've checked, because we've had the issue of it filling up and preventing varnish from restarting in the past.
But this is a completely different issue. Varnish is running just fine; it's just that somehow (and I just can't get to the bottom of this) the puppet service resource is unable to determine that from its checks.

It also seems that the status check (or rather, the ensure => running check) is only made when puppet runs from the periodic checks, not when run manually.

This happened on all machines that use varnish, so it's not just a glitch somewhere. And the varnish puppet module hasn't changed, so I really am dumbfounded.

But I posted this here so that anyone else with the same problem might find it and share their findings.

@cyberscribe

cyberscribe commented Aug 10, 2017 via email

@kwisatz
Contributor Author

kwisatz commented Aug 12, 2017

OK, I just figured out that it must be a problem with or bug in the puppet package distributed by puppetlabs. If I install 3.7.2 from the debian repos, it works. My guess is that the version in the debian repos has a different method of checking service status… but I haven't dug deeper yet.

root@loadbalancer:~# apt-get install puppet=3.7.2-4+deb8u1 puppet-common=3.7.2-4+deb8u1
[…]
The following NEW packages will be installed:
  ruby-safe-yaml
The following packages will be upgraded:
  puppet puppet-common
2 upgraded, 1 newly installed, 0 to remove and 47 not upgraded.
Need to get 0 B/1,053 kB of archives.
After this operation, 2,318 kB disk space will be freed.
Do you want to continue? [Y/n] 
[…]
Unpacking puppet (3.7.2-4+deb8u1) over (3.7.2-1puppetlabs1) ...
Preparing to unpack .../puppet-common_3.7.2-4+deb8u1_all.deb ...
Unpacking puppet-common (3.7.2-4+deb8u1) over (3.7.2-1puppetlabs1) ...

@func0der

func0der commented Sep 4, 2017

@kwisatz So this is not a bug in this module?

@kwisatz
Contributor Author

kwisatz commented Sep 4, 2017

@func0der Well, it only occurs with this module for me, but that might be related to the varnish service definition for systemd on Debian, rather than to this module's code.

I'm sorry I didn't have the time to dig deeper into this, e.g. checking what the differences are between the puppetlabs and Debian versions of the 3.7.2 release of puppet.

@EmersonPrado

My 2 cents: varnish::service doesn't know you're using systemd. The module uses the service_provider fact from stdlib, which doesn't return 'systemd' for Debian 8, and then populates the variable varnish::params::systemd with true or false according to that fact.

Out of curiosity, I grabbed one of my Puppet test setups (the one with the most distinct VMs) and tested the fact and variable above. They are indeed incorrect for Debian 8:

==> debian-82: Notice: Scope(Class[main]): Service provider: debian
==> debian-82: Notice: Scope(Class[main]): systemd: false
==> debian-75: Notice: Scope(Class[main]): Service provider: debian
==> debian-75: Notice: Scope(Class[main]): systemd: false
==> debian-60: Notice: Scope(Class[main]): Service provider: debian
==> debian-60: Notice: Scope(Class[main]): systemd: false
==> redhat-70: Notice: Scope(Class[main]): Service provider: systemd
==> redhat-70: Notice: Scope(Class[main]): systemd: true
==> redhat-65: Notice: Scope(Class[main]): Service provider: redhat
==> redhat-65: Notice: Scope(Class[main]): systemd: false
==> redhat-56: Notice: Scope(Class[main]): Service provider: redhat
==> redhat-56: Notice: Scope(Class[main]): systemd: false
==> centos-59: Notice: Scope(Class[main]): Service provider: redhat
==> centos-59: Notice: Scope(Class[main]): systemd: false
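(For reference, a minimal manifest along these lines could produce notices like the above; this is just a sketch, not necessarily the exact manifest used:)

include ::varnish::params
notice("Service provider: ${::service_provider}")
notice("systemd: ${::varnish::params::systemd}")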

BTW, I am sure the 'debian-82' VM uses systemd:

$ vagrant ssh -c 'ps -p 1' debian-82
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd

So the issue is indeed in stdlib. We could work around it by creating a custom fact in the varnish module, but I prefer fixing stuff where the problem is.

@EmersonPrado

In fact, I'm not quite sure stdlib's service_provider custom fact is meant to return the init system at all. The description just says it "Returns the default provider Puppet uses to manage services on this system". The code mentioned in the previous message doesn't show any intention of detecting the init system. So the assertion about fixing stdlib might just be plain wrong.

I tried to find a module on the Forge that detects the init system. As strange as it seems to me, I found none. All modules that claim to do so either guess it from the distro and OS major version, or detect the installed init systems and guess which one is in use. So the only option left is writing a custom fact (or a separate module for that).
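For illustration only, if such a custom fact existed (the fact name varnish_init_system below is made up for this sketch), params.pp could derive the flag from it instead of from stdlib's service_provider:

# Hypothetical fact name, for illustration only
$systemd = $::varnish_init_system ? {
  'systemd' => true,
  default   => false,
}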

@EmersonPrado

Added PR 159 implementing the init system check with a more reliable custom fact.

maxchk added a commit that referenced this issue Sep 28, 2017
[Issue #145] Add custom fact to check init system