SSLH dropping connections, "too many open files" #456

Open
dm9bbadd4 opened this issue Jul 15, 2024 · 14 comments

@dm9bbadd4

This is a new issue that started after I updated to Ubuntu 24. After SSLH has been running for just a few days, its open files reach the limit of 1025. I'm just running a simple home server and it's obviously not receiving 1025 concurrent connections, so I don't know why it has so many files open. I've tried raising this limit for SSLH to an arbitrary amount, but it doesn't seem to matter.
Checking the soft limit for user sslh shows that it is far higher than 1025:
sudo su - sslh -s /bin/bash -c "ulimit -Sn"
Output: 100000
I've modified /etc/security/limits.conf for user sslh to raise the limit to 100000.

Since the last time this happened (yesterday), I've restarted the service so I can't strace it but next time it happens I will update this issue with the strace.
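
One thing that may be worth checking, offered as a possibility rather than a diagnosis: /etc/security/limits.conf is applied by pam_limits to PAM sessions, so the value printed by ulimit -Sn in a su shell is not necessarily the limit of the running daemon. If sslh is started as a systemd service, its limit normally comes from the unit's LimitNOFILE= setting instead. A minimal Python sketch that reads the effective limit from /proc/<pid>/limits follows; the helper names and the sample text are illustrative assumptions, not taken from sslh:

```python
from pathlib import Path


def parse_nofile_limits(limits_text):
    """Return (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    Values are returned as strings because the kernel may print 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns after the name: soft limit, hard limit, units.
            soft, hard = line[len("Max open files"):].split()[:2]
            return soft, hard
    raise ValueError("no 'Max open files' line found")


def nofile_limits_of(pid):
    """Read the limits of a live process (Linux only)."""
    return parse_nofile_limits(Path(f"/proc/{pid}/limits").read_text())


# Illustrative sample in the layout the kernel prints (values are made up).
SAMPLE = (
    "Limit                     Soft Limit           Hard Limit           Units     \n"
    "Max open files            1024                 524288               files     \n"
)

print(parse_nofile_limits(SAMPLE))
```

Calling nofile_limits_of() with the PID of the running sslh process shows the limit the daemon actually has, which can then be compared with the 100000 configured for the user.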

@yrutschle
Owner

I don't think changing the limit is the way to go: sslh uses 2 descriptors per connection, and as you say, on a personal setup the limits should be high enough.
I would rather look at why these connections are open: maybe there is a firewall somewhere that prevents proper closure?

Check what is opened with lsof | grep sslh (for reference, my own personal setup has about 100 descriptors open)
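
To see at a glance which TCP states dominate, the lsof output can be tallied. This is a rough Python sketch, assuming the default lsof column layout in which the TCP state is the last parenthesised field on the line; tally_tcp_states is a hypothetical helper, not part of sslh:

```python
import re
from collections import Counter

# Matches a trailing "(STATE)" such as "(ESTABLISHED)" or "(CLOSE_WAIT)".
STATE_RE = re.compile(r"\(([A-Z_0-9]+)\)\s*$")


def tally_tcp_states(lsof_text):
    """Count TCP socket lines in lsof output by connection state."""
    counts = Counter()
    for line in lsof_text.splitlines():
        if " TCP " not in line:
            continue  # skip libraries, UDP sockets, unix sockets, etc.
        match = STATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines in lsof's format, with whitespace condensed.
SAMPLE = """\
sslh 226566 sslh 1u unix 0xffff9f384bc8e000 0t0 228018983 type=STREAM (CONNECTED)
sslh 226566 sslh 3u IPv4 227718734 0t0 TCP *:https (LISTEN)
sslh 226566 sslh 491u IPv4 235984574 0t0 TCP 192.168.0.7:https->[removed]:54539 (CLOSE_WAIT)
sslh 226566 sslh 492u IPv4 230415619 0t0 TCP 192.168.0.7:https->[removed]:60759 (ESTABLISHED)
sslh 226566 sslh 494u IPv4 235984585 0t0 TCP 192.168.0.7:https->[removed]:55273 (CLOSE_WAIT)
"""

print(tally_tcp_states(SAMPLE))
```

A count that keeps growing for one state between two runs points at the kind of connection that is not being released.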

@dm9bbadd4
Author

dm9bbadd4 commented Jul 16, 2024

As in a firewall running on the system? The only one I use is ufw. My setup was working fine for ages, and then updating to Ubuntu 24 LTS caused a whole world of issues, with sslh in particular.
I ran the command and there are only 33 open files right now:

lsof | grep sslh
sslh      226566                              sslh  rtd       DIR               8,34          4096          2 /
sslh      226566                              sslh  txt       REG               8,34        549656    6423611 /usr/sbin/sslh
sslh      226566                              sslh  mem       REG               8,34        149760    6423674 /usr/lib/x86_64-linux-gnu/libgpg-error.so.0.34.0
sslh      226566                              sslh  mem       REG               8,34       1340976    6425542 /usr/lib/x86_64-linux-gnu/libgcrypt.so.20.4.3
sslh      226566                              sslh  mem       REG               8,34       2125328    6423108 /usr/lib/x86_64-linux-gnu/libc.so.6
sslh      226566                              sslh  mem       REG               8,34        755864    6428384 /usr/lib/x86_64-linux-gnu/libzstd.so.1.5.5
sslh      226566                              sslh  mem       REG               8,34        202904    6428531 /usr/lib/x86_64-linux-gnu/liblzma.so.5.4.5
sslh      226566                              sslh  mem       REG               8,34        137440    6424269 /usr/lib/x86_64-linux-gnu/liblz4.so.1.9.4
sslh      226566                              sslh  mem       REG               8,34         67584    6427452 /usr/lib/x86_64-linux-gnu/libev.so.4.0.0
sslh      226566                              sslh  mem       REG               8,34        910592    6423085 /usr/lib/x86_64-linux-gnu/libsystemd.so.0.38.0
sslh      226566                              sslh  mem       REG               8,34         51536    6428380 /usr/lib/x86_64-linux-gnu/libcap.so.2.66
sslh      226566                              sslh  mem       REG               8,34         51584    6434557 /usr/lib/x86_64-linux-gnu/libconfig.so.9.2.0
sslh      226566                              sslh  mem       REG               8,34         44064    6423398 /usr/lib/x86_64-linux-gnu/libwrap.so.0.7.6
sslh      226566                              sslh  mem       REG               8,34        625344    6430550 /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0.11.2
sslh      226566                              sslh  mem       REG               8,34        236616    6423087 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
sslh      226566                              sslh    0r      CHR                1,3           0t0          5 /dev/null
sslh      226566                              sslh    1u     unix 0xffff9f384bc8e000           0t0  228018983 type=STREAM (CONNECTED)
sslh      226566                              sslh    2u     unix 0xffff9f384bc8e000           0t0  228018983 type=STREAM (CONNECTED)
sslh      226566                              sslh    3u     IPv4          227718734           0t0        TCP *:https (LISTEN)
sslh      226566                              sslh    4u     IPv4          227718737           0t0        UDP *:https
sslh      226566                              sslh    5u  a_inode               0,15             0       1073 [eventpoll:3,4,6,8,9,221,492,493,495,527,625,632]
sslh      226566                              sslh    6u  a_inode               0,15             0       1073 [eventfd:13]
sslh      226566                              sslh    7u     IPv4          232477516           0t0        UDP *:35176
sslh      226566                              sslh    8u     IPv4          232873671           0t0        TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh      226566                              sslh    9u     IPv4          232873674           0t0        TCP localhost:39574->localhost:441 (ESTABLISHED)
sslh      226566                              sslh  221u     IPv4          230410684           0t0        TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh      226566                              sslh  492u     IPv4          230415619           0t0        TCP 192.168.0.7:https->[removed]  (ESTABLISHED)
sslh      226566                              sslh  493u     IPv4          230414994           0t0        TCP 192.168.0.7:https->[removed]  (ESTABLISHED)
sslh      226566                              sslh  495u     IPv4          230415626           0t0        TCP 192.168.0.7:https->[removed]  (ESTABLISHED)
sslh      226566                              sslh  527u     IPv4          230415764           0t0        TCP 192.168.0.7:https->[removed]  (ESTABLISHED)
sslh      226566                              sslh  625u     IPv4          230561327           0t0        TCP 192.168.0.7:https->[removed]  (ESTABLISHED)
sslh      226566                              sslh  632u     IPv4          230560766           0t0        TCP 192.168.0.7:https->[removed]  (ESTABLISHED)

@yrutschle
Owner

yrutschle commented Jul 16, 2024 via email

@dm9bbadd4
Author

Next time I get the error I'll post it here. Seems to happen every few days

@dm9bbadd4
Author

Error happened again. I don't know how helpful these logs will be.

lsof | grep sslh
sslh      226566                              sslh  491u     IPv4          235984574           0t0        TCP 192.168.0.7:https->[removed]:54539 (CLOSE_WAIT)
sslh      226566                              sslh  492u     IPv4          230415619           0t0        TCP 192.168.0.7:https->[removed]:60759 (ESTABLISHED)
sslh      226566                              sslh  493u     IPv4          230414994           0t0        TCP 192.168.0.7:https->[removed]:55735 (ESTABLISHED)
sslh      226566                              sslh  494u     IPv4          235984585           0t0        TCP 192.168.0.7:https->[removed]:55273 (CLOSE_WAIT)
sslh      226566                              sslh  495u     IPv4          230415626           0t0        TCP 192.168.0.7:https->[removed]:38757 (ESTABLISHED)
sslh      226566                              sslh  496u     IPv4          235984596           0t0        TCP 192.168.0.7:https->[removed]:54639 (CLOSE_WAIT)
sslh      226566                              sslh  497u     IPv4          235984607           0t0        TCP 192.168.0.7:https->[removed]:56965 (CLOSE_WAIT)
sslh      226566                              sslh  498u     IPv4          235984613           0t0        TCP 192.168.0.7:https->[removed]:38565 (CLOSE_WAIT)
sslh      226566                              sslh  499u     IPv4          235984629           0t0        TCP 192.168.0.7:https->[removed]:12335 (CLOSE_WAIT)
sslh      226566                              sslh  500u     IPv4          235984645           0t0        TCP 192.168.0.7:https->[removed]:41167 (CLOSE_WAIT)
sslh      226566                              sslh  501u     IPv4          235986652           0t0        TCP 192.168.0.7:https->[removed]:24169 (CLOSE_WAIT)
sslh      226566                              sslh  502u     IPv4          235987051           0t0        TCP 192.168.0.7:https->[removed]:64383 (CLOSE_WAIT)
sslh      226566                              sslh  503u     IPv4          235987069           0t0        TCP 192.168.0.7:https->[removed]:57275 (CLOSE_WAIT)
strace
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL)                   = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL)                   = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL)                   = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL)                   = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
^Caccept(3, NULL, NULLstrace: Process 226566 detached
<detached ...>
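
For context on the CLOSE_WAIT entries above: a socket is in that state when the remote peer has closed its side and the local process has not yet closed its own descriptor, so each such socket keeps occupying a descriptor until the process closes it. The self-contained Python sketch below reproduces that situation on loopback. It says nothing about where in sslh a close might be missed, and all names in it are illustrative:

```python
import socket


def demo_close_wait():
    """Show that a peer's close is seen as EOF while our fd stays open."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.close()  # the remote peer goes away (sends FIN)

    # recv() returning b"" is the EOF signal. Once the FIN has arrived, the
    # kernel keeps server_side in CLOSE_WAIT until this process closes it.
    data = server_side.recv(1024)
    still_open = server_side.fileno() != -1

    server_side.close()  # releases the descriptor and leaves CLOSE_WAIT
    closed_now = server_side.fileno() == -1
    listener.close()
    return data, still_open, closed_now


print(demo_close_wait())
```

If the final close() is skipped for every finished connection, descriptors accumulate exactly as in the lsof listing, and accept() eventually fails with EMFILE.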

@yrutschle
Owner

yrutschle commented Jul 19, 2024 via email

@ftasnetamot
Contributor

What does your configuration look like? How many targets? Any redirects?
Just asking out of curiosity, as I ran into this message during a test, but there a message like that was to be expected.

My configuration looked like: sslh:1->nginx:2->sslh:3->nginx:4->sslh:5->nginx:6.... up to nginx:40 as final destination.
And then I threw around 25 parallel connections at that construct.

@dm9bbadd4
Author

I tried to run strace writing to a log file until the error occurred, but it stopped writing after a while. My config is pretty small: one UDP and one TCP port (443) being listened on, and only 4 redirects, one of them to nginx. Again, this issue only started after I updated to Ubuntu Server 24 LTS.

@ftasnetamot
Contributor

Have a look at #468
Can you give me details of the system and how you compiled sslh?
Give me the output of ls -lad /proc/XXXXX/fd/*, where XXXXX is the PID of the leading sslh process.
Furthermore, if you compiled sslh yourself, the output of gcc --version.

@dm9bbadd4
Author

When running ls -lad /proc/[SSLH-PID]/fd/* I get ls: cannot access '/proc/393573/fd/*': No such file or directory, whether or not the too many open files error is happening.
The version of SSLH I am using is sslh-ev head-2024-07-08, and the Makefile options are:

ENABLE_SANITIZER= # Enable ASAN/LSAN/UBSAN
ENABLE_REGEX=1  # Enable regex probes
USELIBCONFIG=1  # Use libconfig? (necessary to use configuration files)
USELIBWRAP?=1   # Use libwrap?
USELIBCAP=1     # Use libcap?
USESYSTEMD=1     # Make use of systemd socket activation
USELIBBSD?=     # Use libbsd (needed to update process name in `ps`)
COV_TEST=       # Perform test coverage?
PREFIX?=/usr
BINDIR?=$(PREFIX)/sbin
MANDIR?=$(PREFIX)/share/man/man8

gcc --version output: gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0

@ftasnetamot
Contributor

ftasnetamot commented Aug 25, 2024

Can you double-check and run the ls as root/sudo? There MUST be file handles available for each running process.
I found that there is some issue, not yet identified, with newer compiler toolchains that causes each running sslh process to have six additional file handles open.
You already answered my question with the sslh version, as it shows that you are running a self-compiled sslh.
When I compiled it under Ubuntu 24.04, I had this issue.
I bet you will see these file handles (maybe with other fd numbers) when the ls is run correctly:

l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/6 -> /lib
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/7 -> /usr/lib
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/8 -> /etc/ld.so.cache
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/9 -> /etc/hosts
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/10 -> /run/resolvconf/resolv.conf
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/11 -> /etc/nsswitch.conf

And that could explain your issue, if sslh has 10 or 11 handles open instead of 4 or 5!
I am trying to dig into the issue, but currently I am clueless. It must have something to do with DNS lookup.

If you see those handles, a possible workaround could be compiling sslh on a system with an older gcc toolchain. For me it worked, for example, under Debian bullseye with gcc (Debian 10.2.1-6) 10.2.1 20210110
Also OK: compiling under Ubuntu 22.04 (5.15.0-119-generic) with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

@dm9bbadd4
Author

Running sudo ls -lad /proc/[SSLH-PID]/fd/* still gives the same error. Running sudo ls -la /proc/1285/fd/ shows:

total 0
dr-x------ 2 sslh sslh 10 Aug 24 23:04 .
dr-xr-xr-x 9 sslh sslh  0 Aug 24 23:04 ..
lr-x------ 1 sslh sslh 64 Aug 24 23:04 0 -> /dev/null
lrwx------ 1 sslh sslh 64 Aug 24 23:04 1 -> 'socket:[13404]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 2 -> 'socket:[13404]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 3 -> 'socket:[10883]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 4 -> 'socket:[13568]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 5 -> 'anon_inode:[eventpoll]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 6 -> 'anon_inode:[eventfd]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 7 -> 'socket:[1763451]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 8 -> 'socket:[1763454]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 9 -> 'socket:[1460918]'

I don't have another system I could compile this on unfortunately. If you were to share your compiled version that might help but I don't know if that's allowed.
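
A likely reason the glob form fails even with sudo: the shell expands /proc/PID/fd/* before sudo runs, as the unprivileged user, who cannot read that directory (it is dr-x------ and owned by sslh in the listing above), so ls receives the literal *. Listing the directory from inside a privileged process avoids the glob altogether. A small Python sketch, where list_fds is an illustrative helper name:

```python
import os


def list_fds(fd_dir):
    """Map each descriptor number in a /proc/<pid>/fd directory to its target."""
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir() and readlink()
    return dict(sorted(targets.items()))
```

Run as root, list_fds("/proc/1285/fd") would return the same descriptor-to-target mapping that sudo ls -la printed above, in a form that is easy to count or diff between two points in time.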

@ftasnetamot
Contributor

I found the cause of the file handle problem I detected. It was a side effect of landlock, so unfortunately it is not your problem.
It just reminded me of your issue. And as you are using the -ev version, this effect would have impacted you far less than it impacted the -fork users.
Sorry that this does not help you, but it was still worth digging in here.

@dm9bbadd4
Author

If anyone else is experiencing this issue, my workaround is to restart sslh every day by editing the service file (Ubuntu) and setting:

[Service]
Restart=always
RuntimeMaxSec=1d
