lxcfs terminated with signal SIGSEGV #656

Open
4 tasks done
mchtech opened this issue Aug 12, 2024 · 2 comments
mchtech commented Aug 12, 2024

The template below is mostly useful for bug reports and support questions.
Feel free to remove anything which doesn't apply to you and add more information where it makes sense.

Required information

  • Distribution:
    • AlmaLinux 9.3
    • Rocky Linux 8.7
  • LXCFS version: 5.0.2 (fuse2)
  • The output of
    • uname -a
      • Linux *** 5.14.0-362.24.2.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Mar 30 14:11:54 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
      • Linux *** 5.16.11-1.el8.elrepo.x86_64 #1 SMP PREEMPT Tue Feb 22 10:29:18 EST 2022 x86_64 x86_64 x86_64 GNU/Linux
    • cat /proc/1/mounts
    • ps aux | grep lxcfs
      • /usr/local/bin/lxcfs /var/lib/lxcfs/ --enable-cfs --enable-pidfd --enable-loadavg
    • LXCFS logs
       '/lxcfs/fusermount' -> '/usr/local/bin/fusermount'
       mkdir: created directory '/usr/local/lib64/lxcfs'
       '/lxcfs/lxcfs' -> '/usr/local/bin/lxcfs'
       '/lxcfs/liblxcfs.so' -> '/usr/local/lib64/lxcfs/liblxcfs.so'
       '/lxcfs/libfuse.so.2.9.2' -> '/usr/lib64/libfuse.so.2.9.2'
       '/usr/lib64/libfuse.so.2' -> '/usr/lib64/libfuse.so.2.9.2'
       '/lxcfs/libulockmgr.so.1.0.1' -> '/usr/lib64/libulockmgr.so.1.0.1'
       '/usr/lib64/libulockmgr.so.1' -> '/usr/lib64/libulockmgr.so.1.0.1'
       Running constructor lxcfs_init to reload liblxcfs
       mount namespace: 5
       hierarchies:
         0: fd:   6: cpuset,cpu,io,memory,hugetlb,pids,rdma,misc
       Kernel supports pidfds
       Kernel does not support swap accounting
       api_extensions:
       - cgroups
       - sys_cpu_online
       - proc_cpuinfo
       - proc_diskstats
       - proc_loadavg
       - proc_meminfo
       - proc_stat
       - proc_swaps
       - proc_uptime
       - proc_slabinfo
       - shared_pidns
       - cpuview_daemon
       - loadavg_daemon
       - pidfds
       ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1765502/task"
       ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1771491/task"
       ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/2037031/task"
       ............................................................................
       ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1565585/task"
       ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1565600/task"
       

Issue description

lxcfs terminates with signal SIGSEGV after running for days to months. Across 7000+ nodes, 1-2 lxcfs instances crash per month.

case 1 (lxcfs-5.0.2 fuse2 on almalinux 9.3)

$ gdb /usr/local/bin/lxcfs /data0/lxcfs/core.lxcfs.0.87a60023a4904073b7dddd36c60387e7.20905.1723098716000000
GNU gdb (GDB) Red Hat Enterprise Linux 10.2-13.el9
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/bin/lxcfs...
[New LWP 714018]
[New LWP 20905]
[New LWP 21061]
[New LWP 714587]
[New LWP 714556]
[New LWP 728323]
[New LWP 728330]
[New LWP 728324]
[New LWP 728346]
[New LWP 728345]
[New LWP 728321]
[New LWP 728342]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/bin/lxcfs /var/lib/lxcfs/ --enable-cfs --enable-pidfd --enable-loada'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __vsnprintf_internal (string=0x0, maxlen=<optimized out>, format=0x7f6fb4dd02e8 "0-%d\n", args=args@entry=0x7f6f0a7fb850, mode_flags=mode_flags@entry=0) at vsnprintf.c:112
112	  string[0] = '\0';
[Current thread is 1 (Thread 0x7f6f0a7fc640 (LWP 714018))]
(gdb) set pagination off
(gdb) bt full
#0  __vsnprintf_internal (string=0x0, maxlen=<optimized out>, format=0x7f6fb4dd02e8 "0-%d\n", args=args@entry=0x7f6f0a7fb850, mode_flags=mode_flags@entry=0) at vsnprintf.c:112
        sf = {f = {_sbf = {_f = {_flags = -72515584, _IO_read_ptr = 0x0, _IO_read_end = 0x0, _IO_read_base = 0x0, _IO_write_base = 0x0, _IO_write_ptr = 0x0, _IO_write_end = 0x0, _IO_buf_base = 0x0, _IO_buf_end = 0x0, _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, _markers = 0x0, _chain = 0x0, _fileno = 0, _flags2 = 128, _old_offset = 8803452003510089728, _cur_column = 0, _vtable_offset = 127 '\177', _shortbuf = "\n", _lock = 0x0, _offset = 2, _codecvt = 0x0, _wide_data = 0xffffffffffffffff, _freeres_list = 0x0, _freeres_buf = 0x1999999999999999, __pad5 = 0, _mode = -1, _unused2 = "\000\000\000\000\300\"\000\\\000\000\000\000\340\062\000\\n\177\000"}, vtable = 0x7f6fb4bf72e0 <_IO_strn_jumps>}, _s = {_allocate_buffer_unused = 0x7f6f0a7fb7d0, _free_buffer_unused = 0x7f6f0a7fb808}}, overflow_buf = "0\274\177\no\177\000\000\030\270\177\n\002\000\000\000`\270\177\no\177\000\000-\365ڴo\177\000\000\060\f\000\\n\177\000\000\220\270\177\no\177\000\000\220\061\000\\n\177\000\000p\312\023\263\333U\000"}
        ret = <optimized out>
#1  0x00007f6fb4a6f596 in __GI___snprintf (s=<optimized out>, maxlen=<optimized out>, format=<optimized out>) at snprintf.c:31
        arg = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7f6f0a7fb930, reg_save_area = 0x7f6f0a7fb870}}
        done = <optimized out>
#2  0x00007f6fb4dc88dd in do_cpuset_read (cg=0x7f6e5c003190 "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2f8452ff_747f_4657_a644_08170f3c287e.slice/cri-containerd-e2cc84a07ba207bc2b8cb2f4ead9e4bfec10630d562418ffb9107c6eab7ef0ac.scope", buf=0x0, buflen=518) at ../src/sysfs_fuse.c:218
        cpuset = 0x7f6e5c026b00 "38-45,114-121"
        fc = 0x7f6e5c000bf0
        opts = 0x55dbb313c2a0
        max_cpus = 16
        total_len = 0
        use_view = true
        __func__ = "do_cpuset_read"
#3  0x00007f6fb4dc8b70 in sys_devices_system_cpu_online_read (buf=0x7f6e5c005130 "\020\b", size=8192, offset=0, fi=0x7f6f0a7fbc30) at ../src/sysfs_fuse.c:266
        cg = 0x7f6e5c003190 "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod2f8452ff_747f_4657_a644_08170f3c287e.slice/cri-containerd-e2cc84a07ba207bc2b8cb2f4ead9e4bfec10630d562418ffb9107c6eab7ef0ac.scope"
        fc = 0x7f6e5c000bf0
        d = 0x7f6ea4009cf0
        cache = 0x0
        initpid = 2014638
        total_len = 0
#4  0x00007f6fb4dca7cd in sys_read (path=0x7f6e5c000c30 "/sys/devices/system/cpu/online", buf=0x7f6e5c005130 "\020\b", size=8192, offset=0, fi=0x7f6f0a7fbc30) at ../src/sysfs_fuse.c:832
        f = 0x7f6ea4009cf0
#5  0x000055dbb1215fa0 in do_sys_read (path=0x7f6e5c000c30 "/sys/devices/system/cpu/online", buf=0x7f6e5c005130 "\020\b", size=8192, offset=0, fi=0x7f6f0a7fbc30) at ../src/lxcfs.c:288
        error = 0x0
        __sys_read = 0x7f6fb4dca73c <sys_read>
        __func__ = "do_sys_read"
#6  0x000055dbb12175a5 in lxcfs_read (path=0x7f6e5c000c30 "/sys/devices/system/cpu/online", buf=0x7f6e5c005130 "\020\b", size=8192, offset=0, fi=0x7f6f0a7fbc30) at ../src/lxcfs.c:846
        ret = 0
#7  0x00007f6fb4e0d537 in fuse_fs_read_buf (fs=0x55dbb313f970, path=0x7f6e5c000c30 "/sys/devices/system/cpu/online", bufp=bufp@entry=0x7f6f0a7fbb90, size=size@entry=8192, off=off@entry=0, fi=fi@entry=0x7f6f0a7fbc30) at fuse.c:1792
        buf = <optimized out>
        mem = <optimized out>
        res = <optimized out>
#8  0x00007f6fb4e0d712 in fuse_lib_read (req=0x7f6e5c001750, ino=14, size=8192, off=0, fi=0x7f6f0a7fbc30) at fuse.c:3250
        d = {id = 214748364809, cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 511101108348, __mutex = 0x5b0000006e, __nwaiters = 8, __broadcast_seq = 0}, __size = '\000' <repeats 24 times>, "|\000\000\000w\000\000\000n\000\000\000[\000\000\000\b\000\000\000\000\000\000", __align = 0}, finished = 676688}
        f = 0x55dbb313f810
        buf = 0x7f6e5c0017e0
        path = 0x7f6e5c000c30 "/sys/devices/system/cpu/online"
        res = <optimized out>
#9  0x00007f6fb4e160ce in do_read (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>) at fuse_lowlevel.c:1232
        fi = {flags = 32768, fh_old = 140113174633712, writepage = 0, direct_io = 0, keep_cache = 0, flush = 0, nonseekable = 0, flock_release = 0, padding = 0, fh = 140113174633712, lock_owner = 16163897521168180545}
        arg = <optimized out>
#10 0x00007f6fb4e16b6b in fuse_ll_process_buf (data=0x55dbb313fb00, buf=0x7f6f0a7fbe00, ch=<optimized out>) at fuse_lowlevel.c:2441
        f = 0x55dbb313fb00
        write_header_size = 80
        bufv = {count = 1, idx = 0, off = 0, buf = {{size = 64, flags = 0, mem = 0x7f6e74000ba0, fd = 0, pos = 0}}}
        tmpbuf = {count = 1, idx = 0, off = 0, buf = {{size = 80, flags = 0, mem = 0x0, fd = -1, pos = 0}}}
        in = 0x7f6e74000ba0
        inarg = 0x7f6e74000bc8
        req = <optimized out>
        mbuf = 0x0
        err = <optimized out>
        res = <optimized out>
#11 0x00007f6fb4e13401 in fuse_do_work (data=0x7f6e74000b60) at fuse_loop_mt.c:117
        isforget = 0
        ch = 0x55dbb313f570
        fbuf = {size = 64, flags = 0, mem = 0x7f6e74000ba0, fd = 0, pos = 0}
        res = <optimized out>
        w = 0x7f6e74000b60
        mt = 0x7ffd24fb6470
#12 0x00007f6fb4a9f802 in start_thread (arg=<optimized out>) at pthread_create.c:443
        ret = <optimized out>
        pd = <optimized out>
        out = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140114902632976, 7213687130270957455, 140114894243392, 0, 140117749134640, 0, -7295199422699569265, -7294912847722348657}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#13 0x00007f6fb4a3f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
(gdb) quit
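
In the case 1 trace, frame #2 shows do_cpuset_read called with buf=0x0 and buflen=518, and frame #0 faults on `string[0] = '\0'`. The C standard only allows a null destination for snprintf when the size is 0, so a null buffer with a non-zero length is expected to crash exactly like this. The following is a minimal sketch of a defensive check, not the lxcfs implementation: `format_cpu_range` and its `last_cpu` parameter are hypothetical names, and only the "0-%d\n" format string is taken from the trace. A guard like this would turn the crash into an error return; it would not explain why the buffer was null in the first place.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of lxcfs): write "0-<last_cpu>\n" into buf.
 * Returns the number of characters written (excluding the terminator),
 * or a negative errno value if the buffer is unusable or too small.
 */
static int format_cpu_range(char *buf, size_t buflen, int last_cpu)
{
    int n;

    /* snprintf(NULL, n, ...) is only valid when n == 0, so refuse early. */
    if (buf == NULL || buflen == 0 || last_cpu < 0)
        return -EINVAL;

    n = snprintf(buf, buflen, "0-%d\n", last_cpu);
    if (n < 0)
        return -EIO;
    if ((size_t)n >= buflen)
        return -ENOSPC; /* output was truncated */

    return n;
}
```

A caller would treat a negative return as a failed read and report it to FUSE instead of continuing with a bad buffer.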

case 2 (lxcfs-5.0.2 fuse2 on rockylinux 8.7)

$ gdb /usr/local/bin/lxcfs /data0/lxcfs/core.lxcfs.0.1d91adb8092f4bee91b7555e1497bbca.54606.1723315511000000
GNU gdb (GDB) Rocky Linux 8.2-20.el8.0.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/bin/lxcfs...done.
[New LWP 2259601]
[New LWP 54606]
[New LWP 54632]
[New LWP 2168244]
[New LWP 2259604]
[New LWP 2259612]
[New LWP 2259613]
[New LWP 2259616]
[New LWP 2259619]
[New LWP 2259625]
[New LWP 2259630]
[New LWP 2259638]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/bin/lxcfs /var/lib/lxcfs/ --enable-cfs --enable-pidfd --enable-loada'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  _IO_vsnprintf (string=0x0, maxlen=<optimized out>, format=0x7f9d8da425d3 "%s", args=args@entry=0x7f9cc3ffe710) at vsnprintf.c:112
112	  string[0] = '\0';
[Current thread is 1 (Thread 0x7f9cc3fff700 (LWP 2259601))]
(gdb) set pagination off
(gdb) bt full
#0  _IO_vsnprintf (string=0x0, maxlen=<optimized out>, format=0x7f9d8da425d3 "%s", args=args@entry=0x7f9cc3ffe710) at vsnprintf.c:112
        sf = {f = {_sbf = {_f = {_flags = -72515584, _IO_read_ptr = 0x0, _IO_read_end = 0x0, _IO_read_base = 0x0, _IO_write_base = 0x0, _IO_write_ptr = 0x0, _IO_write_end = 0x0, _IO_buf_base = 0x0, _IO_buf_end = 0x0, _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, _markers = 0x0, _chain = 0x0, _fileno = 0, _flags2 = 128, _old_offset = 140314649123093, _cur_column = 0, _vtable_offset = -1 '\377', _shortbuf = <incomplete sequence \303>, _lock = 0x0, _offset = 140312214077441, _codecvt = 0x7f9d8d8422ed <check_match+109>, _wide_data = 0xffffffffffffffff, _freeres_list = 0x0, _freeres_buf = 0x7f9cc3ffe704, __pad5 = 140314651861152, _mode = -1, _unused2 = "\000\000\000\000\220\r\000P\233\177\000\000\000\004\000\000\000\000\000"}, vtable = 0x7f9d8cfb20a0 <_IO_strn_jumps>}, _s = {_allocate_buffer_unused = 0x15, _free_buffer_unused = 0xf0000109}}, overflow_buf = "\001\000\000\000\000\000\000\000$\201", '\000' <repeats 31 times>, "\004", '\000' <repeats 14 times>, "\236U\263f\000\000\000"}
        ret = <optimized out>
#1  0x00007f9d8cc64ab3 in __GI___snprintf (s=<optimized out>, maxlen=<optimized out>, format=<optimized out>) at snprintf.c:33
        arg = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7f9cc3ffe7f0, reg_save_area = 0x7f9cc3ffe730}}
        done = <optimized out>
#2  0x00007f9d8da3d7be in read_file_fuse (path=0x7f9d8da412b9 "/proc/meminfo", buf=0x7f9b50006720 "MemTotal:       251396336 kB\nMemFree:        251344560 kB\nMemAvailable:   251347404 kB\nBuffers:", ' ' <repeats 15 times>, "0 kB\nCached:", ' ' <repeats 13 times>, "2844 kB\nSwapCached:", ' ' <repeats 12 times>, "0 kB\nActive:", ' ' <repeats 13 times>, "2864 kB\nI"..., size=8191, d=0x7f9cb002b010) at ../src/utils.c:312
        l = 140311279954032
        line = 0x7f9b50004770 "MemTotal:       263454960 kB\n"
        f = 0x7f9b50000d90
        linelen = 120
        total_len = 0
        cache = 0x0
        cache_size = 1942
        __func__ = "read_file_fuse"
#3  0x00007f9d8da3630b in proc_meminfo_read (buf=0x7f9b50006720 "MemTotal:       251396336 kB\nMemFree:        251344560 kB\nMemAvailable:   251347404 kB\nBuffers:", ' ' <repeats 15 times>, "0 kB\nCached:", ' ' <repeats 13 times>, "2844 kB\nSwapCached:", ' ' <repeats 12 times>, "0 kB\nActive:", ' ' <repeats 13 times>, "2864 kB\nI"..., size=8191, offset=0, fi=0x7f9cc3ffecd0) at ../src/proc_fuse.c:1244
        cgroup = 0x0
        line = 0x0
        memusage_str = 0x0
        memswusage_str = 0x0
        memswpriority_str = 0x0
        fopen_cache = 0x0
        f = 0x0
        fc = 0x7f9b50002ba0
        wants_swap = false
        d = 0x7f9cb002b010
        memlimit = 0
        memusage = 0
        hosttotal = 0
        swfree = 0
        swusage = 0
        swtotal = 0
        memswpriority = 1
        mstat = {hierarchical_memory_limit = 0, hierarchical_memsw_limit = 0, total_cache = 0, total_rss = 0, total_rss_huge = 0, total_shmem = 0, total_mapped_file = 0, total_dirty = 0, total_writeback = 0, total_swap = 0, total_pgpgin = 0, total_pgpgout = 0, total_pgfault = 0, total_pgmajfault = 0, total_inactive_anon = 0, total_active_anon = 0, total_inactive_file = 0, total_active_file = 0, total_unevictable = 0}
        linelen = 0
        total_len = 0
        cache = 0x0
        cache_size = 1942
        ret = 1667723888
        initpid = 2267284
        __func__ = "proc_meminfo_read"
#4  0x00007f9d8da378b2 in proc_read (path=0x7f9b50000c30 "/proc/meminfo", buf=0x7f9b50006720 "MemTotal:       251396336 kB\nMemFree:        251344560 kB\nMemAvailable:   251347404 kB\nBuffers:", ' ' <repeats 15 times>, "0 kB\nCached:", ' ' <repeats 13 times>, "2844 kB\nSwapCached:", ' ' <repeats 12 times>, "0 kB\nActive:", ' ' <repeats 13 times>, "2864 kB\nI"..., size=8191, offset=0, fi=0x7f9cc3ffecd0) at ../src/proc_fuse.c:1511
        f = 0x7f9cb002b010
#5  0x000056360c2f6ef7 in do_proc_read (path=0x7f9b50000c30 "/proc/meminfo", buf=0x7f9b50006720 "MemTotal:       251396336 kB\nMemFree:        251344560 kB\nMemAvailable:   251347404 kB\nBuffers:", ' ' <repeats 15 times>, "0 kB\nCached:", ' ' <repeats 13 times>, "2844 kB\nSwapCached:", ' ' <repeats 12 times>, "0 kB\nActive:", ' ' <repeats 13 times>, "2864 kB\nI"..., size=8191, offset=0, fi=0x7f9cc3ffecd0) at ../src/lxcfs.c:272
        error = 0x0
        __proc_read = 0x7f9d8da37836 <proc_read>
        __func__ = "do_proc_read"
#6  0x000056360c2f8558 in lxcfs_read (path=0x7f9b50000c30 "/proc/meminfo", buf=0x7f9b50006720 "MemTotal:       251396336 kB\nMemFree:        251344560 kB\nMemAvailable:   251347404 kB\nBuffers:", ' ' <repeats 15 times>, "0 kB\nCached:", ' ' <repeats 13 times>, "2844 kB\nSwapCached:", ' ' <repeats 12 times>, "0 kB\nActive:", ' ' <repeats 13 times>, "2864 kB\nI"..., size=8191, offset=0, fi=0x7f9cc3ffecd0) at ../src/lxcfs.c:839
        ret = 0
#7  0x00007f9d8d400537 in fuse_fs_read_buf (fs=0x56360df91aa0, path=0x7f9b50000c30 "/proc/meminfo", bufp=bufp@entry=0x7f9cc3ffec30, size=size@entry=8191, off=off@entry=0, fi=fi@entry=0x7f9cc3ffecd0) at fuse.c:1792
        buf = <optimized out>
        mem = <optimized out>
        res = <optimized out>
#8  0x00007f9d8d400712 in fuse_lib_read (req=0x7f9b50001220, ino=4, size=8191, off=0, fi=0x7f9cc3ffecd0) at fuse.c:3250
        d = {id = 18446744073709551536, cond = {__data = {__lock = 2, __futex = 0, __total_seq = 214748364809, __wakeup_seq = 532575944795, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = "\002\000\000\000\000\000\000\000\t\000\000\000\062\000\000\000[\000\000\000|", '\000' <repeats 26 times>, __align = 2}, finished = 119}
        f = 0x56360df91940
        buf = 0x7f9b500012b0
        path = 0x7f9b50000c30 "/proc/meminfo"
        res = <optimized out>
#9  0x00007f9d8d4090ce in do_read (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>) at fuse_lowlevel.c:1232
        fi = {flags = 32768, fh_old = 140310944591888, writepage = 0, direct_io = 0, keep_cache = 0, flush = 0, nonseekable = 0, flock_release = 0, padding = 0, fh = 140310944591888, lock_owner = 3740103517284196512}
        arg = <optimized out>
#10 0x00007f9d8d409b6b in fuse_ll_process_buf (data=0x56360df91c30, buf=0x7f9cc3ffeea0, ch=<optimized out>) at fuse_lowlevel.c:2441
        f = 0x56360df91c30
        bufv = {count = 1, idx = 0, off = 0, buf = {{size = 48, flags = (unknown: 0), mem = 0x7f9bf4000de0, fd = 0, pos = 0}}}
        tmpbuf = {count = 1, idx = 0, off = 0, buf = {{size = 80, flags = (unknown: 0), mem = 0x0, fd = -1, pos = 0}}}
        in = 0x7f9bf4000de0
        inarg = 0x7f9bf4000e08
        req = <optimized out>
        mbuf = 0x0
        err = <optimized out>
        res = <optimized out>
#11 0x00007f9d8d406401 in fuse_do_work (data=0x7f9ccc002b70) at fuse_loop_mt.c:117
        isforget = 0
        ch = 0x56360df916a0
        fbuf = {size = 48, flags = (unknown: 0), mem = 0x7f9bf4000de0, fd = 0, pos = 0}
        res = <optimized out>
        w = 0x7f9ccc002b70
        mt = 0x7ffe50b96280
#12 0x00007f9d8cfc31cf in start_thread (arg=<optimized out>) at pthread_create.c:479
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140311279957760, 7529743667701061605, 140314090134606, 140314090134607, 140314090134736, 140311279955904, -7546201157206435867, -7546876285721274395}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#13 0x00007f9d8cc2ee73 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
(gdb) quit

Steps to reproduce

No known reproducer; the crashes occur at random.

Information to attach

  • any relevant kernel output (dmesg)
# case 1
[Thu Aug  8 14:31:39 2024] lxcfs[714018]: segfault at 0 ip 00007f6fb4a94861 sp 00007f6f0a7fb6c0 error 6 in libc.so.6[7f6fb4a28000+175000] likely on CPU 129 (core 15, socket 1)
[Thu Aug  8 14:31:39 2024] Code: 00 00 4c 89 ef 4c 89 4c 24 08 e8 3a 51 00 00 48 89 e9 4c 89 e2 48 89 ee 48 8d 05 8a 2a 16 00 4c 89 ef 48 89 84 24 e8 00 00 00 <c6> 45 00 00 e8 f6 63 00 00 89 d9 4c 89 fa 4c 89 f6 4c 89 ef e8 96

# case 2
[Sun Aug 11 02:42:06 2024] lxcfs[2259601]: segfault at 0 ip 00007f9d8cc8531d sp 00007f9cc3ffe5a0 error 6 in libc-2.28.so[7f9d8cbf5000+1bc000]
[Sun Aug 11 02:42:06 2024] Code: ba ff ff ff ff be 00 80 00 00 e8 8e 53 00 00 48 89 df 48 89 e9 4c 89 e2 48 8d 05 8e cd 32 00 48 89 ee 48 89 84 24 d8 00 00 00 <c6> 45 00 00 e8 3a 68 00 00 48 89 df 4c 89 f2 4c 89 ee e8 9c 67 fd
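
The kernel lines above can be read mechanically. On x86, the low three bits of the number after "error" say whether the page was present, whether the access was a write, and whether it came from user mode. The instruction pointer minus the start of the mapping printed in brackets gives an offset inside that mapping. The sketch below decodes both; the struct and function names are made up for illustration. Note that the offset is relative to the mapping the kernel printed, which is not necessarily the load base of the library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of the x86 page-fault error code printed as "error N". */
struct pf_info {
    bool protection; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;      /* bit 1: 1 = write access, 0 = read access */
    bool user;       /* bit 2: 1 = fault happened in user mode */
};

static struct pf_info decode_pf_error(unsigned long err)
{
    struct pf_info info = {
        .protection = (err & 0x1) != 0,
        .write      = (err & 0x2) != 0,
        .user       = (err & 0x4) != 0,
    };
    return info;
}

/*
 * Offset of the faulting instruction inside the mapping printed as
 * "name[base+size]". Returns false if ip lies outside that mapping.
 */
static bool mapping_offset(uint64_t ip, uint64_t base, uint64_t size,
                           uint64_t *off)
{
    if (ip < base || ip - base >= size)
        return false;
    *off = ip - base;
    return true;
}
```

For cases 1 and 2, "error 6" decodes to a user-mode write to a page that is not present, and the fault address is 0, which matches the `string[0] = '\0'` store through a null pointer shown by gdb.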

mihalicyn self-assigned this on Aug 12, 2024

mchtech commented Aug 17, 2024

case 3 (lxcfs-5.0.2 fuse2 on almalinux 9.3)

$ gdb /usr/local/bin/lxcfs /data0/lxcfs/core.lxcfs.0.7e566a3e34d84d4095b98f420f9eb714.36559.1723806344000000
GNU gdb (GDB) Red Hat Enterprise Linux 10.2-13.el9
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/bin/lxcfs...
[New LWP 17329]
[New LWP 36559]
[New LWP 3542106]
[New LWP 36632]
[New LWP 3506561]
[New LWP 17320]
[New LWP 17332]
[New LWP 17322]
[New LWP 17338]
[New LWP 17343]
[New LWP 17348]
[New LWP 17356]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/bin/lxcfs /var/lib/lxcfs/ --enable-cfs --enable-pidfd --enable-loada'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f34c3caf91e in __GI___libc_free (mem=0x7f3338008) at malloc.c:3235
3235	  if (chunk_is_mmapped (p))                       /* release mmapped memory. */
[Current thread is 1 (Thread 0x7f34abfff640 (LWP 17329))]
(gdb) set pagination off
(gdb) bt full
#0  0x00007f34c3caf91e in __GI___libc_free (mem=0x7f3338008) at malloc.c:3235
        ar_ptr = <optimized out>
        p = <optimized out>
        err = <optimized out>
#1  0x00007f34c4384e59 in do_release_file_info (fi=0x7f34abffec30) at ../src/utils.c:154
        f = 0x7f3338008630
#2  0x00007f34c437a2d7 in proc_release (path=0x7f331c000bf0 "/proc/loadavg", fi=0x7f34abffec30) at ../src/proc_fuse.c:192
No locals.
#3  0x000055bea1eb2b1b in do_proc_release (path=0x7f331c000bf0 "/proc/loadavg", fi=0x7f34abffec30) at ../src/lxcfs.c:564
        error = 0x0
        __proc_release = 0x7f34c437a2bb <proc_release>
        __func__ = "do_proc_release"
#4  0x000055bea1eb3771 in lxcfs_release (path=0x7f331c000bf0 "/proc/loadavg", fi=0x7f34abffec30) at ../src/lxcfs.c:908
        ret = 32563
#5  0x00007f34c400ccb2 in fuse_do_release (f=f@entry=0x55bea3cdc810, ino=ino@entry=4, path=0x7f331c000bf0 "/proc/loadavg", fi=fi@entry=0x7f34abffec30) at fuse.c:3086
        node = <optimized out>
        unlink_hidden = 0
        compatpath = <optimized out>
        __PRETTY_FUNCTION__ = "fuse_do_release"
#6  0x00007f34c400f5a3 in fuse_lib_release (req=0x7f331c002ae0, ino=4, fi=0x7f34abffec30) at fuse.c:3874
        f = 0x55bea3cdc810
        d = {id = 0, cond = {__data = {__lock = 0, __futex = 0, __total_seq = 511101108348, __wakeup_seq = 390842024046, __woken_seq = 8, __mutex = 0xc2f10, __nwaiters = 469762080, __broadcast_seq = 32563}, __size = "\000\000\000\000\000\000\000\000|\000\000\000w\000\000\000n\000\000\000[\000\000\000\b\000\000\000\000\000\000\000\020/\f\000\000\000\000\000 \000\000\034\063\177\000", __align = 0}, finished = 128}
        path = 0x7f331c000bf0 "/proc/loadavg"
        err = <optimized out>
#7  0x00007f34c4015f44 in do_release (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>) at fuse_lowlevel.c:1345
        arg = <optimized out>
        fi = {flags = 32768, fh_old = 139857959618096, writepage = 0, direct_io = 0, keep_cache = 0, flush = 0, nonseekable = 0, flock_release = 0, padding = 0, fh = 0, lock_owner = 0}
#8  0x00007f34c4016b6b in fuse_ll_process_buf (data=0x55bea3cdcb00, buf=0x7f34abffee00, ch=<optimized out>) at fuse_lowlevel.c:2441
        f = 0x55bea3cdcb00
        write_header_size = 80
        bufv = {count = 1, idx = 0, off = 0, buf = {{size = 64, flags = 0, mem = 0x7f33f0000c30, fd = 0, pos = 0}}}
        tmpbuf = {count = 1, idx = 0, off = 0, buf = {{size = 80, flags = 0, mem = 0x0, fd = -1, pos = 0}}}
        in = 0x7f33f0000c30
        inarg = 0x7f33f0000c58
        req = <optimized out>
        mbuf = 0x0
        err = <optimized out>
        res = <optimized out>
#9  0x00007f34c4013401 in fuse_do_work (data=0x7f33f0023fe0) at fuse_loop_mt.c:117
        isforget = 0
        ch = 0x55bea3cdc570
        fbuf = {size = 64, flags = 0, mem = 0x7f33f0000c30, fd = 0, pos = 0}
        res = <optimized out>
        w = 0x7f33f0023fe0
        mt = 0x7ffdf572e990
#10 0x00007f34c3c9f802 in start_thread (arg=<optimized out>) at pthread_create.c:443
        ret = <optimized out>
        pd = <optimized out>
        out = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139864548805136, -713775302213435514, 139864200705600, 0, 139864599819568, 0, 608406904453922694, 608459526120079238}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#11 0x00007f34c3c3f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
(gdb) quit
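
Cases 3 and 4 both fail inside do_release_file_info. One detail in the case 3 trace may be worth noting: the pointer passed to free (0x7f3338008) equals f (0x7f3338008630) shifted right by 12 bits. That is the value glibc's safe-linking stores in a freed chunk whose next pointer is NULL, so it would be consistent with f having already been freed once, although the trace alone does not prove it. The sketch below shows one common way to make a release path idempotent by clearing the owner's handle. All names are hypothetical and this is not lxcfs code. It only protects against a second release through the same handle; a race between threads would still need locking or reference counting.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-open state; the fields are illustrative only. */
struct open_state {
    char *buf;
    size_t size;
};

/* Allocate a state with a zeroed buffer of the given size. */
static struct open_state *open_state_new(size_t size)
{
    struct open_state *s = calloc(1, sizeof(*s));

    if (s == NULL)
        return NULL;

    s->buf = calloc(1, size);
    if (s->buf == NULL) {
        free(s);
        return NULL;
    }
    s->size = size;
    return s;
}

/*
 * Free the state and clear the caller's handle.
 * Returns true if something was freed, false if the handle was already empty.
 */
static bool open_state_release(struct open_state **handle)
{
    struct open_state *s;

    if (handle == NULL || *handle == NULL)
        return false;

    s = *handle;
    *handle = NULL; /* a later release through this handle becomes a no-op */
    free(s->buf);
    free(s);
    return true;
}
```

With this shape, calling the release function twice on the same handle frees the memory once and returns false the second time.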
  • dmesg
Aug 16 19:05:44 kernel: lxcfs[17329]: segfault at 7f3338000 ip 00007f34c3caf91e sp 00007f34abffea80 error 4 in libc.so.6[7f34c3c28000+175000] likely on CPU 90 (core 26, socket 0)
Aug 16 19:05:44 kernel: Code: f9 ff 48 89 fd e9 b7 fd ff ff 66 90 f3 0f 1e fa 48 85 ff 0f 84 9b 00 00 00 55 48 8d 77 f0 53 48 83 ec 18 48 8b 1d e2 a4 14 00 <48> 8b 47 f8 64 8b 2b a8 02 75 37 48 8b 15 68 a4 14 00 64 48 83 3a
  • logs

    '/lxcfs/fusermount' -> '/usr/local/bin/fusermount'
    mkdir: created directory '/usr/local/lib64/lxcfs'
    '/lxcfs/lxcfs' -> '/usr/local/bin/lxcfs'
    '/lxcfs/liblxcfs.so' -> '/usr/local/lib64/lxcfs/liblxcfs.so'
    '/lxcfs/libfuse.so.2.9.2' -> '/usr/lib64/libfuse.so.2.9.2'
    '/usr/lib64/libfuse.so.2' -> '/usr/lib64/libfuse.so.2.9.2'
    '/lxcfs/libulockmgr.so.1.0.1' -> '/usr/lib64/libulockmgr.so.1.0.1'
    '/usr/lib64/libulockmgr.so.1' -> '/usr/lib64/libulockmgr.so.1.0.1'
    Running constructor lxcfs_init to reload liblxcfs
    mount namespace: 5
    hierarchies:
      0: fd:   6: cpuset,cpu,io,memory,hugetlb,pids,rdma,misc
    Kernel supports pidfds
    Kernel does not support swap accounting
    api_extensions:
    - cgroups
    - sys_cpu_online
    - proc_cpuinfo
    - proc_diskstats
    - proc_loadavg
    - proc_meminfo
    - proc_stat
    - proc_swaps
    - proc_uptime
    - proc_slabinfo
    - shared_pidns
    - cpuview_daemon
    - loadavg_daemon
    - pidfds
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1063520/task"
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1748534/task"
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1752991/task"
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/1783805/task"
    ......
    
  • coredump
    core.lxcfs.0.7e566a3e34d84d4095b98f420f9eb714.36559.1723806344000000.zst.gz


mchtech commented Sep 27, 2024

case 4: double free or corruption (fasttop) (lxcfs-5.0.2 fuse2 on rockylinux 8.7)

$ gdb /usr/local/bin/lxcfs /data0/lxcfs/core.lxcfs.0.0cba0bdc82154b01a3a094f3bf35779f.41912.1727414072000000
GNU gdb (GDB) Rocky Linux 8.2-20.el8.0.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/bin/lxcfs...done.
[New LWP 4016582]
[New LWP 41966]
[New LWP 41912]
[New LWP 3877292]
[New LWP 4016484]
[New LWP 4016494]
[New LWP 4016502]
[New LWP 4016504]
[New LWP 4016555]
[New LWP 4016561]
[New LWP 4016578]
[New LWP 4016583]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/bin/lxcfs /var/lib/lxcfs/ --enable-cfs --enable-pidfd --enable-loada'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	  return ret;
[Current thread is 1 (Thread 0x7f4d057fa700 (LWP 4016582))]
(gdb)
(gdb) set pagination off
(gdb)
(gdb) bt full
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
        set = {__val = {16391, 7, 738226624, 0, 139972911661057, 3664408601, 139968781457640, 0, 94304808032056, 94304808031136, 94304804086150, 139973976158337, 0, 0, 2, 0}}
        pid = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
#1  0x00007f4e3a5daea5 in __GI_abort () at abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {139971709278080, 139973978186427, 1752421524623658752, 0, 139971709251776, 0, 0, 139971709246880, 1752421524623658752, 0, 139968781458256, 139968781458192, 139968781458184, 139973978385280, 139968781457840, 139968781458096}}, sa_flags = 4096, sa_restorer = 0x7f4d057f99b0}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007f4e3a64a097 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f4e3a7431f0 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
        ap = {{gp_offset = 24, fp_offset = 0, overflow_arg_area = 0x7f4d057f9ac0, reg_save_area = 0x7f4d057f9a50}}
        fd = <optimized out>
        list = <optimized out>
        nlist = <optimized out>
        cp = <optimized out>
        written = <optimized out>
#3  0x00007f4e3a6514ec in malloc_printerr (str=str@entry=0x7f4e3a744e38 "double free or corruption (fasttop)") at malloc.c:5375
No locals.
#4  0x00007f4e3a652f34 in _int_free (av=0x7f4db4000020, p=0x7f4db40249b0, have_lock=<optimized out>) at malloc.c:4279
        idx = 2
        old = <optimized out>
        old2 = <optimized out>
        size = <optimized out>
        fb = 0x7f4db4000040
        nextchunk = <optimized out>
        nextsize = <optimized out>
        nextinuse = <optimized out>
        prevsize = <optimized out>
        bck = <optimized out>
        fwd = <optimized out>
        __PRETTY_FUNCTION__ = "_int_free"
#5  0x00007f4e3b400ec4 in do_release_file_info (fi=0x7f4d057f9cd0) at ../src/utils.c:158
        f = 0x7f4db40249c0
#6  0x00007f4e3b3f62d7 in proc_release (path=0x7f4db4021da0 "/proc/stat", fi=0x7f4d057f9cd0) at ../src/proc_fuse.c:192
No locals.
#7  0x000055c50c58fb1b in do_proc_release (path=0x7f4db4021da0 "/proc/stat", fi=0x7f4d057f9cd0) at ../src/lxcfs.c:564
        error = 0x0
        __proc_release = 0x7f4e3b3f62bb <proc_release>
        __func__ = "do_proc_release"
#8  0x000055c50c590771 in lxcfs_release (path=0x7f4db4021da0 "/proc/stat", fi=0x7f4d057f9cd0) at ../src/lxcfs.c:908
        ret = 32589
#9  0x00007f4e3adc3cb2 in fuse_do_release (f=f@entry=0x55c50c958810, ino=ino@entry=6, path=0x7f4db4021da0 "/proc/stat", fi=fi@entry=0x7f4d057f9cd0) at fuse.c:3086
        node = <optimized out>
        unlink_hidden = 0
        compatpath = <optimized out>
        __PRETTY_FUNCTION__ = "fuse_do_release"
#10 0x00007f4e3adc65a3 in fuse_lib_release (req=0x7f4db4028c90, ino=6, fi=0x7f4d057f9cd0) at fuse.c:3874
        f = 0x55c50c958810
        d = {id = 214748364809, cond = {__data = {__lock = 91, __futex = 124, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x6e00000077, __nwaiters = 3020065936, __broadcast_seq = 32589}, __size = "[\000\000\000|", '\000' <repeats 27 times>, "w\000\000\000n\000\000\000\220\214\002\264M\177\000", __align = 532575944795}, finished = -1275068384}
        path = 0x7f4db4021da0 "/proc/stat"
        err = <optimized out>
#11 0x00007f4e3adccf44 in do_release (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>) at fuse_lowlevel.c:1345
        arg = <optimized out>
        fi = {flags = 32768, fh_old = 139971709258176, writepage = 0, direct_io = 0, keep_cache = 0, flush = 0, nonseekable = 0, flock_release = 0, padding = 0, fh = 0, lock_owner = 0}
#12 0x00007f4e3adcdb6b in fuse_ll_process_buf (data=0x55c50c958b00, buf=0x7f4d057f9ea0, ch=<optimized out>) at fuse_lowlevel.c:2441
        f = 0x55c50c958b00
        bufv = {count = 1, idx = 0, off = 0, buf = {{size = 48, flags = (unknown: 0), mem = 0x7f4d7c000ba0, fd = 0, pos = 0}}}
        tmpbuf = {count = 1, idx = 0, off = 0, buf = {{size = 80, flags = (unknown: 0), mem = 0x0, fd = -1, pos = 0}}}
        in = 0x7f4d7c000ba0
        inarg = 0x7f4d7c000bc8
        req = <optimized out>
        mbuf = 0x0
        err = <optimized out>
        res = <optimized out>
#13 0x00007f4e3adca401 in fuse_do_work (data=0x7f4d7c000b60) at fuse_loop_mt.c:117
        isforget = 0
        ch = 0x55c50c958570
        fbuf = {size = 48, flags = (unknown: 0), mem = 0x7f4d7c000ba0, fd = 0, pos = 0}
        res = <optimized out>
        w = 0x7f4d7c000b60
        mt = 0x7ffff70af460
#14 0x00007f4e3a9871cf in start_thread (arg=<optimized out>) at pthread_create.c:479
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139968781461248, 6912704723403165139, 139968789851214, 139968789851215, 139968789851344, 139968781459392, -6812584297290248749, -6813197044643115565}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#15 0x00007f4e3a5f2e73 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
(gdb) quit
  • logs

    '/lxcfs/fusermount' -> '/usr/local/bin/fusermount'
    mkdir: created directory '/usr/local/lib64/lxcfs'
    '/lxcfs/lxcfs' -> '/usr/local/bin/lxcfs'
    '/lxcfs/liblxcfs.so' -> '/usr/local/lib64/lxcfs/liblxcfs.so'
    '/lxcfs/libfuse.so.2.9.2' -> '/usr/lib64/libfuse.so.2.9.2'
    '/usr/lib64/libfuse.so.2' -> '/usr/lib64/libfuse.so.2.9.2'
    '/lxcfs/libulockmgr.so.1.0.1' -> '/usr/lib64/libulockmgr.so.1.0.1'
    '/usr/lib64/libulockmgr.so.1' -> '/usr/lib64/libulockmgr.so.1.0.1'
    Running constructor lxcfs_init to reload liblxcfs
    mount namespace: 5
    hierarchies:
      0: fd:   6: cpuset,cpu,io,memory,hugetlb,pids,rdma,misc
    Kernel supports pidfds
    Kernel does not support swap accounting
    api_extensions:
    - cgroups
    - sys_cpu_online
    - proc_cpuinfo
    - proc_diskstats
    - proc_loadavg
    - proc_meminfo
    - proc_stat
    - proc_swaps
    - proc_uptime
    - proc_slabinfo
    - shared_pidns
    - cpuview_daemon
    - loadavg_daemon
    - pidfds
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/422149/task"
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/422150/task"
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/422151/task"
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/422152/task"
    .......................
    ../src/proc_loadavg.c: 388: refresh_load: Failed to open "/proc/401470/task"
    double free or corruption (fasttop)
    
  • coredump
    lxcfs.0.1d91adb8092f4bee91b7555e1497bbca.54606.1723315511000000.lz4.gz
