Tdarr ffmpeg stack trace #1115

Open
marshalleq opened this issue Nov 5, 2024 · 7 comments

@marshalleq

Please put plugin requests/bugs at: https://github.com/HaveAGitGat/Tdarr_Plugins

I have been trying to pinpoint some system issues lately (running on TrueNAS Electric Eel) and have just seen tdarr-ffmpeg core dumps in dmesg, which brought me here. The system has 128GB RAM and swappiness is set to 1.

`2024 Nov 6 01:23:23 Skywalker Process 591446 (tdarr-ffmpeg) of user 568 dumped core.

Stack trace of thread 779:
#0 0x00007f8df1891014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)
#1 0x00007f8df1893663 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x95663)
#2 0x00007f8df187dc0d n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7fc0d)
#3 0x00007f8df187077a n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7277a)
ELF object binary architecture: AMD x86-64`

The system is a Threadripper 2950X, which has served me well with Tdarr for a long time. I saw this happening a while back and assumed it was because I'd switched my encoding node from Unraid to openSUSE. However, I don't think that's the cause; I think it is either something about later versions of Tdarr, or it always happened and I never noticed. The point is that this isn't exactly new, though it may be fairly recent. If there are any other logs you'd like to see, please let me know.

I also received this alert overnight. It seems to refer to the dataset that holds the core dumps, though I haven't figured out how to access it yet.

Quota exceeded on dataset ssd1pool/.system/cores. Used 79.78% (816.96 MiB of 1 GiB)..

To Reproduce
Queue up some files and encode.

Please provide the following information:
Sorry, I'm running out to work in a moment and just wanted to get this started. I can add all of this later if it's needed.

  • Config files [can be found in /app/configs/ when using Docker or in the /configs folder next to Tdarr_Updater if not using Docker]

  • Job reports: https://docs.tdarr.io/docs/other/job-reports

  • Log files [can be found in /app/logs/ when using Docker or in the /logs folder next to Tdarr_Updater if not using Docker]

  • Worker error [can be found on the 'Tdarr' tab by pressing the 'i' button on a failed item in the staged file section or in the transcode error section at the bottom]

There are a lot of these:

2024-11-06T07:59:07.343Z zfMnvs3mEfo:Node[Obiwan Node 1]:Worker[false-frog]:[Step W07] [C1] Worker [-error-]
2024-11-06T07:59:07.344Z zfMnvs3mEfo:Node[Obiwan Node 1]:Worker[false-frog]:Subworker killed
2024-11-06T07:59:07.344Z zfMnvs3mEfo:Node[Obiwan Node 1]:Worker[false-frog]:[-error-]
2024-11-06T07:59:07.344Z zfMnvs3mEfo:Node[Obiwan Node 1]:Worker[false-frog]:Subworker exited null
2024-11-06T07:59:09.386Z zfMnvs3mEfo:Node[Obiwan Node 1]:Worker[false-frog]:[2/2] Delete success
2024-11-06T07:59:09.386Z zfMnvs3mEfo:Node[Obiwan Node 1]:Worker[false-frog]:Updating transcode stats

TrueNAS Electric Eel
Safari latest

@marshalleq
Author

marshalleq commented Nov 10, 2024

Just adding that I also get the following in the host dmesg:

`2024 Nov 10 12:08:58 Skywalker Process 1644959 (tdarr-ffmpeg) of user 568 dumped core.

Stack trace of thread 1789:
#0 0x00007fcfebc62014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)
#1 0x00007fcfebc64663 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x95663)
#2 0x00007fcfebc4ec0d n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7fc0d)
#3 0x00007fcfebc4177a n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7277a)
ELF object binary architecture: AMD x86-64

2024 Nov 10 12:41:54 Skywalker Process 1671381 (tdarr-ffmpeg) of user 568 dumped core.

Stack trace of thread 2169:
#0 0x00007fa0734d1014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)
#1 0x00007fa0734d3663 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x95663)
#2 0x00007fa0734bdc0d n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7fc0d)
#3 0x00007fa0734b077a n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7277a)
ELF object binary architecture: AMD x86-64`
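
One way to check whether the traces quoted so far in this thread are the same crash is to compare the per-library offsets (the value after `+`) rather than the absolute addresses, which differ between processes. The following is a minimal Python sketch of that comparison. The regular expression, the `frame_signature` helper and the `TRACES` dictionary are illustrative names that do not come from Tdarr or systemd, and the pattern is only assumed to fit the frame format shown above.

```python
import re
from collections import Counter

# Assumed frame format, taken from the lines quoted in this thread, e.g.
# "#0 0x00007f8df1891014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)"
FRAME_RE = re.compile(
    r"#\d+\s+0x[0-9a-fA-F]+\s+\S+\s+"
    r"\((?P<module>\S+) \+ (?P<offset>0x[0-9a-fA-F]+)\)"
)


def frame_signature(trace_text):
    """Return the ordered (module, offset) pairs for every frame in one trace."""
    return tuple(
        (m.group("module"), m.group("offset"))
        for m in FRAME_RE.finditer(trace_text)
    )


# The three libx265 traces posted above, copied verbatim.
TRACES = {
    "Nov 6, thread 779": """
#0 0x00007f8df1891014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)
#1 0x00007f8df1893663 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x95663)
#2 0x00007f8df187dc0d n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7fc0d)
#3 0x00007f8df187077a n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7277a)
""",
    "Nov 10, thread 1789": """
#0 0x00007fcfebc62014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)
#1 0x00007fcfebc64663 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x95663)
#2 0x00007fcfebc4ec0d n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7fc0d)
#3 0x00007fcfebc4177a n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7277a)
""",
    "Nov 10, thread 2169": """
#0 0x00007fa0734d1014 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x93014)
#1 0x00007fa0734d3663 n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x95663)
#2 0x00007fa0734bdc0d n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7fc0d)
#3 0x00007fa0734b077a n/a (/usr/lib/x86_64-linux-gnu/libx265.so.199 + 0x7277a)
""",
}

# Count how many traces share each signature.
counts = Counter(frame_signature(text) for text in TRACES.values())

for signature, n in counts.items():
    frames = " <- ".join(
        f"{module.rsplit('/', 1)[-1]}+{offset}" for module, offset in signature
    )
    print(f"{n} trace(s): {frames}")
```

For the three traces above this yields a single signature, i.e. every crash went through the same four offsets inside libx265.so.199. That suggests one repeatable fault rather than random corruption, although it does not by itself identify the cause.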

@billy-syrett

billy-syrett commented Nov 28, 2024

I have almost the same issue. I also just moved from Unraid to TrueNAS Electric Eel (24.10.0.2).

2024 Nov 28 21:57:24 truenas Process 344775 (tdarr-ffmpeg) of user 568 dumped core.

Stack trace of thread 1178:
#0  0x00007fcddc5db9fc n/a (/usr/lib/x86_64-linux-gnu/libc.so.6 + 0x969fc)
ELF object binary architecture: AMD x86-64

This seems to happen as soon as a file begins to get processed.

Also very occasionally, I've spotted these messages:

2024 Nov 28 21:25:17 truenas Process 217787 (tdarr-ffmpeg) of user 568 terminated abnormally without generating a coredump.
2024 Nov 28 21:25:17 truenas Process 217660 (tdarr-ffmpeg) of user 568 terminated abnormally without generating a coredump.
2024 Nov 28 21:25:17 truenas Process 29700 (Tdarr_Node) of user 568 terminated abnormally without generating a coredump.
2024 Nov 28 21:25:20 truenas Process 218478 (tdarr-ffmpeg) of user 568 terminated abnormally without generating a coredump.
2024 Nov 28 21:25:27 truenas Process 219347 (tdarr-ffmpeg) of user 568 dumped core.

@billy-syrett

@marshalleq I've found the cause of mine, might be the same thing that's causing yours.

I found that removing this plugin from the flow made the error go away:

| Field | Value |
| --- | --- |
| Source | Community |
| Type | checkNodeHardwareEncoder |
| Version | 1.0.0 |
| Name | Check Node Hardware Encoder |

If you're using that plugin, it may be worth removing it and seeing if you still get the same errors.

@marshalleq
Author

Sadly no, I don't use hardware encoding. Perhaps there's another plugin though. Thanks for the suggestion.

@marshalleq
Author

Actually, are either of you finding other system instability? I'm pretty convinced this is causing other apps to go haywire. I only re-enabled Tdarr today and I've already had my VM on TrueNAS crash, Docker apps stop working, etc. That's the second time today. The system doesn't shut down properly either.

@billy-syrett

I'm actually just one person 😋

Honestly the system feels pretty stable. I only noticed these errors because I had a terminal left on that was ssh'd onto TrueNAS with root and they were coming through on there - I wouldn't have noticed otherwise. All other apps seem fine, system shuts down fine and I don't have any VMs running. I only finished migrating everything to TrueNAS a couple of days ago though so if I notice anything, I'll update here.

Re. the plugins: to find my problematic one, I disabled all libraries except one, reduced the transcoding on the node down to just 1, and edited the flow to add "Require Review" between every step. With a terminal ssh'd onto TrueNAS as root, I then stepped through the flow by clicking "Reviewed" in the Staging Section until the errors appeared. I then checked the report to see which plugin had executed last. Bit of a pain, but it did narrow it down.
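
When stepping through a flow like this, a small filter can stand in for watching the root terminal by eye. Below is a rough Python sketch that recognises the two kinds of crash message quoted in this thread. `CRASH_RE`, `parse_crash` and `watch` are hypothetical names, and `watch` assumes the same messages also reach the systemd journal so that `journalctl --follow` can see them, which has not been confirmed for TrueNAS here.

```python
import re
import subprocess

# Assumed message format, taken from the lines quoted in this thread, e.g.
# "2024 Nov 28 21:57:24 truenas Process 344775 (tdarr-ffmpeg) of user 568 dumped core."
CRASH_RE = re.compile(
    r"Process (?P<pid>\d+) \((?P<name>[^)]+)\) of user (?P<uid>\d+) "
    r"(?P<outcome>dumped core|terminated abnormally without generating a coredump)"
)


def parse_crash(line):
    """Return a dict describing the crash, or None if the line is not a crash message."""
    match = CRASH_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "uid": int(match.group("uid")),
        "core_dumped": match.group("outcome") == "dumped core",
    }


def watch(process_name="tdarr-ffmpeg",
          command=("journalctl", "--follow", "--lines=0")):
    """Print new crash messages for process_name as they arrive.

    Assumption: the messages are written to the systemd journal on this host.
    Runs until interrupted (Ctrl+C).
    """
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            crash = parse_crash(line)
            if crash is not None and crash["name"] == process_name:
                print(line.rstrip())
```

Calling `watch()` in one terminal while clicking "Reviewed" for each step would show which step is followed by a crash line. `parse_crash` can also be run over saved log text to count how many of the terminations actually produced a core dump.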

@marshalleq
Author

Nice process! I guess these instability issues are unrelated then. Just started today. Thank goodness for ZFS, I would be screwed without it.
