Should worker process exit when the worker commits seppuku? #501
What version of worker?
Can you reproduce the issue using Node rather than bun?
I haven't tried yet, or done much debugging beyond confirming that the process is still around. I can try to put together a minimal repro at some point. For clarification: is the process supposed to die after the seppuku message? Thanks for the quick help (and this amazing tool)!
It doesn't explicitly exit the process; it just stops running the worker. That should mean everything then shuts down relatively cleanly, and then the process should exit.
I just had a similar issue with version 0.16.5. I had 2 scheduled tasks configured. Is there any way to ensure that a job failing doesn't prevent it from running at the next scheduled time?
@JosephHalter can you provide a reproduction? |
It's not 100% clear how to handle this. A force exit would kill all concurrently working jobs, which isn't ideal. A graceful shutdown might take forever. Spinning up a new worker to replace the faulty one may cause a thundering herd problem. Here's my first punt at it:
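This is not the actual pull request, but the trade-off described above ("graceful shutdown might take forever" vs. "force exit") could be sketched as a race between a graceful-shutdown promise and a configurable deadline. `shutdownWithDeadline` is a hypothetical helper name, not graphile-worker API; a deadline of `0` opts out of force-killing entirely, as suggested later in the thread.

```javascript
// Sketch (assumption, not graphile-worker code): wait for a graceful
// shutdown up to `deadlineMs`, then force-exit. A deadline of 0 disables
// the forceful path for users who don't want that behaviour at all.
function shutdownWithDeadline(graceful, deadlineMs, forceExit) {
  if (deadlineMs === 0) {
    // Opt-out: never force-kill; just wait for graceful completion.
    return graceful;
  }
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => {
      forceExit(); // e.g. process.exit(1) in a real worker process
      resolve("forced");
    }, deadlineMs);
  });
  return Promise.race([
    graceful.then((value) => {
      clearTimeout(timer); // graceful shutdown won; cancel the force-exit
      return value;
    }),
    deadline,
  ]);
}
```

For example, `shutdownWithDeadline(runnerStopped, 30_000, () => process.exit(1))` would give in-flight jobs 30 seconds to finish before the process is killed and the supervisor restarts it.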
@benjie Sadly it's pretty hard to reproduce. I only have the logs to show that the tasks failed and seppuku happened, but I don't know exactly how it failed. It's also in a private repo that I can't share.

Failure in itself can happen and isn't a big issue; the problem was that the jobs stopped being scheduled entirely. If it was in the system crontab, new jobs would have been spawned no matter how the previous ones failed. Indeed it's possible to have herd issues if you keep spawning new jobs, but that would have been easier to notice.

I have no problem with force-killing concurrently working jobs, although, like you suggested in the comments of your pull request, going for a graceful exit and only forcefully killing after a delay would be best. Even better if the delay is configurable, and if you set it to 0 it doesn't forcefully kill the workers, for people who don't want the behavior at all. Thank you for taking this issue seriously and immediately creating a pull request 👍
Summary
I have a single graphile-worker process that processes a small queue of mostly low priority web scraper stuff with occasional high priority items (emails).
I'm seeing my worker die without the process exiting. There's a supervisor that is supposed to restart it if it fails, but the process never exits, so the supervisor never gets triggered. I can program around this by parsing the log output (e.g. look for "seppuku" and restart), but I'm guessing I'm just missing something in my setup. Is there a setting that will make the process exit when it commits seppuku?
Additional context
Node v20.15.1
Here's the script I run on the server to start the worker:
bunx --bun graphile-worker --cleanup DELETE_PERMAFAILED_JOBS,GC_TASK_IDENTIFIERS,GC_JOB_QUEUES && bunx --bun graphile-worker --no-prepared-statements
Here's an example of the error I see when it stops processing without exiting the process: