-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify rpcq:start-server
to support graceful shutdown of worker threads
#75
Comments
See also ecpeterson's comment regarding possibly trickiness of handling the |
Merged
appleby
added a commit
that referenced
this issue
Jan 14, 2020
Add a (SLEEP 1) after every BT:DESTROY-THREAD call. This is a work-around that enables switching over to EC2 runners for the gitlab CI. Something about the EC2 environment was tickling a race in our combination of using BT:DESTROY-THREAD + our reliance on PZMQ's implicit global *DEFAULT-CONTEXT*. Previously, each test called BT:DESTROY-THREAD to terminate it's SERVER-THREAD, and when that sever-thread unwinds, it would terminate each of of it's worker threads, then terminate the (implicit global) PZQM:*DEFAULT-CONTEXT*. If teardown of the PZMQ/ZMQ context has not fully completed by the time the next test runs, we might be in a situation where PQZM:*DEFAULT-CONTEXT* is non-NIL, but in some zombie half-dead state, yet the next test case will attempt to re-use it. A proper fix for this likely entails some combination of 1) Explicitly creating our own PZMQ:CONTEXT in RPCQ:START-SERVER and passing that down to worker threads. 2) Adding a graceful shutdown option to the RPCQ server to avoid the need to BT:DESTROY-THREAD (see #75).
appleby
added a commit
that referenced
this issue
Jan 14, 2020
Add a (SLEEP 1) after every BT:DESTROY-THREAD call. This is a work-around that enables switching over to EC2 runners for the gitlab CI. Something about the EC2 environment was tickling a race in our combination of using BT:DESTROY-THREAD + our reliance on PZMQ's implicit global *DEFAULT-CONTEXT*. Previously, each test called BT:DESTROY-THREAD to terminate it's SERVER-THREAD, and when that sever-thread unwinds, it would terminate each of of it's worker threads, then terminate the (implicit global) PZQM:*DEFAULT-CONTEXT*. If teardown of the PZMQ/ZMQ context has not fully completed by the time the next test runs, we might be in a situation where PQZM:*DEFAULT-CONTEXT* is non-NIL, but in some zombie half-dead state, yet the next test case will attempt to re-use it. A proper fix for this likely entails some combination of 1) Explicitly creating our own PZMQ:CONTEXT in RPCQ:START-SERVER and passing that down to worker threads. 2) Adding a graceful shutdown option to the RPCQ server to avoid the need to BT:DESTROY-THREAD (see #75).
appleby
added a commit
that referenced
this issue
Jan 14, 2020
Add a (SLEEP 1) after every BT:DESTROY-THREAD call. This is a work-around that enables switching over to EC2 runners for the gitlab CI. Something about the EC2 environment was tickling a race in our combination of using BT:DESTROY-THREAD + our reliance on PZMQ's implicit global *DEFAULT-CONTEXT*. A proper fix for this likely entails some combination of 1) Explicitly creating our own PZMQ:CONTEXT in RPCQ:START-SERVER and passing that down to worker threads. 2) Adding a graceful shutdown option to the RPCQ server to avoid the need to BT:DESTROY-THREAD (see #75).
Merged
Don't forget to cleanup the rpcq tests and replace the calls there to See @jmbr's comment here: #113 (comment) |
stylewarning
pushed a commit
that referenced
this issue
Jan 14, 2020
Add a (SLEEP 1) after every BT:DESTROY-THREAD call. This is a work-around that enables switching over to EC2 runners for the gitlab CI. Something about the EC2 environment was tickling a race in our combination of using BT:DESTROY-THREAD + our reliance on PZMQ's implicit global *DEFAULT-CONTEXT*. A proper fix for this likely entails some combination of 1) Explicitly creating our own PZMQ:CONTEXT in RPCQ:START-SERVER and passing that down to worker threads. 2) Adding a graceful shutdown option to the RPCQ server to avoid the need to BT:DESTROY-THREAD (see #75).
Also don't forget to make the corresponding change to remove |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
As discussed in the comment thread for PR #74, it's worth investigating whether
rcpq:start-server
can be modified to support graceful shutdown of worker threads, rather than relying onbt:destory-thread
for testing.The reason for suggesting this change is that just calling
bt:destroy-thread
was causing test failures on CCL. Those failures were addressed by the changes in #74, butbt:destroy-thread
is something of a nuclear option for killing threads, and the bordeaux-threads documentation fordestroy-threads
warns:So we'd like to do graceful termination of server threads, if possible.
This work might dovetail nicely with the non-blocking
start-server
changes suggested in #61.The text was updated successfully, but these errors were encountered: