-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement graceful termination #39
base: main
Are you sure you want to change the base?
Conversation
7b41515
to
ba31cb1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @tsipinakis this actually seems to be a bug and should already be working. We'll have to investigate why it doesn't but this does not seem to be the right approach. The shutdown handlers should take care of closing client connections gracefully.
// GracefulTerminationDeadline is the amount of time ContainerSSH will | ||
// wait after receiving a SIGTERM for all the clients to disconnect. | ||
// After this duraction all connections are forcefully terminated. | ||
GracefulTerminationDeadline time.Duration `json:"gracefulTerminationDeadline" yaml:"gracefulTerminationDeadline" default:"0"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ContainerSSH will not always run on SSH only, we plan a web client too. This deadline should not be tied to SSH.
go s.shutdownHandlers.Shutdown(lifecycle.ShutdownContext()) | ||
go s.disconnectClients(lifecycle, allClientsExited) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this does not work, this is actually a bug. The shutdown handlers should get a chance to run before the clients are disconnected forcefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to work the way it's intended currently: disconnectClients()
waits until the context timeout (hardcoded 30 seconds currently in main) and then force-disconnects all clients. All this PR does is make this delay configurable.
I delay the shutdown handler until all clients are disconnected because running them means shutting down things like the metrics server which may be needed if the delay is too long, we want to know what happened during that time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future it may be beneficial to add a message when a shutdown is triggered prompting all users to re-connect.
@@ -204,12 +204,15 @@ func startPool(pool Service, lifecycle service.Lifecycle) error { | |||
rotateSignals := make(chan os.Signal, 1) | |||
signal.Notify(exitSignals, exitSignalList...) | |||
signal.Notify(rotateSignals, rotateSignalList...) | |||
|
|||
deadline := cfg.SSH.GracefulTerminationDeadline + 5 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 seconds plus is a very arbitrary number, we should come up with something better.
ba31cb1
to
7c11f35
Compare
Please describe the change you are making
Implement support for graceful termination. Currently when upgrading or restarting ContainerSSH all connections are killed instantly kicking out all users. This patch adds an option to delay the termination until all users are disconnected.
This is very useful in kubernetes, with this the pod will stay in 'Terminating' state until all users are disconnected but at the same time will be taken out of the service so no new connections will be routed to it.
Are you the owner of the code you are sending in, or do you have permission of the owner?
Sent with permission of the owner
The code will be published under the MIT-0 license. Have you read and understood this license?
Yes