Simplify the recovery when an existing node crashes during an expansion #646
Comments
Currently:
Conditions to allow fast lane startups:
Conditions to disallow fast lane startups:
Things to verify:
Technical aspects: The pod name appears in the cassdc `.status.nodeStatuses` struct with a host ID.
There can be cases where a scale up operation would be blocked by another crashlooping pod.
In this case the new pod will have the "Starting" label, which prevents the pre-existing pod from coming back up after a fix is applied (for example when it's rescheduled on a new worker).
One way we could solve this would be to allow pods to start right away if they host Cassandra nodes that have already bootstrapped in the past, and if they're not part of a replacement.
This way we could have faster startups overall while still protecting ourselves from concurrent bootstraps.
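The proposed rule can be sketched as a small predicate. This is only an illustration of the idea, not the operator's actual code: `DatacenterStatus`, `node_replacements`, and `can_fast_lane_start` are hypothetical names, and the sketch assumes that a pod with a host ID recorded in `.status.nodeStatuses` has already bootstrapped.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DatacenterStatus:
    """Hypothetical, simplified view of the cassdc status."""
    # Pod name -> Cassandra host ID, modeled on .status.nodeStatuses.
    node_statuses: Dict[str, str] = field(default_factory=dict)
    # Pod names currently being replaced (assumed field, for illustration).
    node_replacements: Set[str] = field(default_factory=set)


def can_fast_lane_start(pod_name: str, status: DatacenterStatus) -> bool:
    """Return True if the pod may start without waiting for other pods.

    Assumption: a recorded host ID means the node bootstrapped in the past.
    """
    already_bootstrapped = bool(status.node_statuses.get(pod_name))
    being_replaced = pod_name in status.node_replacements
    # New nodes and replacements still go through the serialized path,
    # which keeps the protection against concurrent bootstraps.
    return already_bootstrapped and not being_replaced


status = DatacenterStatus(
    node_statuses={"dc1-rack1-sts-0": "host-a", "dc1-rack1-sts-1": "host-b"},
    node_replacements={"dc1-rack1-sts-1"},
)
print(can_fast_lane_start("dc1-rack1-sts-0", status))  # crashed existing node
print(can_fast_lane_start("dc1-rack1-sts-1", status))  # node being replaced
print(can_fast_lane_start("dc1-rack1-sts-2", status))  # new node from scale up
```

With such a check, a crash-looping pre-existing pod would not have to wait on the new pod that holds the "Starting" label, while new and replacement nodes would still start one at a time.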