feat: Possibility to Bootstrap from different node after full crash #98

Open
sebadob opened this issue Jan 1, 2025 · 0 comments
Labels
enhancement New feature or request

sebadob commented Jan 1, 2025

If an existing cluster crashes before a new leader can be elected, and the leader's volume is corrupted beyond recovery, a cold start after everything is back up could get stuck.

Let's say we have nodes 1, 2, and 3, and node 1 is the current leader.
If all nodes crash at the exact same time, the volume of node 1 gets corrupted so badly that it cannot be recovered, and this happens in the middle of a log replication so that not all nodes are on the exact same log id, then nodes 2 and 3 could get into a situation where they keep trying to reconnect to node 1 (which is dead) because they are lagging behind in log id.
If a situation like this occurs, we need a way to force another node to become the new leader, basically ignoring any log ids it has not received yet, even though the remaining nodes know that the leader had a higher one just before the crash.
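
To make the scenario concrete, here is a minimal sketch of the stuck condition and of one possible forced-bootstrap choice. It is not taken from this project's code: `NodeState`, `is_stuck`, and `pick_forced_leader` are hypothetical names, and the rule "take the reachable node with the highest log id" is only an assumption about what a reasonable default could look like.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeState:
    """Hypothetical per-node view after a full cluster crash."""
    node_id: int
    last_log_id: int          # highest log id persisted on this node
    known_leader_log_id: int  # highest log id this node saw from the old leader
    reachable: bool           # False if the node's volume is lost


def is_stuck(nodes: list[NodeState]) -> bool:
    """True if every reachable node lags behind the leader log id it remembers."""
    alive = [n for n in nodes if n.reachable]
    return bool(alive) and all(n.last_log_id < n.known_leader_log_id for n in alive)


def pick_forced_leader(nodes: list[NodeState]) -> Optional[int]:
    """Pick the reachable node with the highest log id (assumed policy).

    Ties are broken in favor of the lowest node id. Returns None if no
    node is reachable at all.
    """
    alive = [n for n in nodes if n.reachable]
    if not alive:
        return None
    best = max(alive, key=lambda n: (n.last_log_id, -n.node_id))
    return best.node_id


# The scenario from this issue: node 1 was leader and its volume is gone,
# nodes 2 and 3 were interrupted in the middle of a log replication.
cluster = [
    NodeState(1, last_log_id=42, known_leader_log_id=42, reachable=False),
    NodeState(2, last_log_id=41, known_leader_log_id=42, reachable=True),
    NodeState(3, last_log_id=40, known_leader_log_id=42, reachable=True),
]

if is_stuck(cluster):
    forced = pick_forced_leader(cluster)
    print(f"cluster is stuck, forcing node {forced} to become leader")
```

With a policy like this, any log entries that only existed on the dead leader are given up, which matches the "ignore any log id not received yet" requirement above. Whether the choice should be automatic or require an explicit operator decision is left open here.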

This situation is super rare, but I have been able to produce it in manual testing, even though it took a few tries to get into it, even on purpose. However, a solution for something like this should exist.

@sebadob sebadob added the enhancement New feature or request label Jan 1, 2025