Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SealShard state change in pre-commit #99

Closed
xiaoxichen opened this issue Oct 30, 2023 · 6 comments · Fixed by #120
Closed

SealShard state change in pre-commit #99

xiaoxichen opened this issue Oct 30, 2023 · 6 comments · Fixed by #120
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@xiaoxichen
Copy link
Collaborator

Now we update shard state in commit, which introduce racing beween sealshard vs putBlob.

Marking shard as sealed in pre-commit , failing following putBlob(whose LSN >X) , and do roll_back if consensus cannot reach

@szmyd szmyd added this to the MileStone3.11 milestone Nov 2, 2023
@JacksonYao287 JacksonYao287 linked a pull request Dec 3, 2023 that will close this issue
@hkadayam hkadayam modified the milestones: MileStone4.1, MileStone4.2 Jan 31, 2024
@JacksonYao287 JacksonYao287 added the enhancement New feature or request label Mar 15, 2024
@JacksonYao287
Copy link
Collaborator

I revisit the idea here and have a new thought.

if we rollback seal_shard and change the shard status from sealed to open in some corner case , who can seal this shard again?

when dm open a shard , it will store the shard info in the kv store of cm. when it closes the shard, it will remove that kv item from cm. if dm send seal_shard message to sm leader, and sm leader will pre_commit the message, which will change the status of the shard to sealed , and send a success response to dm before it sends the log entry to followers. if dm receives the ok response, it will remove the kv item from cm and konw nothing about that shard. if the message is rollbacked in some corner case , the shard will be changed from sealed to open, and it will be an orphan open shard since nobody knows it is in open state, and thus nobody will try to seal it.

do we need a separate scrubber to find all the orphan open shards and seal them?

@xiaoxichen @szmyd

@xiaoxichen
Copy link
Collaborator Author

I am not sure if I understand it correctly. The rollback will be called only when the quorum cannot reach and decide to rollback this commit . In this case, why client(DM ) will get a success response?

@szmyd
Copy link
Collaborator

szmyd commented Mar 15, 2024

I'm trying to remember this issue and reconcile with the changes we've made recently. So the problem is that a Blob can be received on the data path referencing a Shard that has been sealed already right? This could happen because of ordering issues between the DataPath and RaftPath or when a member is resyncing and receiving Blobs and log entries out-of-order?

Obviously we need to receive all Blob data over the data path prior to processing any SealShard in the log the blobs reference, or at the very least before putting it back in the heap allocator...

@JacksonYao287
Copy link
Collaborator

JacksonYao287 commented Mar 18, 2024

@xiaoxichen
according to the design and workflow of nuraft , in precommit case , it will response to client after appending log but before sending to the followers. it means precommit will response to client before reaching quorum consensus and that is why precommit will achieve low latency by sacrificing consistency and resolving conflicts manually

@xiaoxichen
Copy link
Collaborator Author

@xiaoxichen according to the design and workflow of nuraft , in precommit case , it will response to client after appending log but before sending to the followers. it means precommit will response to client before reaching quorum consensus and that is why precommit will achieve low latency by sacrificing consistency and resolving conflicts manually

are we using async mode? I think we are using regular flow
https://github.com/eBay/NuRaft/blob/master/docs/replication_flow.md

@JacksonYao287
Copy link
Collaborator

@xiaoxichen
yes, you are right. then, the problem does not exist, we can go ahead. thx!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants