-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SDSTOR-11601 add create_shard implementation #34
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
ea7f37d
to
1272ed1
Compare
9945e64
to
d933d65
Compare
9d064e8
to
b17a28d
Compare
b17a28d
to
60c9f69
Compare
60c9f69
to
7b17b72
Compare
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we miss the implement on_precommit
and async_alloc_write
for now, and will be implemented later?
src/lib/homestore/shard_manager.cpp
Outdated
pg_iter->second.shards.push_back(shard_info); | ||
|
||
//following part is must for follower members or when member is restarted; | ||
auto sequence_num = get_sequence_num_from_shard_id(shard_info->id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the follower apply all the logs correctly, will this happen? or it is just a way to keep safe here
another question, should we flush the pg_sequence_number to metadataservice or somewhere else to make sure will will get the latest sequence_number when restarting, or just set the largest sequence_num(get from shardid)of all the recovered shardid as the latest sequence_number of this pg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.follower will happens, shard is generated from the leader side and sequence num s increased only in leader side, this gives the chance for follower to catch up sequence num with leader.
2. it is better to flush pg_sequence_number to disk but it will cause extra IO and I am trying to avoid it by recovering the lastest sequence num from meta blks, as all commited shard creation is already persisted in meta blks. there are still another two options for this from my side: using raft LSN or combined with (raft term , local sequence num) to be new proposal shard id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i prefer lastest sequence num from meta blks, that is the source of truth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is the current implementation behaviors.
7b17b72
to
0af7f2e
Compare
ce58fe9
to
61ec156
Compare
61ec156
to
f87d68f
Compare
This comment was marked as outdated.
This comment was marked as outdated.
yes, homestore version ShardManager unit test is added in the latest commit. |
auto iter = _flying_shards.find(lsn); | ||
if (iter != _flying_shards.end()) { | ||
_flying_shards.erase(iter); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, should we delete the block written in on_precommit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think. I think the written shard header will be used for recovery in case of meta blks are lost from the previous discussion meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we rollback the pre_commit
, this means this is a canceled create_shard
, why to recovery this?
coming to this, i have another question. when rolling back a pre_commit
, seems nuraft does not append a log for rollback, what the statemachine does is just call the rollback
function(i just only read the nuraft doc, but not deep dive the code for now, so if this is wrong, please correct me). so , if it crashes after we called pre_commit
but before or not finish calling statemachie#rollback
, who will take the task to finish the rollback job after restart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- yes, it is absolutely more clear to truncate/overwrite the shard header which is already written in on_pre_commit(), but homestore DataService is append-only model, I think overwritten is conflict with this model. even this shard header is leaked, I think it will be finalized recycled by chunk compaction, right? because this log is rollbacked and will never be commited, so from the homeobject side, it can not see this shard any more(whether now or after recovery).
- I think your mentioned case may happens, restart happens after on_pre_commit() but before on_rollback(), in such case, this shard header is leaked, but same to question 1, it will be recycled finalized.
This comment was marked as outdated.
This comment was marked as outdated.
sure, I do not realize it before and will take cake of this from now. |
Codecov ReportPatch coverage:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #34 +/- ##
==========================================
+ Coverage 81.04% 81.44% +0.40%
==========================================
Files 16 18 +2
Lines 248 469 +221
Branches 26 48 +22
==========================================
+ Hits 201 382 +181
- Misses 31 57 +26
- Partials 16 30 +14
☔ View full report in Codecov by Sentry. |
Nice; I think once we get the CodeCoverage up above 80% we can consider merging this. Let me know if any help is needed to get those unit tests made. |
thanks, Brian. I had added some more unit tests to improve code coverage to 81.99% with the latest few commits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's merge this and make fixes/api changes later. 👍
src/lib/homestore/shard_manager.cpp
Outdated
if (!replica_set->is_leader()) { | ||
LOGWARN("pg [{}] replica set is not a leader, please retry other pg members", pg_owner); | ||
return folly::makeUnexpected(ShardError::NOT_LEADER); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh; shard creation should be able to happen everywhere. We can relax this in M3 I think when we have real replicas. Ideally yes it happens on leader, but it's "best-effort" on a follower (shard id may be invalid).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to follow up; this has to be the case since by the time we actually call append_entries
in HomeStore the leadership has changed since it was checked. We can never pre-check leadership...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe let HS return an error in this case? Same situation is also true for write_blob request right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; if we want to prevent this we can just turn forwarding
off in the raft_server; it will reject append_entries
if it's not currently the leader. But I'm not sure why we care about leadership here...we're basically just doing what the DM would retry anyways by forwarding it.
We stick who the leader was in the response in case the DM wants to adjust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, if raft server on follower side support msg forward, then this check will not be necessary. before I do not know this feature is enabled in nuraft.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK yes, only for the optimize_path. DM and GW need to adjust as well.
Actually I was kind of not sure why OM/CM team need that strict regarding sending to leader :
this is the draft version of create_shard implementation in homeobject and can not build successfully until ReplDev in homestore is finalized.