Add baseline resync read part in homeobject. #204
Conversation
Codecov Report

@@            Coverage Diff             @@
##              main     #204      +/-  ##
==========================================
- Coverage    68.69%   66.76%   -1.93%
==========================================
  Files           30       32       +2
  Lines         1581     1724     +143
  Branches       163      185      +22
==========================================
+ Hits          1086     1151      +65
- Misses         408      480      +72
- Partials        87       93       +6
Force-pushed from e95c9f5 to f757100
LGTM except for a bug on the reader side. The writer-side code is ignored.
Force-pushed from 88b6b7c to da98f06
Sorry for the late review @sanebay, please take a look at my comments and correct me if I am wrong.
auto& index_results_vec = r.value();
for (auto& info : index_results_vec) {
    if (info.pbas == HSHomeObject::tombstone_pbas) {
        // Skip deleted blobs
I am not sure we can do it like this.
Suppose the leader has {1,1} to {5,5} and {5,7} to {10,10} ({shard_id, blob_id}), its last LSN is 120, {5,6} was deleted at LSN 100, and the log on the leader has been compacted up to 110.
The follower has {1,1} to {6,0} and its last LSN is 80. If a baseline resync occurs, the follower will never know that {5,6} has been deleted, since it is not aware of LSN 100.
So I think here we should set some special data for the tombstone_pbas entries in the blob_info_vec that will be sent to the follower, so that the follower can identify that this blob is deleted.
Another question: if GC happens and the tombstone is also removed, how can the leader let the follower know about the deletion when baseline resync happens?
Please correct me if I misunderstand anything.
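To make the suggestion concrete, here is a rough sketch of carrying a tombstone marker in the snapshot payload instead of skipping deleted blobs. The struct, field names, and helper below are hypothetical, not the PR's actual blob_info_vec definition:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical entry in the snapshot payload. Instead of skipping blobs whose
// pbas equal tombstone_pbas, the leader records them with is_tombstone = true
// so the follower can distinguish "deleted" from "never existed".
struct snapshot_blob_info {
    uint64_t shard_id;
    uint64_t blob_id;
    uint64_t pba;          // 0 when the blob is a tombstone
    bool     is_tombstone; // assumed extra flag carried to the follower
};

void add_blob_to_batch(std::vector< snapshot_blob_info >& blob_info_vec, uint64_t shard_id,
                       uint64_t blob_id, uint64_t pba, bool deleted) {
    // Keep the deleted blob in the list, but tag it so the follower deletes
    // its local copy instead of expecting data for it.
    blob_info_vec.push_back(snapshot_blob_info{shard_id, blob_id, deleted ? 0 : pba, deleted});
}
```

On the follower side, an is_tombstone entry would translate into deleting (or never creating) the local blob rather than requesting its data.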
We haven't put an LSN into each blob index, so at the moment it is a full resync, i.e. all existing data can be discarded. So there is no issue with deleted blobs and no need to transfer tombstones.
Extending the discussion further: assuming we have an LSN in the blob index, we can let the follower report its current LSN, and the leader will only send the range [follower_lsn, snapshot_lsn] to the follower. In that case, as you said, we do care about blob deletion. The trivial approach is for the leader to send out the active blob list in <shard_id = S, batch = 0>, and the follower marks all blobs not in the active blob list as deleted.
I don't think we yet have solid thinking regarding the "incremental snapshot", especially with a good amount of reserved log entries. Though personally I am loving it.
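A sketch of that "active blob list in batch 0" reconciliation from the follower's point of view. The container types and the delete callback are placeholders; the real homeobject index API may look different:

```cpp
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical: given the active blob list the leader sent in <shard_id = S, batch = 0>,
// mark every local blob of that shard that is absent from the list as deleted.
void reconcile_shard_blobs(uint64_t shard_id,
                           const std::vector< uint64_t >& leader_active_blobs,
                           const std::vector< uint64_t >& local_blobs,
                           const std::function< void(uint64_t, uint64_t) >& mark_deleted) {
    const std::unordered_set< uint64_t > active(leader_active_blobs.begin(),
                                                leader_active_blobs.end());
    for (uint64_t blob_id : local_blobs) {
        if (active.count(blob_id) == 0) {
            // Present locally but not on the leader: it was deleted (and possibly GC-ed) there.
            mark_deleted(shard_id, blob_id);
        }
    }
}
```

The scrubber approach mentioned below is essentially the same reconciliation, with a CRC over the valid list to cheaply detect when the follower and leader already agree.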
Yes, this is a valid scenario. We can handle it in two ways.
- Either the follower sees a gap in the blob sequence and assumes it is a deletion.
- The safer approach is to use the scrubber. In the scrubber, the leader sends the valid list and its CRC across followers. Followers can use this to delete.
I think that until we have the incremental snapshot feature, i.e. only transferring the diff between two snapshots, we had better erase everything on the receiver side, since we start from scratch anyway.
I think what we will do in the baseline resync write part is an incremental snapshot, no?
As I mentioned here
#204 (comment)
after the follower receives the PG and shard metadata in the first obj, it will ask for the shards and blobs that do not exist on this follower.
"Either the follower sees a gap in the blob sequence and assumes it is a deletion"
This does not work. For example, the leader has {1,10} to {3,10} and the follower has {1,10} to {2,5}. If {1,5} is deleted on the leader, the follower cannot detect this blob-sequence gap, since it will start syncing shards and blobs from shard 2.
"The safer approach is to use the scrubber. In the scrubber, the leader sends the valid list and its CRC across followers. Followers can use this to delete."
This seems to work. We should also send the open shard list, since some seal-shard logs might also have been compacted.
Force-pushed from da98f06 to 7ffb55d
The code here LGTM. There are two remaining questions: (1) if GC happens and the tombstone is removed, how can the leader let the follower know the blob was deleted, and (2) if a shard is sealed and the seal-shard log is compacted, how do we make the follower aware that this shard is sealed when baseline resync occurs? They are essentially the same question. Let's think a bit more and discuss it in the homestore meeting if necessary.
Shard seal should work, because either we get the latest metadata with the shard sealed, or we see a log entry for the shard seal. Only blobs, if they are GC-ed, become orphans on the followers. The scrubber is the right way to solve this.
Force-pushed from 7ffb55d to a54d4bd
Force-pushed from a54d4bd to 7fe21e8
LG!
Add read_snapshot_data to go over all shards and blobs of a PG. If obj_id is zero, send all shards. obj_id is a concatenation of the blob sequence number and the batch number. For all other values of obj_id, we send a batch of blobs for a shard. Once all blobs in a shard are finished, we move to the next shard_id and batch_num is reset to 0. Add the LSN in the shard metadata to ignore all reads of shards which are created later than the snapshot LSN.
Added temporary code to write blobs and metadata; tested with the SM long-running test to create a baseline resync with a follower.
Tested with UT.
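For illustration, a minimal sketch of how an obj_id built from a sequence number and a batch number could be packed and advanced. The 48/16-bit split and the helper names are assumptions for this example, not the exact layout used by the PR:

```cpp
#include <cstdint>

// Hypothetical layout: the upper 48 bits hold the sequence number and the lower
// 16 bits hold the batch number within that shard. obj_id == 0 is the special
// value that asks the leader to send all shard metadata first.
constexpr uint64_t kBatchBits = 16;
constexpr uint64_t kBatchMask = (1ULL << kBatchBits) - 1;

inline uint64_t make_obj_id(uint64_t seq_num, uint64_t batch_num) {
    return (seq_num << kBatchBits) | (batch_num & kBatchMask);
}
inline uint64_t obj_id_seq(uint64_t obj_id) { return obj_id >> kBatchBits; }
inline uint64_t obj_id_batch(uint64_t obj_id) { return obj_id & kBatchMask; }

// Advance to the next batch of the same shard, or to batch 0 of the next shard
// once all blobs in the current shard have been read.
inline uint64_t next_obj_id(uint64_t obj_id, bool shard_done) {
    return shard_done ? make_obj_id(obj_id_seq(obj_id) + 1, 0)
                      : make_obj_id(obj_id_seq(obj_id), obj_id_batch(obj_id) + 1);
}
```

Under these assumptions, the follower would start with obj_id = 0 to receive the shard list, then walk batches with next_obj_id until the leader signals the end of the snapshot.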