Baseline resync msg protocol and data structure #221

Merged: 4 commits merged into eBay:baseline_resync on Nov 6, 2024

@yuwmao yuwmao commented Nov 1, 2024

Based on the baseline resync design doc,

  • Define the schema: ResyncPGMetaData, ResyncShardMetaData, ResyncBlobDataBatch and ResyncMessage.
  • Add some common structures.
  • Separate the read and write paths:
    PGBlobIterator is the resync context/handler for the leader (read path).
    SnapshotReceiveHandler is the resync context/handler for the follower (write path).

- pg_blob_iterator is the snapshot resync context for the leader (read path).
- SnapshotReceiveHandler is a newly added structure that serves as the snapshot context for the follower (write path).
- Comment out the previous implementation code.
@xiaoxichen
Collaborator

Can we target the baseline resync PR to https://github.com/eBay/HomeObject/tree/baseline_resync?

@yuwmao
Contributor Author

yuwmao commented Nov 4, 2024

> Can we target the baseline resync PR to https://github.com/eBay/HomeObject/tree/baseline_resync?

Sure.

@yuwmao yuwmao changed the base branch from main to baseline_resync November 4, 2024 08:33
@yuwmao yuwmao changed the base branch from baseline_resync to main November 5, 2024 07:22
@codecov-commenter
Codecov Report

Attention: Patch coverage is 68.75000% with 5 lines in your changes missing coverage. Please review.

Project coverage is 63.67%. Comparing base (acb04e8) to head (4f33cd3).
Report is 25 commits behind head on main.

Files with missing lines                                Patch %   Lines
src/lib/homestore_backend/pg_blob_iterator.cpp          0.00%     2 Missing ⚠️
src/lib/homestore_backend/replication_message.hpp       84.61%    1 Missing and 1 partial ⚠️
...ib/homestore_backend/replication_state_machine.cpp   0.00%     1 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #221      +/-   ##
==========================================
- Coverage   68.69%   63.67%   -5.02%     
==========================================
  Files          30       31       +1     
  Lines        1581     1682     +101     
  Branches      163      178      +15     
==========================================
- Hits         1086     1071      -15     
- Misses        408      518     +110     
- Partials       87       93       +6     

@yuwmao yuwmao changed the base branch from main to baseline_resync November 5, 2024 09:01
koujl
koujl previously approved these changes Nov 5, 2024
@koujl koujl left a comment

lgtm

snp_batch_id_t batchId;

objId(shard_id_t shard_id, snp_batch_id_t batch_id) : shardId(shard_id), batchId(batch_id) {
//type_bit (1 bit) | shard_id (48 bits) | batch_id (15 bits)
Collaborator

Please do overflow checking before applying the OR operator, e.g.:

if (shard_id != (shard_id & 0xffffffffffff)) // error

Collaborator

This prevents an overflowing batchId or shardId from corrupting the other fields.

Contributor Author

Good point, will add checking for them.
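
For reference, a minimal sketch of the range check discussed in this thread, assuming the layout from the code comment (type_bit 1 bit | shard_id 48 bits | batch_id 15 bits). The helper name pack_obj_id, the constant names, and the 64-bit aliases are hypothetical and not taken from the PR; the actual change may look different.

```cpp
#include <cassert>
#include <cstdint>

// Assumed aliases for illustration; the real typedefs live in the HomeObject sources.
using shard_id_t = uint64_t;
using snp_batch_id_t = uint64_t;

// Layout from the reviewed comment:
// type_bit (1 bit) | shard_id (48 bits) | batch_id (15 bits)
constexpr uint64_t kBatchBits = 15;
constexpr uint64_t kShardBits = 48;
constexpr uint64_t kBatchMask = (uint64_t{1} << kBatchBits) - 1;
constexpr uint64_t kShardMask = (uint64_t{1} << kShardBits) - 1;

// Hypothetical helper: packs the three fields into one 64-bit value.
// Returns false (leaving `out` untouched) when shard_id or batch_id does not
// fit its field, so an oversized value can never spill into a neighbor.
bool pack_obj_id(bool type_bit, shard_id_t shard_id, snp_batch_id_t batch_id, uint64_t& out) {
    // Note the parentheses: != binds tighter than & in C++.
    if (shard_id != (shard_id & kShardMask)) { return false; }
    if (batch_id != (batch_id & kBatchMask)) { return false; }
    const uint64_t type_part = (type_bit ? uint64_t{1} : uint64_t{0}) << (kShardBits + kBatchBits);
    out = type_part | (shard_id << kBatchBits) | batch_id;
    return true;
}
```

With this shape the caller decides how to report the error (assert, log, or throw), which keeps the check independent of the constructor's error-handling policy.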

user_key : [ubyte];
data : [ubyte];
is_deleted: bool;
data : [ubyte]; // Raw blob data loaded from drive, include BlobHeader, user_key and payload
Collaborator

@xiaoxichen xiaoxichen Nov 5, 2024

Where do we want to do the checksum in the baseline resync (BR) process?
That is, for each batch, how do we verify the data? We could use the blob CRC in the blob header.

Contributor Author

@yuwmao yuwmao Nov 6, 2024

BlobHeader already has a hash; we can use it to verify every blob in the batches.

uint8_t hash[blob_max_hash_len]{};
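
As an illustration of the per-blob verification idea, here is a rough sketch under stated assumptions: BlobHeaderSketch, compute_hash, verify_blob, and the value 32 for blob_max_hash_len are all hypothetical stand-ins, and FNV-1a is used only as a placeholder hash. The real BlobHeader layout and hash algorithm in HomeObject may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed value for illustration only.
constexpr std::size_t blob_max_hash_len = 32;

// Hypothetical stand-in for the real BlobHeader; only the hash field is modeled.
struct BlobHeaderSketch {
    uint8_t hash[blob_max_hash_len]{};
};

// Placeholder hash (64-bit FNV-1a), zero-padded to blob_max_hash_len bytes.
// The algorithm actually used by the blob header is not stated in this thread.
void compute_hash(const uint8_t* data, std::size_t len, uint8_t* out) {
    uint64_t h = 14695981039346656037ULL;      // FNV-1a offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 1099511628211ULL;                  // FNV-1a prime
    }
    std::memset(out, 0, blob_max_hash_len);
    std::memcpy(out, &h, sizeof(h));
}

// Follower-side check: recompute the hash over the received payload and
// compare it with the hash carried in the header.
bool verify_blob(const BlobHeaderSketch& hdr, const std::vector<uint8_t>& payload) {
    uint8_t actual[blob_max_hash_len];
    compute_hash(payload.data(), payload.size(), actual);
    return std::memcmp(actual, hdr.hash, blob_max_hash_len) == 0;
}
```

A receiver could run such a check on each blob in a batch and reject the whole batch on the first mismatch, so that the leader resends it.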

@xiaoxichen xiaoxichen merged commit 27fc071 into eBay:baseline_resync Nov 6, 2024
1 check passed
4 participants