storage: introduce CasManager to support chunk dedup at runtime #1626
Merged
Desiki-high requested review from imeoer, hsiangkao and power-more and removed request for a team on September 21, 2024 07:52
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1626      +/-   ##
==========================================
+ Coverage   60.43%   60.51%   +0.07%
==========================================
  Files         146      146
  Lines       48841    49178     +337
  Branches    46322    46659     +337
==========================================
+ Hits        29517    29760     +243
- Misses      17558    17636      +78
- Partials     1766     1782      +16
Desiki-high force-pushed the storage/copy-range branch 9 times, most recently from b386bde to f360c3b on September 27, 2024 10:15
Desiki-high force-pushed the storage/copy-range branch 4 times, most recently from 60478a0 to a9b8fe4 on October 1, 2024 06:09
Desiki-high force-pushed the storage/copy-range branch 9 times, most recently from b2f8cfb to 64a27ce on October 16, 2024 11:50
Add helper copy_file_range() which:
- avoids copying data into userspace
- may support reflink on XFS and similar filesystems
Signed-off-by: Jiang Liu <[email protected]>
- improve copy_file_range when the target OS is not Linux
- add more comprehensive tests
Signed-off-by: Yadong Ding <[email protected]>
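To illustrate what a range-copy helper of this kind does, here is a minimal sketch using only the Rust standard library. It is not the helper added in this PR: the name `copy_range` and its signature are assumptions made for illustration. It relies on the documented platform behavior of `std::io::copy`, which on Linux may use copy_file_range(2) when both ends are files and otherwise falls back to a userspace buffer, so the same code also runs on non-Linux targets.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

/// Copy up to `len` bytes from `src` (starting at `src_off`) into `dst`
/// (starting at `dst_off`) and return the number of bytes copied.
///
/// Hypothetical helper for illustration only. Unlike a raw
/// copy_file_range(2) call with explicit offsets, this version moves
/// the cursors of both files.
fn copy_range(src: &File, src_off: u64, dst: &File, dst_off: u64, len: u64) -> io::Result<u64> {
    // `&File` implements Read, Write and Seek, so shared handles are enough.
    let mut reader = src;
    let mut writer = dst;
    reader.seek(SeekFrom::Start(src_off))?;
    writer.seek(SeekFrom::Start(dst_off))?;
    // `take(len)` bounds the copy to the requested range. On Linux,
    // std::io::copy may hand this to the kernel instead of bouncing the
    // data through a userspace buffer.
    io::copy(&mut reader.take(len), &mut writer)
}
```

The return value can be smaller than `len` when the source file ends early, so a caller that needs a whole chunk should compare it against the expected chunk size.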
Implement CasManager to support chunk dedup at runtime. The manager provides two major interfaces:
- add chunk data to the CAS database
- check whether a chunk exists in the CAS database and, if it does, copy it to the blob file with copy_file_range()
Signed-off-by: Jiang Liu <[email protected]>
- Changed `delete_blobs` method in `CasDb` to take an immutable reference (`&self`) instead of a mutable reference (`&mut self`).
- Updated `dedup_chunk` method in `CasMgr` to correctly handle the deletion of non-existent blob files from both the file descriptor cache and the database.
- Implemented the `gc` (garbage collection) method in `CasMgr` to identify and remove blobs that no longer exist on the filesystem, ensuring the database and cache remain consistent.
Signed-off-by: Yadong Ding <[email protected]>
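The bookkeeping described in the two commits above (record a chunk, look a chunk up, garbage-collect records whose blob file has disappeared) can be sketched roughly as follows. This is an illustration under stated assumptions, not the `CasDb`/`CasMgr` code from the PR: the type `CasIndex`, the struct `ChunkLocation`, and all method signatures are hypothetical, and an in-memory `HashMap` stands in for the CAS database.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Where a chunk's bytes live: which blob file, and at what offset.
#[derive(Clone, Debug, PartialEq)]
struct ChunkLocation {
    blob_path: PathBuf,
    offset: u64,
}

/// Hypothetical in-memory stand-in for the CAS database, keyed by chunk id.
#[derive(Default)]
struct CasIndex {
    chunks: HashMap<String, ChunkLocation>,
}

impl CasIndex {
    /// Record that `chunk_id` is stored in `blob_path` at `offset`.
    /// The first recorded location wins; later duplicates are ignored.
    fn add_chunk(&mut self, chunk_id: &str, blob_path: &Path, offset: u64) {
        self.chunks
            .entry(chunk_id.to_string())
            .or_insert(ChunkLocation {
                blob_path: blob_path.to_path_buf(),
                offset,
            });
    }

    /// Return the recorded location of a chunk, if any.
    fn lookup(&self, chunk_id: &str) -> Option<&ChunkLocation> {
        self.chunks.get(chunk_id)
    }

    /// Drop every record whose blob file no longer exists on the
    /// filesystem and return how many records were removed.
    fn gc(&mut self) -> usize {
        let before = self.chunks.len();
        self.chunks.retain(|_, loc| loc.blob_path.exists());
        before - self.chunks.len()
    }
}
```

This sketch takes `&mut self` for simplicity. An interface that takes `&self`, as the commit describes for `delete_blobs`, would need some form of interior synchronization, which is left out here.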
Enable chunk deduplication for file cache. It works as follows:
- When a chunk is not yet in the blob cache file, query the CAS database to find out whether other blob data files have the required chunk. If another data file holds a duplicate of the chunk, copy the chunk data into the current blob cache file using copy_file_range().
- After downloading a data chunk from the remote, save file/offset/chunk-id into the CAS database so that it can be reused later.
Co-authored-by: Jiang Liu <[email protected]>
Co-authored-by: Yading Ding <[email protected]>
Signed-off-by: Yadong Ding <[email protected]>
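The read path described above can be sketched as a single function: try the CAS index first, and only download on a miss. Everything named here is an assumption made for illustration (`ChunkIndex`, `fill_chunk`, the `download` callback); it is not the file cache code from this PR. For brevity the copy goes through a userspace buffer with positional reads and writes, where the PR uses copy_file_range().

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;
use std::path::{Path, PathBuf};

/// Hypothetical CAS index: chunk id -> (blob cache file, offset in that file).
type ChunkIndex = HashMap<String, (PathBuf, u64)>;

/// Make sure `chunk_id` is present in `cache_file` at `cache_off`.
/// Returns Ok(true) if the chunk was copied from another blob cache
/// file, and Ok(false) if it had to be downloaded.
fn fill_chunk(
    cas: &mut ChunkIndex,
    chunk_id: &str,
    chunk_len: usize,
    cache_path: &Path,
    cache_file: &File,
    cache_off: u64,
    download: impl FnOnce() -> io::Result<Vec<u8>>,
) -> io::Result<bool> {
    // Step 1: ask the index whether another file already holds the chunk.
    if let Some((src_path, src_off)) = cas.get(chunk_id).cloned() {
        match File::open(&src_path) {
            Ok(src) => {
                // Bounce-buffer copy; a real implementation would use
                // copy_file_range() here to stay out of userspace.
                let mut buf = vec![0u8; chunk_len];
                src.read_exact_at(&mut buf, src_off)?;
                cache_file.write_all_at(&buf, cache_off)?;
                return Ok(true);
            }
            // The source blob could not be opened: drop the stale record
            // and fall through to a download.
            Err(_) => {
                cas.remove(chunk_id);
            }
        }
    }
    // Step 2: download, store, and record file/offset/chunk-id for reuse.
    let data = download()?;
    cache_file.write_all_at(&data, cache_off)?;
    cas.insert(chunk_id.to_string(), (cache_path.to_path_buf(), cache_off));
    Ok(false)
}
```

With this shape, the first request for a chunk pays for the download and every later request for the same chunk id, from any blob cache file, becomes a local copy.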
Add documentation for CAS.
Signed-off-by: Jiang Liu <[email protected]>
Desiki-high force-pushed the storage/copy-range branch from 64a27ce to 8a90e49 on October 17, 2024 01:46
imeoer reviewed on Oct 23, 2024
Desiki-high force-pushed the storage/copy-range branch from 8a90e49 to 09a874a on October 23, 2024 02:46
Add smoke test case for CAS and chunk dedup.
Signed-off-by: Yadong Ding <[email protected]>
Desiki-high force-pushed the storage/copy-range branch from 09a874a to f6719a2 on October 23, 2024 02:54
imeoer approved these changes on Oct 23, 2024
LGTM, thanks!
Relevant Issue (if applicable)
If there are Issues related to this PullRequest, please list them.
Details
Based on #1507; completes the implementation and testing.
Types of changes
What types of changes does your PullRequest introduce? Put an x in all the boxes that apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply.