Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add zfree_with_size to optimize sdsfree since we can get zmalloc_size from the header #453

Merged
merged 11 commits into from
Jun 19, 2024

Conversation

poiuj
Copy link
Contributor

@poiuj poiuj commented May 7, 2024

Description

zfree updates memory statistics. It gets the size of the buffer from jemalloc by calling zmalloc_size. This operation is costly. We can avoid it if we know the buffer size. For example, we can calculate size of sds from the data we have in its header.

This commit introduces zfree_with_size function that accepts both pointer to a buffer, and its size. zfree is refactored to call zfree_with_size.

sdsfree uses the new interface for all but SDS_TYPE_5.

Benchmark

Dataset is 3 million strings. Each benchmark run uses its own value size (8192, 512, and 120). The benchmark is 100% write load for 5 minutes.

value size       new tps      old tps      %       new us/call    old us/call    %
8k               272088.53    269971.75    0.78    1.83           1.92           -4.69
512              356881.91    352856.72    1.14    1.27           1.35           -5.93
120              377523.81    368774.78    2.37    1.14           1.19           -4.20

Copy link

codecov bot commented May 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.05%. Comparing base (4176604) to head (1a3a6d6).
Report is 37 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #453      +/-   ##
============================================
- Coverage     70.07%   70.05%   -0.03%     
============================================
  Files           110      110              
  Lines         59989    60085      +96     
============================================
+ Hits          42037    42092      +55     
- Misses        17952    17993      +41     
Files Coverage Δ
src/sds.c 85.86% <100.00%> (-0.13%) ⬇️
src/zmalloc.c 85.05% <100.00%> (+0.41%) ⬆️

... and 37 files with indirect coverage changes

@lipzhu
Copy link
Contributor

lipzhu commented May 8, 2024

I think the zmalloc_size is really expensive to update the used_memory, we had a similar discussion in #308 (comment), for this patch, I'm curious if the assertion of sds->alloc == allocate size in all cases, what happened if call the sdsResize ?

@zvi-code
Copy link
Contributor

zvi-code commented May 8, 2024

@lipzhu the update of used_memory has 2 cost aspects, the zmalloc_size and also the fact it's an atomic operation.

  1. regarding zmalloc_size cost: it is actually called twice, once in zmalloc code and enother one inside the je_free because the tcache needs to know what bin the buffer belongs to. So, if you remove the zmalloc_size call for stat, you are still paying the cost. When using je_sdallocx to free (this PR) it uses, internally in jemalloc, fast_path that avoids fetching the size to begin with. BTW: I think for cpp delete operator is actually overloaded in jemalloc.cpp to use sized delete.
  2. [edit I see you already have PR for this] regarding atomic cost: this I believe can be also eliminated with use of thread local vars(I guess the team has considered this before)

@poiuj
Copy link
Contributor Author

poiuj commented May 8, 2024

@lipzhu good point about sds->alloc == allocate size. I was sure it is. But now I see that sds->alloc is truncated when the sds len is close enough to sds type max len (2^8, 2^16, etc.). I'll update the PR with a fix.

src/sds.c Outdated Show resolved Hide resolved
src/sds.c Outdated Show resolved Hide resolved
madolson pushed a commit that referenced this pull request May 16, 2024
…to align the logic with sdsnewlen function. (#476)

This patch try to correct the actual allocated size from allocator
when call sdsRedize to align the logic with sdsnewlen function.

Maybe the #453 optimization
should depend on this.

Signed-off-by: Lipeng Zhu <[email protected]>
@madolson madolson requested a review from ranshid May 16, 2024 16:53
srgsanky pushed a commit to srgsanky/valkey that referenced this pull request May 19, 2024
…to align the logic with sdsnewlen function. (valkey-io#476)

This patch try to correct the actual allocated size from allocator
when call sdsRedize to align the logic with sdsnewlen function.

Maybe the valkey-io#453 optimization
should depend on this.

Signed-off-by: Lipeng Zhu <[email protected]>
adetunjii pushed a commit to adetunjii/valkey that referenced this pull request May 22, 2024
…to align the logic with sdsnewlen function. (valkey-io#476)

This patch try to correct the actual allocated size from allocator
when call sdsRedize to align the logic with sdsnewlen function.

Maybe the valkey-io#453 optimization
should depend on this.

Signed-off-by: Lipeng Zhu <[email protected]>
Signed-off-by: Samuel Adetunji <[email protected]>
poiuj added 3 commits June 3, 2024 15:14
zfree updates memory statistics. It gets the size of the buffer from
jemalloc by calling zmalloc_size. This operation is costly. We can
avoid it if we know the buffer size. For example, we can calculate
size of sds from the data we have in its header.

This commit introduces zfree_with_size function that accepts both
pointer to a buffer, and its size. zfree is refactored to call
zfree_with_size.

sdsfree uses the new interface for all but SDS_TYPE_5.

Signed-off-by: Vadym Khoptynets <[email protected]>
sdsalloc returns sds's length for SDS_TYPE_5. It's not correct for
SDS_TYPE_5.

This patch makes sdsAllocSize call zmalloc_size for
SDS_TYPE_5. sdsalloc is a lower level function that continues to
return length for SDS_TYPE_5.

Signed-off-by: Vadym Khoptynets <[email protected]>
Instead of zmalloc_size use s_malloc_size.

Signed-off-by: Vadym Khoptynets <[email protected]>
src/sds.c Show resolved Hide resolved
src/zmalloc.c Outdated Show resolved Hide resolved
src/zmalloc.c Outdated Show resolved Hide resolved
src/zmalloc.c Outdated Show resolved Hide resolved
@poiuj poiuj requested a review from ranshid June 11, 2024 08:46
Change the doc string for `zfree_with_size`

Signed-off-by: Vadym Khoptynets <[email protected]>
Signed-off-by: Vadym Khoptynets <[email protected]>
src/Makefile Outdated Show resolved Hide resolved
src/sds.c Show resolved Hide resolved
src/zmalloc.c Outdated Show resolved Hide resolved
src/zmalloc.c Outdated Show resolved Hide resolved
@poiuj poiuj requested a review from ranshid June 13, 2024 14:52
Copy link
Member

@ranshid ranshid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@madolson
Copy link
Member

Ok, I wasn't able to get as consistent of performance numbers. Diving into a perf analysis shows about a 20% reduction in CPU in sdsfree though. So it seems like there might be other bottlenecks, this is definitely a good optimization.

src/zmalloc.c Outdated Show resolved Hide resolved
Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good to me, I have a question about an alternative, but it's not a blocker from me. If you still want to go with this approach, I think it's OK to merge.

@madolson madolson added performance release-notes This issue should get a line item in the release notes labels Jun 18, 2024
@poiuj
Copy link
Contributor Author

poiuj commented Jun 18, 2024

Yes, let's go with this approach.

@enjoy-binbin enjoy-binbin changed the title Add zfree_with_size Add zfree_with_size to optimize sdsfree since we can get zmalloc_size from the header Jun 19, 2024
Copy link
Member

@enjoy-binbin enjoy-binbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, it is a good optimization.

src/zmalloc.c Outdated Show resolved Hide resolved
Signed-off-by: Madelyn Olson <[email protected]>
@madolson madolson merged commit 0143b7c into valkey-io:unstable Jun 19, 2024
19 checks passed
madolson pushed a commit that referenced this pull request Jun 25, 2024
Issue Introduced  by #453. 
When we check the SDS _TYPE_5 allocation size we mistakenly used
zmalloc_size which DOES take the PREFIX size into account when no
malloc_size support.
Later when we free we add the PREFIX_SIZE again which leads to negative
memory accounting on some tests.
Example test failure:
https://github.com/valkey-io/valkey/actions/runs/9654170962/job/26627901497

Signed-off-by: ranshid <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance release-notes This issue should get a line item in the release notes
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants