v2.1: Refactors get_snapshot_storages() (backport of #3760) #3785

mergify · 2024-11-25T21:18:24Z

Problem

AccountsDb::get_snapshot_storages() is due for a refactor. We can speed it up for the common case, and also simplify.

Since #3737, we now call get_snapshot_storages() every time we clean. The observation here is we'll only need about 100 storages (on average), yet the current impl of get_snapshot_storages() Arc::clone's all the storages, and then filters out the unneeded ones. We can change this to only Arc::clone the useful ones instead.

Additionally, the filter step is done in parallel. When we only have 100 storages, the parallel execution does not help. In fact, with a chunk size of 5000, we end up getting zero benefit, but do have to pay the cost of running in the thread pool.

Summary of Changes

When getting storages, only Arc::clone the ones we need
When filtering, do not use a thread pool

Results

I ran this on against mnb and saw good results.

Since get_snapshot_storages() is called in two places, I wanted to look at the perf results in both.

clean

Here, we only need ~100 storages each time. So not Arc::cloning and not using the thread pool really helps. The PR ends up running consistently, and meaningfully, faster:

branch	get storages	filter	total
master	30-50 ms	100-400 us	30-50 ms
pr v1	11-13 ms	10-30 us	11-13 ms
pr v2	n/a	n/a	7-9 ms
pr v3	n/a	n/a	6-8 ms

taking full snapshots

Full snapshots do need all the storages, so we will end up Arc::cloning almost all the storages. And it's possible the thread pool does help here. For the most part, runtimes are pretty similar. The PR does have a worse worst-case filter time.

branch	get storages	filter	total
master	30-50 ms	30-40 ms	60-90 ms
pr v1	30-50 ms	30-50 ms	60-100 ms
pr v2	n/a	n/a	40-60 ms
pr v3	n/a	n/a	40-60 ms

Overall, I think the common case of clean makes this change clearly a win. There is maybe a slight slow down for full snapshots, but that code is both infrequent, and in the background, so I don't think it matters much. Additionally, by not using a thread pool, we may reduce resource usage for the system as a whole.

Justification to Backport

As per the Problem section, we now call get_snapshot_storages() during clean. Until the skipping rewrites feature is enabled, that means we'll be doing this extra work. The feature is in v2.1, and so mnb on v2.0 and v2.1 will be paying the extra cost. Thus, improving the performance for get_snapshot_storages() benefits the validator as a whole.

In theory we could also backport to v2.0 as well, but I'm not currently advocating for that yet.

This is an automatic backport of pull request #3760 done by [Mergify](https://mergify.com).

brooksprumo · 2024-11-26T15:06:52Z

Blocked until the version bump pr (#3782) is merged.

brooksprumo · 2024-11-26T17:06:49Z

@Mergifyio rebase

mergify · 2024-11-26T17:06:58Z

rebase

❌ Base branch update has failed

brooksprumo token is invalid, make sure brooksprumo can still log in on the Mergify dashboard.

brooksprumo · 2024-11-26T17:07:34Z

@Mergifyio rebase

(cherry picked from commit 8c7ae80)

mergify · 2024-11-26T17:08:08Z

rebase

✅ Branch has been successfully rebased

brooksprumo · 2024-11-26T17:48:48Z

Blocked until ~~#3791~~ #3800 is merged, to addess the advisory.

mergify bot requested a review from a team as a code owner November 25, 2024 21:18

mergify bot assigned brooksprumo Nov 25, 2024

Refactors get_snapshot_storages() (#3760)

7d7f7af

(cherry picked from commit 8c7ae80)

brooksprumo force-pushed the mergify/bp/v2.1/pr-3760 branch from 3976744 to 7d7f7af Compare November 26, 2024 17:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v2.1: Refactors get_snapshot_storages() (backport of #3760) #3785

v2.1: Refactors get_snapshot_storages() (backport of #3760) #3785

mergify bot commented Nov 25, 2024 •

edited by brooksprumo

Loading

brooksprumo commented Nov 26, 2024

brooksprumo commented Nov 26, 2024

mergify bot commented Nov 26, 2024

brooksprumo commented Nov 26, 2024

mergify bot commented Nov 26, 2024

brooksprumo commented Nov 26, 2024 •

edited

Loading

v2.1: Refactors get_snapshot_storages() (backport of #3760) #3785

Are you sure you want to change the base?

v2.1: Refactors get_snapshot_storages() (backport of #3760) #3785

Conversation

mergify bot commented Nov 25, 2024 • edited by brooksprumo Loading

Problem

Summary of Changes

Results

Justification to Backport

brooksprumo commented Nov 26, 2024

brooksprumo commented Nov 26, 2024

mergify bot commented Nov 26, 2024

❌ Base branch update has failed

brooksprumo commented Nov 26, 2024

mergify bot commented Nov 26, 2024

✅ Branch has been successfully rebased

brooksprumo commented Nov 26, 2024 • edited Loading

mergify bot commented Nov 25, 2024 •

edited by brooksprumo

Loading

brooksprumo commented Nov 26, 2024 •

edited

Loading