-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize chained hyperslab selection. #1031
Conversation
e7cf94a
to
bb47941
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1031 +/- ##
==========================================
+ Coverage 86.78% 86.88% +0.09%
==========================================
Files 101 101
Lines 5964 6008 +44
==========================================
+ Hits 5176 5220 +44
Misses 788 788 ☔ View full report in Codecov by Sentry. |
6b7511b
to
fc616da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Did you test it with libsonata to confirm that the regression is fixed?
Yes, they provided the following reproducer:
The selection results in about 41k slabs, which takes 40s with the performance bug and 0.06 - 0.1s with We also ran their integration tests against the backport of this branch: |
A common pattern for creating semi-unstructured selection is to use many (small) RegularHyperSlab and chain them: ``` HyperSlab hyperslab; for(auto slab : regular_hyper_slabs) { hyperslab |= slab; } ``` This eventually triggers calling: ``` for(auto slab : regular_hyper_slabs) { auto [offset, stride, counts, blocks] = slab; H5Sselect_hyperslab(space_id, offset, stride, counts, block); } ``` Measurements show that this has runtime that's quadratic in the number of regular hyper slabs. This starts becoming prohibitive at 10k - 40k slabs. We noticed that `H5Scombine_select` does not suffer from the same performance issue. This allows us to optimize (long) chain of `Op::Or` using divide and conquer. The current implementation only optimizes streaks of `Op::Or`.
82e9b9a
to
94727b8
Compare
A common pattern for creating semi-unstructured selection is to use many
(small) RegularHyperSlab and chain them:
This eventually triggers calling:
Measurements show that this has runtime that's quadratic in the number
of regular hyper slabs. This starts becoming prohibitive at 10k - 40k
slabs.
We noticed that
H5Scombine_select
does not suffer from the sameperformance issue. This allows us to optimize (long) chain of
Op::Or
using divide and conquer.
The current implementation only optimizes streaks of
Op::Or
.