Optimize chained hyperslab selection. #1031

Merged Jul 26, 2024 (2 commits)

@1uc 1uc commented Jul 25, 2024

A common pattern for creating semi-unstructured selections is to use many
(small) RegularHyperSlabs and chain them:

HyperSlab hyperslab;
for (const auto& slab : regular_hyper_slabs) {
  hyperslab |= slab;
}

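This eventually triggers one HDF5 call per slab (pseudocode):

for (const auto& slab : regular_hyper_slabs) {
  auto [offset, stride, counts, blocks] = slab;
  // Each call ORs one more slab into the selection already stored in space_id.
  H5Sselect_hyperslab(space_id, H5S_SELECT_OR, offset, stride, counts, blocks);
}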

Measurements show that the runtime of this loop is quadratic in the number
of regular hyperslabs. It starts becoming prohibitive at 10k - 40k
slabs.

We noticed that H5Scombine_select does not suffer from the same
performance issue. This allows us to optimize (long) chains of Op::Or
using divide and conquer.
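
To illustrate the divide-and-conquer idea, here is a minimal, self-contained sketch. It deliberately does not use HDF5: `Selection` and `combine_or` are hypothetical stand-ins (a `std::set` of indices and a set union) for a dataspace selection and for a pairwise OR such as `H5Scombine_select` with `H5S_SELECT_OR`. The name `reduce_or` is also illustrative and is not claimed to match the code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a dataspace selection: the set of selected indices.
// In real code this role would be played by an HDF5 dataspace handle.
using Selection = std::set<std::size_t>;

// Hypothetical stand-in for a pairwise OR of two selections.
Selection combine_or(const Selection& a, const Selection& b) {
    Selection out = a;
    out.insert(b.begin(), b.end());
    return out;
}

// OR together slabs[begin, end) by splitting the range in half,
// reducing each half recursively, and combining the two results.
Selection reduce_or(const std::vector<Selection>& slabs,
                    std::size_t begin,
                    std::size_t end) {
    assert(begin < end);  // the range must not be empty
    if (end - begin == 1) {
        return slabs[begin];
    }
    const std::size_t mid = begin + (end - begin) / 2;
    return combine_or(reduce_or(slabs, begin, mid),
                      reduce_or(slabs, mid, end));
}
```

With n slabs the recursion has depth of about log2(n), so each slab takes part in roughly log2(n) combine steps instead of being added one at a time to a single, ever-growing selection.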

The current implementation only optimizes streaks of Op::Or.
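
As a rough illustration of what a "streak" means here, the sketch below finds maximal runs of consecutive `Op::Or` entries in a list of operations; only such runs would be candidates for the optimized path. The `Op` enum and `find_or_streaks` are hypothetical names chosen for this example and are not taken from the HighFive sources.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical operation tags, loosely modelled on the `Op::Or` named above.
enum class Op { Set, Or, And, Xor };

// Return half-open index ranges [first, last) covering each maximal run
// of consecutive Op::Or entries in `ops`.
std::vector<std::pair<std::size_t, std::size_t>>
find_or_streaks(const std::vector<Op>& ops) {
    std::vector<std::pair<std::size_t, std::size_t>> streaks;
    std::size_t i = 0;
    while (i < ops.size()) {
        if (ops[i] != Op::Or) {
            ++i;  // not part of a streak, skip it
            continue;
        }
        const std::size_t first = i;
        while (i < ops.size() && ops[i] == Op::Or) {
            ++i;  // extend the streak as far as it goes
        }
        streaks.emplace_back(first, i);
    }
    return streaks;
}
```

For the sequence `Set, Or, Or, Or, And, Or, Or` this yields the ranges `[1, 4)` and `[5, 7)`; the `Set` and `And` entries fall outside any streak and would be applied one by one.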

@1uc 1uc force-pushed the 1uc/optimize-hyperslab-selection branch 2 times, most recently from e7cf94a to bb47941 Compare July 25, 2024 14:45

codecov bot commented Jul 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.88%. Comparing base (8145c27) to head (94727b8).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1031      +/-   ##
==========================================
+ Coverage   86.78%   86.88%   +0.09%     
==========================================
  Files         101      101              
  Lines        5964     6008      +44     
==========================================
+ Hits         5176     5220      +44     
  Misses        788      788              

@1uc 1uc marked this pull request as ready for review July 25, 2024 15:30
@1uc 1uc force-pushed the 1uc/optimize-hyperslab-selection branch from 6b7511b to fc616da Compare July 25, 2024 15:37

@jorblancoa jorblancoa left a comment

LGTM!
Did you test it with libsonata to confirm that the regression is fixed?

1uc commented Jul 26, 2024

Yes, they provided the following reproducer:

import libsonata
import numpy as np
import time

np.random.seed(42)

sto = libsonata.NodeStorage('sscx-nodes-sonata.h5')
pop = sto.open_population('All')

#ids = np.arange(0, 100000, 2)
count = int(0.01*pop.size)
# count = 100
print(f'selecting {count} from {pop.size}')
ids = np.random.randint(0, pop.size, count)

t1 = time.perf_counter()
sel = libsonata.Selection(ids)
t2 = time.perf_counter()
print(f"elapsed = {t2 - t1}")

print(np.mean(pop.get_attribute('x', sel)))

The selection results in about 41k slabs, which takes 40s with the performance bug and 0.06 - 0.1s with libsonata@master and highfive@1uc/backport-optimize-hyperslab-selection. It takes 0.1 - 0.16s for [email protected].

We also ran their integration tests against the backport of this branch:
https://github.com/BlueBrain/HighFive-testing/actions/runs/10108668331

1uc added 2 commits July 26, 2024 13:11
@1uc 1uc force-pushed the 1uc/optimize-hyperslab-selection branch from 82e9b9a to 94727b8 Compare July 26, 2024 11:11
@1uc 1uc merged commit e9492c1 into master Jul 26, 2024
37 checks passed
@1uc 1uc deleted the 1uc/optimize-hyperslab-selection branch July 26, 2024 12:02