-
-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrapping labels over array boundaries #344
Conversation
Can one of the admins verify this patch? Admins can comment |
Hello! Converting 2D to nD is my bread and butter. 😃 From an API perspective, rather than a string, we should do (imho), Then, instead of:
You would do:
The only downside is that this won't wrap around corners, ie I think the code above suffers from #320 at the corners. That seems like a super tricky fix so I'd personally be happy to punt on that to a later PR, as it seems pretty niche. As a clue to a fix, I think that if you have multiple axes and you have a structure with connectivity>1, you would need to examine the wrapped faces from earlier axes and add wrapped faces to those faces. It gets confusing. 😅 But again, I'd do this in a later PR, and perhaps add a known failing test with pytest.mark.xfail for:
|
Awesome, thanks @jni! I threw a |
def label_adjacency_graph(labels, structure, nlabels, wrap=None): | ||
def set_tup_value(tup, idx, value): | ||
"""Return a copy of `tup` with `value` at `idx`.""" | ||
return tuple((elem if i == idx else value) for i, elem in enumerate(tup)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I changed this to ==
. I guess this comes down to how one thinks about wrapping. As it is now, setting wrap_axes=(0)
wraps labels across the boundary of the 0th axis, whereas !=
would wrap the array over the 0th axis and wrap labels across the boundary of the 1st axis. Could use either one really, depends on which is more intuitive I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm a bit confused here. To me, indicating the axes over which dask_image.ndmeasure.label
should consider the input array to wrap over is most intuitive :) (which is what I think this code does)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree. Just wanted to comment it since it differs from the suggestion of @jni.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suggestion from @jni was pretty abstract and untested and he well could have gotten a sign wrong somewhere 😅 Thanks @Holmgren825 and @m-albert! Sorry the next few weeks are very busy for me so I may not be able to do an in-depth review, but I'll try. Please ping me if there is a rush/you specifically want an extra pair of eyes on something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, indicating the axes over which dask_image.ndmeasure.label should consider the input array to wrap over is most intuitive :) (which is what I think this code does)
I had been wrong earlier in the sense that the code above should be saying !=
as jni suggested earlier. Because the idea is to only replace the element at index idx
and leave other elements unchanged (those with i != idx
).
Sorry the next few weeks are very busy for me so I may not be able to do an in-depth review, but I'll try. Please ping me if there is a rush/you specifically want an extra pair of eyes on something.
Thanks for your availability @jni :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, same here. I guess I got confused thinking about array axes and geographical axes for some reason. Fixed in 3aaeaca.
Thanks for submitting this PR @Holmgren825.
@jni I agree that it's a bit of a niche use-case, but I think the logic in Here's what I mean: To support wrap mode, this iteration could simply start at indices -1 instead of 0 for each axis indicated by As a comment that is independent of support for connectivity>1: probably it makes sense that all code for determining relevant chunk faces lives in the same place, i.e. currently within |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Holmgren825 I left some comments next to the code :)
faces = [] | ||
|
||
for face_slice in face_slices: | ||
faces.append(labels[face_slice]) | ||
|
||
if wrap_axes is not None: | ||
for ax in wrap_axes: | ||
none_slice = (slice(None),) * labels.ndim | ||
sl_back = set_tup_value(none_slice, ax, [-1]) | ||
sl_front = set_tup_value(none_slice, ax, [0]) | ||
faces.append( | ||
da.stack([labels[sl_back], labels[sl_front]], axis=ax).squeeze() | ||
) | ||
|
||
for face in faces: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic could live inside of _chunk_faces
by extending the existing implementation (tried to explain what I mean here #344 (comment)). Also, in this way all code determining the chunk faces to consider would live together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @m-albert! I agree that it would nicer to move this to _chunk_faces
, although I've struggled a bit to understand what's going on in it. One idea I had was that, in the main loop over the blocks, you could stack the bottom block on top when neigh_block[dim] >= numblocks[dim]
, but these slices do not wrap. So I went with a simpler approach and just moved the loop over the wrap_axes
to the end of _chunk_faces
, and added a slice that covers the corners of the array. This makes it pass the corner feature test case that previously failed. Lowering the connectivity to one for this case returns two features despite wrapping both axes, which I think is correct.
def test_label_wrap(a, a_res, wrap_axes): | ||
d = da.from_array(a, chunks=(5, 5)) | ||
|
||
s = np.ones((3, 3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As jni commented, using structuring elements with connectivity > 1 would lead to problems in the corners. scipy.ndimage.morphology.generate_binary_structure
is a nice convenience function for creating structuring elements.
def label_adjacency_graph(labels, structure, nlabels, wrap=None): | ||
def set_tup_value(tup, idx, value): | ||
"""Return a copy of `tup` with `value` at `idx`.""" | ||
return tuple((elem if i == idx else value) for i, elem in enumerate(tup)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm a bit confused here. To me, indicating the axes over which dask_image.ndmeasure.label
should consider the input array to wrap over is most intuitive :) (which is what I think this code does)
add to allowlist |
- Renaming `set_tup_value` to `get_slice_tuple` - Correcting `get_slice_tuple`. - Fixing corners higher dimension arrays. - Added test cases for 3d.
After some “field” testing on real data, I've further reworked how the
Maybe these tests are a bit overkill, they add a lot of hard coded arrays, but it is tricky to test the wrapping without it. |
Hey @Holmgren825, thank you for addressing the comments above in your follow-up commits. And for adding more tests, which is super useful (also including a 3D example).
Exactly, I was wondering how to best do this since dask doesn't support fancy indexing yet. I really like your elegant solution of defining slices with a step of
I completely understand how the code in |
Great! Sounds good to me. |
@Holmgren825 Cool, thank you. I pushed the changes. All the tests you had added are passing :) Next to the discussed logic inside of Please feel free to raise things you don't agree with or that should be commented more clearly. |
Amazing! This is a great turnaround speed for this feature — which is quite unique in the Python world??? Like, I don't think you can do this with straight up ndimage??? How cool is that??? 👏 |
Very exciting indeed 😁 I think further reviews are welcome at this point. Generally speaking, I guess for most of its functionality However here there's an additional feature introduced, which in this case might actually not only be relevant in the context of dask arrays, but also beyond as discussed here and here. Therefore IMHO there's a tiny bit of scope creep. But the feature is useful to the community and has a straight forward implementation thanks to the chunked labelling functionality already living in |
Gentle ping @jakirkham @GenevieveBuckley. Would you have opinions or review comments on this PR? |
Just writing to say I have a strong opinion that this should go in. 😂 |
Time to take some action here :) I think this PR is ready. In case there are no additional comments or objections in the next days, I'd merge this later this week. |
Merged 🎉 Thank you @Holmgren825 for your contribution to dask-image and implementing this new feature that extends functionality in Thanks to @jni for the groundwork, reviewing and cheerful comments! Thanks also to @rabernat for starting the conversation on this. |
Thanks Marvin for shepherding this in! Also thanks Juan for reviewing! And of course thanks Erik for reaching out and sharing this contribution with us Hopefully it helps others with similar needs 🙂 |
# get neighbor slice index | ||
ind_neigh_block = block_summary[tuple(neigh_block)] | ||
ind_curr_block = block_summary[tuple(curr_block)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a small change to this comment in PR: #353
I open this to get the discussion started on my hacky implementation of wrapping labels over array boundaries, as mentioned in #189. Currently, it is possible to wrap labels across one, or both, boundaries of a 2d array. This is relatively simple, since one can treat the “wrap” as another face in the
label_adjacency_graph
. However, there are some obvious things that need to be solved before merged:There are probably more things to figure out.