-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEAT] ls/list_dir for AzureBlobStorage #1408
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #1408 +/- ##
==========================================
- Coverage 74.66% 74.64% -0.02%
==========================================
Files 60 60
Lines 6042 6042
==========================================
- Hits 4511 4510 -1
- Misses 1531 1532 +1 |
src/daft-io/src/azure_blob.rs
Outdated
.map(|container| { | ||
Ok(self._container_to_file_metadata(protocol.clone(), container)) | ||
}) | ||
.collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to collect this if are you wrapping futures::stream::iter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, though I had to move all the arguments into the stream to do this; otherwise references escape the closure into the returned stream.
src/daft-io/src/azure_blob.rs
Outdated
// https://docs.rs/azure_storage_blobs/0.15.0/azure_storage_blobs/container/operations/list_blobs/struct.ListBlobsBuilder.html | ||
// https://docs.rs/azure_storage_blobs/0.15.0/azure_storage_blobs/service/operations/struct.ListContainersBuilder.html | ||
// https://docs.rs/azure_core/0.15.0/azure_core/struct.Pageable.html | ||
assert!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return an error here instead rather than asserting
src/daft-io/src/azure_blob.rs
Outdated
blob_item, | ||
)) | ||
}) | ||
.collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also no need to collect here if we are feeding futures::stream::iter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you could also just use then
and return the stream directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source here is an iterator not a stream, so I can't use then
. Removed the collect though!
src/daft-io/src/azure_blob.rs
Outdated
.delimiter(delimiter.to_string()) | ||
.prefix(upper_dir.to_string()) | ||
.into_stream() | ||
.try_collect::<Vec<_>>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Since the stream items are falliable Result<T>
and I want to propagate the error, I can't use any
.
Unfortunately, there is not a try_any
in TryStreamExt either. Instead, I implemented what would have been try_any
with .try_skip_while()
and try_next()
.
src/daft-io/src/azure_blob.rs
Outdated
upper_results_stream | ||
.map_ok(|file_info| (file_info.filepath == full_path_with_trailing_delimiter)) | ||
.try_skip_while(|is_match| futures::future::ready(Ok(!is_match))) | ||
.boxed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dont think you need to box
this to get the value
File listing for AzureBlobStorage via list_blobs(prefix) API. Closes
Manually checked:
Implementation notes:
az://
andabfs://
). Whichever protocol used is persisted.protocol://container/path-part/file
andprotocol://[email protected]/path-part/file
), but all results are translated to the first format.