-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use faster yaml loader and better delayed loading #16
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,7 +36,9 @@ class SldyImage: | |
_data_paths: typing.Set[pathlib.Path] = set() | ||
|
||
@staticmethod | ||
def _yaml_mapping(loader: yaml.Loader, node: yaml.Node, deep: bool = False) -> dict: | ||
def _yaml_mapping( | ||
loader: yaml.CLoader, node: yaml.Node, deep: bool = False | ||
) -> dict: | ||
""" | ||
Static method intended to map key-value pairs found in image | ||
metadata yaml files to Python dictionaries. | ||
|
@@ -108,7 +110,7 @@ def _get_yaml_contents( | |
""" | ||
try: | ||
with fs.open(yaml_path) as f: | ||
return yaml.load(f, Loader=yaml.Loader) | ||
return yaml.load(f, Loader=yaml.CLoader) | ||
except FileNotFoundError: | ||
if is_required: | ||
raise | ||
|
@@ -171,7 +173,7 @@ def __init__( | |
yaml.add_constructor( | ||
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, | ||
SldyImage._yaml_mapping, | ||
yaml.Loader, | ||
yaml.CLoader, | ||
) | ||
|
||
self._fs = fs | ||
|
@@ -219,6 +221,14 @@ def __init__( | |
self.timepoints = sorted(self._timepoint_to_data_paths.keys()) | ||
self.channels = sorted(self._channel_to_data_paths.keys()) | ||
|
||
self.sizeT = self._image_record["CImageRecord70"]["mNumTimepoints"] | ||
self.sizeC = self._image_record["CImageRecord70"]["mNumChannels"] | ||
self.sizeZ = self._image_record["CImageRecord70"]["mNumPlanes"] | ||
self.sizeY = self._image_record["CImageRecord70"]["mHeight"] | ||
self.sizeX = self._image_record["CImageRecord70"]["mWidth"] | ||
# TODO check this but are all sldys uint16? bioformats seems to say so | ||
self.dtype = np.dtype(np.uint16) | ||
Comment on lines
+224
to
+230
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Quick check: have we found any files that have pixel data <-> metadata conflicts? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I assume you are putting these values into their own attributes because we don't have a system in place for storing pre-read / pre-load dims? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, I made a ticket in bioio-base about this. Even if the xarray delayed setup is "fast", it would be still faster to return dims from metadata. We had czi files that had pixel data size different from metadata size but as I recall it was generally because of unexpected termination of the image capture. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should just move to standardize this sooner rather than later and create a The docs for the attr should just mention that we have no guarantee that the dims are correct due to "early stopping of image acquisition" |
||
|
||
@property | ||
def metadata(self) -> typing.Dict[str, typing.Optional[dict]]: | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were previously loading the whole file one timepoint+channel pair at a time??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YES 😢 at least it was in memmap mode but still.... the test file is pretty big. Each one of these calls had a noticeable hitch of less than a second but with the delayed code they return basically immediately.