Currently, Datumaro only supports images as a media type; however, many other media types are used in computer vision datasets. Moreover, OpenVINO is not limited to vision tasks and includes, for instance, NLP models. For Datumaro, this is an essential direction of growth.
Tasks:
- Make a better separation between modules with respect to the supported media type. For example, there are a number of operations that can only work with (and make sense for) image datasets. They need to be clearly distinguished in the API and should not be applicable to other media types
- Consider interaction with annotation types. It is clear that not every annotation type is applicable to every media type. The question is how to differentiate them and whether it is needed at all
- Allow API users to provide their own media types
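
To make the first and last tasks more concrete, here is a minimal sketch of how an image-only operation could reject other media types while still letting users define their own. Everything in it (`MediaElement`, `Image`, `Audio`, `requires_media`, `list_image_paths`) is a hypothetical stand-in defined locally for illustration, not Datumaro's actual API.

```python
import functools
from typing import Callable, Iterable, List, Type


class MediaElement:
    """Hypothetical base class for all media types."""

    def __init__(self, path: str) -> None:
        self.path = path


class Image(MediaElement):
    """Stand-in for a built-in image media type."""


class Audio(MediaElement):
    """A user-defined media type: in this sketch, subclassing is enough."""


def requires_media(expected: Type[MediaElement]) -> Callable:
    """Mark an operation as valid only for datasets of the given media type."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(media_type: Type[MediaElement], items: Iterable, *args, **kwargs):
            # Refuse to run on a dataset whose media type does not match.
            if not issubclass(media_type, expected):
                raise TypeError(
                    f"{func.__name__} requires {expected.__name__} datasets, "
                    f"got {media_type.__name__}"
                )
            return func(media_type, items, *args, **kwargs)

        return wrapper

    return decorator


@requires_media(Image)
def list_image_paths(media_type: Type[MediaElement], items: Iterable[MediaElement]) -> List[str]:
    """An operation that only makes sense for image datasets."""
    return [item.path for item in items]
```

With this shape, `list_image_paths(Image, ...)` works, while `list_image_paths(Audio, ...)` fails early with a `TypeError` instead of producing meaningless results.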
Transition period tasks:
In the first step, we make `Image` the default media type, which can be changed. However, the default needs to be changed to undefined later: remove the default `media_type` value in the `Extractor` constructor and require this information from the `Extractor` (and other `IDataset` children). The extractors and tests also need the corresponding changes.
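
The two stages of the transition could look roughly like the sketch below. `SketchExtractor` and its `strict` flag are hypothetical names used only to show both stages side by side; the real `Extractor` constructor may differ.

```python
import warnings
from typing import Optional, Type


class MediaElement:
    """Hypothetical base class for media types."""


class Image(MediaElement):
    pass


class PointCloud(MediaElement):
    pass


class SketchExtractor:
    """Illustrative stand-in for an extractor, not the real Extractor class."""

    def __init__(
        self,
        media_type: Optional[Type[MediaElement]] = None,
        *,
        strict: bool = False,  # hypothetical switch between the two stages
    ) -> None:
        if media_type is None:
            if strict:
                # Final stage: there is no default, the caller must say what it produces.
                raise TypeError("media_type is required")
            # Transition stage: fall back to Image, but warn about it.
            warnings.warn(
                "media_type will become mandatory; defaulting to Image for now",
                DeprecationWarning,
                stacklevel=2,
            )
            media_type = Image
        self._media_type = media_type

    def media_type(self) -> Type[MediaElement]:
        return self._media_type
```

During the transition, `SketchExtractor()` still works and reports `Image`; once the default is removed, only explicit calls such as `SketchExtractor(media_type=PointCloud)` remain valid.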
zhiltsov-max changed the title: Need to change from media_type=Image to media_type=None (default value) / Support different media types (Feb 22, 2022)
- Added `DatasetItem.media` to replace dedicated members for each media type
- Added the `PointCloud` media type
- Added the `media_type()` method to `Extractor`s
- Added merging for all media types; mixed media types within an item or in the dataset produce an error
- Datasets can't have mixed media types in items. If such a situation occurs, an error is raised (checked during dataset caching/iteration)
- Datasets can't change media type using transforms
- Extractors must report their media type with the `media_type()` method
- Added a new mandatory `media_type` argument to `Dataset.from_iterable`. It has a default value of `Image` for the transition period (to be tracked in #675).
- Deprecated `DatasetItem.image`, `.related_images`, `.point_cloud`, `save-images` and `require_images`
- Added deprecation messages about annotation classes in `components.extractor`
- Suppressed Datumaro deprecation messages when using Datumaro from CLI
Co-authored-by: yasakova-anastasia
Related #129
Related #135
Related #136
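
For readers who want to see what the "no mixed media types" rule from the commit message amounts to, here is a minimal sketch of a check performed lazily during iteration. `Item`, `MediaTypeError` and `checked_items` are hypothetical names, and allowing items without media is an assumption of this sketch rather than something the commit message states.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Type


class MediaElement:
    """Hypothetical base class for media types."""

    def __init__(self, path: str) -> None:
        self.path = path


class Image(MediaElement):
    pass


class PointCloud(MediaElement):
    pass


@dataclass
class Item:
    """Stand-in for a dataset item with a single `media` member."""

    id: str
    media: Optional[MediaElement] = None


class MediaTypeError(Exception):
    """Raised when an item's media does not match the dataset media type."""


def checked_items(items: Iterable[Item], media_type: Type[MediaElement]) -> Iterator[Item]:
    """Yield items, failing on the first one whose media has the wrong type."""
    for item in items:
        # Assumption: items without media are allowed and skip the check.
        if item.media is not None and not isinstance(item.media, media_type):
            raise MediaTypeError(
                f"item '{item.id}': expected {media_type.__name__}, "
                f"got {type(item.media).__name__}"
            )
        yield item
```

Because the check runs inside a generator, the error surfaces only when the offending item is reached, which mirrors the "checked during dataset caching/iteration" wording above.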