do not try to infer the `document_type` from the function in `(Iterable)Dataset.map()` #337

ArneBinder · 2023-09-13T17:42:33Z

With this PR, `(Iterable)Dataset.map()` re-uses the original document type if no `result_document_type` is provided. This clarifies the semantics: the document type does not change unless a new one is explicitly provided. As a result, we no longer rely on (possibly wrong) return type annotations, and we can pass generic functions to `map()` that do not change the document type at all (e.g. trimming spans), which is quite a common case.

This is a breaking change when `map()` is called with a function whose annotated return type differs from the original document type of the dataset. In that case, pass `result_document_type` explicitly to get the new document type.
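The semantics described above can be illustrated with a small, self-contained sketch. Note that `ToyDataset`, `TextDocument`, `TokenizedDocument`, `strip_text` and `tokenize` are hypothetical names made up for this illustration; this is not the pytorch-ie implementation, only a toy model of the rule "keep the document type unless `result_document_type` is given".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextDocument:
    text: str


@dataclass
class TokenizedDocument(TextDocument):
    tokens: Optional[List[str]] = None


class ToyDataset:
    """Hypothetical stand-in for a document dataset (not pytorch-ie code)."""

    def __init__(self, documents, document_type):
        self.documents = list(documents)
        self.document_type = document_type

    def map(self, function, result_document_type=None):
        # The return annotation of `function` is deliberately ignored:
        # the document type only changes when the caller asks for it.
        if result_document_type is None:
            result_document_type = self.document_type
        mapped = [function(doc) for doc in self.documents]
        return ToyDataset(mapped, result_document_type)


def strip_text(doc):
    # Generic, un-annotated function that keeps the document type.
    return type(doc)(text=doc.text.strip())


def tokenize(doc: TextDocument) -> TokenizedDocument:
    # Annotated with a return type that differs from the input type.
    return TokenizedDocument(text=doc.text, tokens=doc.text.split())


dataset = ToyDataset([TextDocument("  hello world ")], TextDocument)

# Common case: the document type is simply re-used.
stripped = dataset.map(strip_text)

# Breaking case: the annotation alone no longer changes the document type.
implicit = stripped.map(tokenize)

# Migration: state the new document type explicitly.
explicit = stripped.map(tokenize, result_document_type=TokenizedDocument)
```

In this sketch, `implicit.document_type` stays `TextDocument` even though `tokenize` is annotated to return `TokenizedDocument`; only the `explicit` call records the new type.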


* use pytorch-ie@dont_infer_doc_type_in_dataset_map
* remove result_document_type because of ArneBinder/pytorch-ie#337
* add comments that reference the source code
* use pytorch-ie 0.24.0 since it was released with ArneBinder/pytorch-ie#337

do not try to infer the document_type from the function in (Iterable)Dataset.map()

7abb185

ArneBinder added the breaking (Breaking Changes) label Sep 13, 2023

ArneBinder added a commit to ArneBinder/pytorch-ie-hydra-template-1 that referenced this pull request Sep 13, 2023

remove result_document_type because of ArneBinder/pytorch-ie#337

1ee1e3b

ArneBinder mentioned this pull request Sep 13, 2023

don't infer result_document_type in (Iterable)Dataset.map() ArneBinder/pytorch-ie-hydra-template-1#126

Merged

ArneBinder merged commit b78f7fc into main Sep 13, 2023

ArneBinder deleted the dont_infer_doc_type_in_dataset_map branch September 13, 2023 18:24

ArneBinder mentioned this pull request Sep 13, 2023

_infer_document_type_from_function_return() should return None instead of raising an exception #295

Closed

ArneBinder added a commit to ArneBinder/pytorch-ie-hydra-template-1 that referenced this pull request Sep 13, 2023

use pytorch-ie 0.24.0 since it was released with ArneBinder/pytorch-ie#337

4e65a0f
