do not try to infer the document_type from the function in (Iterable)Dataset.map() #337

Merged: 1 commit merged into main from dont_infer_doc_type_in_dataset_map on Sep 13, 2023


ArneBinder
Owner

@ArneBinder ArneBinder commented Sep 13, 2023

With this PR, we re-use the original document type when calling (Iterable)Dataset.map() if no result_document_type is provided. This clarifies the semantics: the document type changes only when a new one is explicitly provided. We also no longer rely on (possibly wrong) return type annotations, and we can pass generic functions to map() that do not change the document type at all (e.g. trimming spans), which is quite a common case.

This is a breaking change when map() is called with a function whose annotated return type differs from the original document type of the dataset. Such calls previously picked up the annotated type; they now need to pass result_document_type explicitly.
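
To illustrate the intended semantics, here is a minimal sketch. It does not use the actual pytorch-ie classes: `ToyDataset`, `TextDocument`, `LabeledDocument`, `strip_text` and `add_label` are hypothetical stand-ins invented for this example. Only the `result_document_type` parameter name is taken from this PR.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Type


@dataclass
class TextDocument:
    text: str


@dataclass
class LabeledDocument(TextDocument):
    label: str = "unknown"


class ToyDataset:
    """Hypothetical stand-in for a document dataset (not the pytorch-ie class)."""

    def __init__(self, documents: List[TextDocument], document_type: Type[TextDocument]):
        self.documents = documents
        self.document_type = document_type

    def map(
        self,
        function: Callable[[TextDocument], TextDocument],
        result_document_type: Optional[Type[TextDocument]] = None,
    ) -> "ToyDataset":
        mapped = [function(doc) for doc in self.documents]
        # Keep the original document type unless a new one is passed explicitly.
        # The return annotation of `function` is never inspected.
        new_type = (
            result_document_type
            if result_document_type is not None
            else self.document_type
        )
        return ToyDataset(mapped, new_type)


def strip_text(doc):
    # Generic, unannotated function that does not change the document type.
    return type(doc)(text=doc.text.strip())


def add_label(doc: TextDocument) -> LabeledDocument:
    # The annotated return type differs from the dataset's document type.
    return LabeledDocument(text=doc.text, label="greeting")


dataset = ToyDataset([TextDocument(text="  hello  ")], TextDocument)

trimmed = dataset.map(strip_text)  # document type stays TextDocument
implicit = dataset.map(add_label)  # annotation ignored: still TextDocument
explicit = dataset.map(add_label, result_document_type=LabeledDocument)
```

In this sketch, the `implicit` call shows the breaking case: the annotation on `add_label` has no effect, so the caller has to pass `result_document_type` to change the dataset's document type.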

@ArneBinder ArneBinder added the breaking Breaking Changes label Sep 13, 2023
ArneBinder added a commit to ArneBinder/pytorch-ie-hydra-template-1 that referenced this pull request Sep 13, 2023
@ArneBinder ArneBinder merged commit b78f7fc into main Sep 13, 2023
@ArneBinder ArneBinder deleted the dont_infer_doc_type_in_dataset_map branch September 13, 2023 18:24
ArneBinder added a commit to ArneBinder/pytorch-ie-hydra-template-1 that referenced this pull request Sep 13, 2023
* use pytorch-ie@dont_infer_doc_type_in_dataset_map

* remove result_document_type because of ArneBinder/pytorch-ie#337

* add comments that reference the source code

* use pytorch-ie 0.24.0 since it was released with ArneBinder/pytorch-ie#337