save entries of token_offset_mapping as lists in document metadata #356

Merged (2 commits) on Oct 5, 2023

ArneBinder (Owner) commented Oct 5, 2023

... to fix deserialization. Serializing and deserializing token_offset_mapping produces lists of lists in any case (JSON has no tuple type), so we save the entries (start and end offset) as lists in the first place. Otherwise, equality checks against deserialized documents fail.
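
A minimal sketch of the underlying behavior, using only the standard `json` module. The metadata dicts and the `roundtrip` helper are hypothetical illustrations, not code from this PR; only the `token_offset_mapping` key name comes from the description above.

```python
import json

# Hypothetical metadata as it might look with tuple entries vs. list entries.
offsets_as_tuples = {"token_offset_mapping": [(0, 5), (6, 11)]}
offsets_as_lists = {"token_offset_mapping": [[0, 5], [6, 11]]}


def roundtrip(metadata):
    """Serialize to JSON and load it back (illustrative helper)."""
    return json.loads(json.dumps(metadata))


# JSON arrays always come back as Python lists, and a tuple never
# compares equal to a list, so the tuple variant does not survive.
tuples_survive = roundtrip(offsets_as_tuples) == offsets_as_tuples
lists_survive = roundtrip(offsets_as_lists) == offsets_as_lists

print(tuples_survive, lists_survive)
```

Storing the offsets as lists up front means the in-memory value already has the shape that deserialization will return, so a comparison between the original and the reloaded metadata holds.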

ArneBinder added the bug (Something isn't working) label Oct 5, 2023
ArneBinder changed the title from "save entries of token_offset_mapping as lists in metadata to fix deserialization" to "save entries of token_offset_mapping as lists in metadata to fix deserialization" Oct 5, 2023
ArneBinder changed the title from "save entries of token_offset_mapping as lists in metadata to fix deserialization" to "save entries of token_offset_mapping as lists in document metadata" Oct 5, 2023
ArneBinder merged commit 72c2f89 into main Oct 5, 2023
6 checks passed
ArneBinder deleted the token_offset_mapping_lists branch October 5, 2023 18:09