batch_encode_plus doesn't work correctly #1704

Open
tempdeltavalue opened this issue Dec 18, 2024 · 1 comment

Comments

tempdeltavalue commented Dec 18, 2024

Code is here:
https://github.com/tempdeltavalue/temp_l/blob/main/finetune_seq2seq.ipynb

Screenshot 2024-12-18 150037
Screenshot 2024-12-18 150203

https://discuss.huggingface.co/t/repetitive-words-in-model-output/132085/2

tempdeltavalue changed the title from "batch_encode doesn't work correctly" to "batch_encode_plus doesn't work correctly" on Dec 18, 2024
tempdeltavalue (Author) commented

The same happens with tokenizer() batch encoding:
Screenshot 2024-12-18 151941
Screenshot 2024-12-18 152002
