Commit

Added a few more test cases in test_encode_decode_round_trip and modified the slow tokenizer's mask token (mask_token) to be an AddedToken instance with lstrip=True
Kokane authored and nileshkokane01 committed Nov 30, 2023
1 parent acd4276 commit 99deea6
Showing 2 changed files with 341 additions and 40 deletions.
2 changes: 1 addition & 1 deletion src/transformers/models/rembert/tokenization_rembert.py
@@ -112,7 +112,7 @@ def __init__(
**kwargs,
):
# Mask token behave like a normal word, i.e. include the space before it
- mask_token = AddedToken("[MASK]", lstrip=True, rstrip=False, normalized=False)
+ mask_token = AddedToken(mask_token, lstrip=True, rstrip=False) if isinstance(mask_token, str) else mask_token

self.do_lower_case = do_lower_case
self.remove_space = remove_space
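
The changed line wraps a plain-string mask token in an AddedToken with lstrip=True and leaves an already-constructed AddedToken untouched. A minimal sketch of that pattern follows. It assumes a simplified stand-in `AddedToken` dataclass so the snippet runs without the library installed, and `normalize_mask_token` is a hypothetical helper name that does not appear in the commit.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AddedToken:
    """Simplified stand-in for the real AddedToken class (an assumption for
    illustration, not the library's definition)."""

    content: str
    lstrip: bool = False
    rstrip: bool = False


def normalize_mask_token(mask_token: Union[str, AddedToken]) -> AddedToken:
    """Hypothetical helper mirroring the line added in the diff."""
    # A plain string is wrapped so the token absorbs the space before it;
    # an existing AddedToken is returned unchanged, keeping the caller's flags.
    if isinstance(mask_token, str):
        return AddedToken(mask_token, lstrip=True, rstrip=False)
    return mask_token


# A string argument is wrapped with lstrip=True.
wrapped = normalize_mask_token("<mask>")

# A caller-supplied AddedToken passes through as the same object.
custom = AddedToken("[MASK]", lstrip=False, rstrip=True)
passed_through = normalize_mask_token(custom)
```

Compared with the removed line, which always built the token from the literal "[MASK]", this form respects whatever mask token the caller passes in.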