Create tokenizer_config.json
Stardust-minus authored Sep 23, 2023
1 parent df1070b commit c1a264f
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions bert/bert-large-japanese-v2/tokenizer_config.json
@@ -0,0 +1,10 @@
{
    "tokenizer_class": "BertJapaneseTokenizer",
    "model_max_length": 512,
    "do_lower_case": false,
    "word_tokenizer_type": "mecab",
    "subword_tokenizer_type": "wordpiece",
    "mecab_kwargs": {
        "mecab_dic": "unidic_lite"
    }
}
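
A minimal sketch of how the added file could be sanity-checked before a tokenizer is loaded from this directory. It uses only the Python standard library. The helper `check_tokenizer_config` and the allowed-value sets are hypothetical and are not part of this commit; the sets reflect values that `BertJapaneseTokenizer` in Hugging Face Transformers is generally documented to accept and should be treated as assumptions.

```python
import json

# The config exactly as added in this commit.
CONFIG_TEXT = """
{
    "tokenizer_class": "BertJapaneseTokenizer",
    "model_max_length": 512,
    "do_lower_case": false,
    "word_tokenizer_type": "mecab",
    "subword_tokenizer_type": "wordpiece",
    "mecab_kwargs": {
        "mecab_dic": "unidic_lite"
    }
}
"""

# Assumed allowed values (not stated in the commit itself).
WORD_TOKENIZERS = {"basic", "mecab", "sudachi", "jumanpp"}
SUBWORD_TOKENIZERS = {"wordpiece", "character", "sentencepiece"}
MECAB_DICS = {"ipadic", "unidic_lite", "unidic"}


def check_tokenizer_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    if cfg.get("tokenizer_class") != "BertJapaneseTokenizer":
        problems.append("tokenizer_class should be 'BertJapaneseTokenizer'")

    max_len = cfg.get("model_max_length")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(max_len, int) or isinstance(max_len, bool) or max_len <= 0:
        problems.append("model_max_length should be a positive integer")

    if not isinstance(cfg.get("do_lower_case"), bool):
        problems.append("do_lower_case should be true or false")

    if cfg.get("word_tokenizer_type") not in WORD_TOKENIZERS:
        problems.append("unexpected word_tokenizer_type")

    if cfg.get("subword_tokenizer_type") not in SUBWORD_TOKENIZERS:
        problems.append("unexpected subword_tokenizer_type")

    # mecab_kwargs only matters when MeCab is the word tokenizer.
    if cfg.get("word_tokenizer_type") == "mecab":
        mecab_kwargs = cfg.get("mecab_kwargs") or {}
        if mecab_kwargs.get("mecab_dic") not in MECAB_DICS:
            problems.append("unexpected mecab_kwargs.mecab_dic")

    return problems


config = json.loads(CONFIG_TEXT)
problems = check_tokenizer_config(config)
print(problems or "config looks consistent")
```

With this configuration, text would first be split into words by MeCab using the `unidic_lite` dictionary and then into WordPiece subwords, without lowercasing, with sequences capped at 512 tokens. Actually loading the tokenizer additionally requires the model's vocabulary file and the MeCab Python bindings, neither of which is part of this commit.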
