BERT_Chinese_MRC

A modification of the official BERT source code, adapted to the Chinese QA task DRCD.

Inspired by BERT-for-Chinese-Question-Answering

Changes

Starting from read_squad_example.py, the tokenization is modified for Chinese, and examples whose answer_start cannot be matched are removed.
ToDo
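
The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the code in run_drcd.py: it assumes the DRCD JSON follows the SQuAD-style layout (`data` → `paragraphs` → `qas` → `answers`), and the helper names `answer_matches` and `filter_examples` are hypothetical.

```python
def answer_matches(context, answer_text, answer_start):
    """Return True if answer_text occurs in context exactly at answer_start."""
    if answer_start < 0:
        return False
    end = answer_start + len(answer_text)
    return context[answer_start:end] == answer_text


def filter_examples(dataset):
    """Keep QA pairs whose first answer aligns with its answer_start.

    Assumes a SQuAD-style dict; returns (kept, dropped_count).
    """
    kept, dropped = [], 0
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answer = qa["answers"][0]
                if answer_matches(context, answer["text"], answer["answer_start"]):
                    kept.append((qa["id"], context, qa["question"], answer))
                else:
                    # The character offset does not point at the answer text.
                    dropped += 1
    return kept, dropped
```

Python strings are indexed by Unicode code point, so for Chinese text the offsets here are character offsets, which is the unit `answer_start` is assumed to use.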

Usage

Train&Prediction

python run_drcd.py \
  --vocab_file=$BERT_MODEL_DIR/vocab.txt \
  --bert_config_file=$BERT_MODEL_DIR/bert_config.json \
  --init_checkpoint=$BERT_MODEL_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=$DRCD_DIR/DRCD_training.json \
  --do_predict=True \
  --predict_file=$DRCD_DIR/DRCD_test.json \
  --train_batch_size=6 \
  --learning_rate=3e-5 \
  --num_train_epochs=3.0 \
  --do_lower_case=True \
  --max_seq_length=512 \
  --doc_stride=128 \
  --output_dir=$OUTPUT_DIR/

Evaluate

python eval.py $DRCD/DRCD_testing.json $OUTPUT_DIR/prediction.json
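
For readers unfamiliar with the two metrics reported below, here is a rough sketch of how EM and F1 are commonly computed for Chinese answers at the character level. It is an assumption about the general technique, not a copy of the repository's evaluation script: the function names are hypothetical, and any normalization the real script applies (punctuation removal, handling of multiple reference answers) is left out.

```python
from collections import Counter


def exact_match(prediction, ground_truth):
    """1.0 if the stripped strings are identical, else 0.0."""
    return float(prediction.strip() == ground_truth.strip())


def char_f1(prediction, ground_truth):
    """Character-level F1 between a predicted and a reference answer."""
    pred_chars = list(prediction.strip())
    gold_chars = list(ground_truth.strip())
    # Multiset intersection counts each shared character at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_chars) & Counter(gold_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, a dataset-level score would be the mean of the per-question values multiplied by 100.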

Results

EM: 85.65702834239909
F1: 91.78050628879733

