This work aims to distill knowledge from Google's BERT model into compact convolutional models. (Not done yet)
Requirements: Python > 3.6, fire, tqdm, tensorboardX, tensorflow (for loading the checkpoint file)
Download the BERT-Base, Uncased checkpoint and the GLUE Benchmark datasets before fine-tuning.
- Make sure that "total_steps" in train.json is greater than n_epochs*(num_data/batch_size).
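The bound on "total_steps" can be checked with a few lines of Python before editing train.json. This is a minimal sketch: the helper name `min_total_steps` is hypothetical and not part of this repository, and the example values (3668 training pairs, batch size 32, 3 epochs) are illustrative, so substitute the numbers for your own dataset and config.

```python
import math


def min_total_steps(num_data: int, batch_size: int, n_epochs: int) -> int:
    """Return a value strictly greater than n_epochs * (num_data / batch_size).

    Hypothetical helper, not part of the repository. It rounds the number of
    batches per epoch up (the last batch may be partial) and adds one so the
    result always exceeds the bound.
    """
    steps_per_epoch = math.ceil(num_data / batch_size)
    return n_epochs * steps_per_epoch + 1


if __name__ == "__main__":
    # Illustrative values only; replace with your dataset size and settings.
    print(min_total_steps(num_data=3668, batch_size=32, n_epochs=3))
```

Any "total_steps" at or above the returned value satisfies the condition.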
Edit the config JSON files (config/finetune/mrpc/train.json and eval.json) as needed, then run the following commands to fine-tune and evaluate.
python finetune.py config/finetune/mrpc/train.json
python finetune.py config/finetune/mrpc/eval.json
See Transformer to CNN. Edit the config JSON files (config/blendcnn/mrpc/train.json and eval.json) as needed, then run the following commands to train and evaluate.
python classify.py config/blendcnn/mrpc/train.json
python classify.py config/blendcnn/mrpc/eval.json
Edit the config JSON files (config/distill/mrpc/train.json and eval.json) as needed, then run the following commands to train and evaluate with distillation.
python distill.py config/distill/mrpc/train.json
python distill.py config/distill/mrpc/eval.json
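For readers new to knowledge distillation, the following is a generic sketch of a soft-target loss, in which the student is trained to match the teacher's temperature-softened output distribution. It is an assumption for illustration only: the function names and the temperature value are hypothetical, and the loss actually used by distill.py may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Generic soft-target loss, not necessarily what distill.py computes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


if __name__ == "__main__":
    teacher = [2.0, 0.0]
    # A student that agrees with the teacher gets a lower loss
    # than one that prefers the other class.
    print(distillation_loss([2.0, 0.0], teacher))
    print(distillation_loss([0.0, 2.0], teacher))
```

A higher temperature flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to the non-predicted classes.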