This accompanies the blog post: sidsite.com/posts/bert-from-scratch
See the pretraining code in pretraining_BERT.ipynb.
It's possible to fine-tune a model after pretraining, using the run_finetuning.sh script. Note that the fine-tuning parameters are roughly based on Cramming, but I used different training parameters for two of the tasks. (TODO: add these.)
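As a rough sketch, the workflow might look like the following; the exact arguments (if any) that run_finetuning.sh expects are defined in the script itself, so treat this invocation as illustrative rather than the actual interface.

```sh
# Illustrative only: run the pretraining notebook first, then fine-tune.
# run_finetuning.sh's real interface may differ; an argument-free call is assumed here.
jupyter nbconvert --to notebook --execute pretraining_BERT.ipynb
bash run_finetuning.sh
```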