# NanoBERT

A simple implementation of BERT, built to improve our understanding and for fun.

Co-authors: Mounes Zaval, Zeynep Akkoc

## Details

We implemented a minimalist BERT model with ALiBi (Attention with Linear Biases) in place of learned positional embeddings. We wrote a class that adds an MLM (masked language modeling) head on top of the encoder and trained the model on an Arabic corpus, treating each letter as a token.
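To make the architecture concrete, below is a minimal PyTorch sketch of the two pieces mentioned above: an ALiBi bias added to the attention scores instead of positional embeddings, and an MLM head mapping hidden states back to vocabulary logits. This is an illustrative sketch, not the repository's code; the names (`alibi_bias`, `MLMHead`) and the symmetric `|i - j|` distance, a common bidirectional variant of ALiBi, are our assumptions.

```python
import torch
import torch.nn as nn


def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Per-head ALiBi bias: -m_h * |i - j| for attending from position i to j.

    Slopes follow the geometric schedule from the ALiBi paper; the
    symmetric |i - j| distance is a common choice for bidirectional
    (BERT-style) models.
    """
    slopes = torch.tensor(
        [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]
    )
    positions = torch.arange(seq_len)
    distances = (positions[None, :] - positions[:, None]).abs()  # (seq_len, seq_len)
    return -slopes[:, None, None] * distances  # (num_heads, seq_len, seq_len)


class MLMHead(nn.Module):
    """Maps final hidden states to vocabulary logits for masked-token prediction."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.act = nn.GELU()
        self.norm = nn.LayerNorm(hidden_size)
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.norm(self.act(self.dense(hidden_states))))
```

The bias is broadcast-added to the raw attention scores before the softmax; because it depends only on relative distance, the model can extrapolate to sequence lengths longer than those seen during training.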

## Training over Epochs

*Figure: train and test loss and score over epochs.*

The graph above illustrates the train/test loss and scores of our NanoBERT model over 50 epochs. The decreasing trend in loss indicates the model's improving ability to predict masked tokens in the Quran corpus.
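For reference, the masking step in character-level MLM training typically follows the standard BERT recipe: select ~15% of positions as prediction targets, then replace 80% of those with the mask token, 10% with a random token, and leave 10% unchanged. A minimal sketch, assuming hypothetical arguments (`mask_token_id`, `vocab_size`) and PyTorch's `-100` cross-entropy ignore index:

```python
import torch


def mask_tokens(input_ids: torch.Tensor, mask_token_id: int,
                vocab_size: int, mlm_prob: float = 0.15):
    """BERT-style masking; modifies input_ids in place and returns labels."""
    labels = input_ids.clone()
    # Choose ~15% of positions as MLM targets.
    targets = torch.bernoulli(torch.full(input_ids.shape, mlm_prob)).bool()
    labels[~targets] = -100  # non-targets are ignored by the loss

    # 80% of targets -> mask token.
    replaced = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & targets
    input_ids[replaced] = mask_token_id

    # Half of the rest (10% of targets) -> random token; the final 10% stay unchanged.
    randomized = torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool() & targets & ~replaced
    input_ids[randomized] = torch.randint(vocab_size, input_ids.shape)[randomized]
    return input_ids, labels
```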

## Usage

To train the model with the provided configurations, use the following command:

```bash
python train.py \
    --model_config_path configs/model_config.json \
    --tokenizer_config_path configs/tokenizer_config.json \
    --train_config_path configs/train_config.json \
    --data_path data/quran.jsonl
```

This command points the script at the model, tokenizer, and training configurations, as well as the data to train on; the model is then trained according to those settings.
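For illustration, a model config for a character-level model like this might look like the JSON below. The actual schema lives in `configs/model_config.json`; these key names and values are hypothetical, not the repository's format:

```json
{
  "hidden_size": 256,
  "num_hidden_layers": 4,
  "num_attention_heads": 4,
  "intermediate_size": 1024,
  "vocab_size": 64,
  "max_seq_len": 512,
  "position_encoding": "alibi"
}
```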