LLM Toaster

Overview

LLM Toaster is a library for training, fine-tuning, and running inference on Transformer-based language models. It streamlines dataset loading, model configuration, and execution.

Installation

To install the LLM Toaster library, navigate to the root directory and run the following command:

pip install -e .

This installs the library in editable mode, together with all necessary packages and dependencies.

Configuration

All configuration files are in the llm_toaster/config directory. Review and adjust them to match your setup and requirements.

Training

Step 1) Data download and tokenization

To download and tokenize the fineweb-edu dataset from Hugging Face, navigate to the dataspace/src directory and run:

python download_tokenize_hf.py

This command will take a while: it downloads and tokenizes about 27GB of data (after tokenization, the size is reduced to about 10GB).

Step 2) Training

To train the model, navigate to the llm_toaster directory and run:

python trainer.py

Step 3) Continue training

You can stop training at any point and resume later. To continue training from the last checkpoint, run:

python trainer.py -ct

NOTE: Progress is saved only at checkpoints, which are taken automatically whenever the loss decreases.
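
The save-on-improvement behaviour described in the note can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in trainer.py: the class name BestLossCheckpointer, the save_fn callback, and the loss values are all hypothetical, and a real trainer would serialize model, optimizer, and scaler state rather than a plain dictionary.

```python
import math
from typing import Callable, Dict


class BestLossCheckpointer:
    """Calls save_fn only when the loss drops below the best value seen so far."""

    def __init__(self, save_fn: Callable[[Dict], None]) -> None:
        self.save_fn = save_fn          # hypothetical hook that writes a checkpoint
        self.best_loss = math.inf

    def step(self, loss: float, state: Dict) -> bool:
        """Return True if a checkpoint was written for this step."""
        if loss < self.best_loss:
            self.best_loss = loss
            self.save_fn(state)
            return True
        return False


saved = []  # stand-in for checkpoint files on disk
ckpt = BestLossCheckpointer(save_fn=saved.append)

# Made-up loss curve: the loss rises at step 2, so nothing is saved there.
for step, loss in enumerate([3.2, 2.9, 3.0, 2.7]):
    ckpt.step(loss, {"step": step, "loss": loss})

print([s["step"] for s in saved])
```

In this sketch, stopping right after step 2 and resuming would pick up from the state saved at step 1, because step 2 did not improve the loss and therefore produced no checkpoint.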

Inference

Option 1) Train your model

If you have just trained your model, run the following script to extract the model from the checkpoint (a checkpoint consists of a model, an optimizer, and a scaler):

python extract_inference_model.py

This extracts the model from a checkpoint saved in model/checkpoints and saves it under model/babyGPT.
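
Conceptually, extraction keeps only the model part of the checkpoint and drops the optimizer and scaler state, which are needed only to resume training. The sketch below illustrates that idea and is not the actual extract_inference_model.py: the dictionary keys, the file names checkpoint.pkl and model.pkl, and the use of pickle as a stand-in for the project's real serialization format are all assumptions.

```python
import pickle
import tempfile
from pathlib import Path


def extract_model(checkpoint_path: Path, output_path: Path) -> None:
    """Copy only the model entry of a checkpoint into a smaller inference file."""
    with checkpoint_path.open("rb") as f:
        checkpoint = pickle.load(f)
    # Optimizer and scaler state are left behind; inference needs the model only.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("wb") as f:
        pickle.dump(checkpoint["model"], f)


# Build a toy checkpoint in a temporary directory (all values are made up).
root = Path(tempfile.mkdtemp())
ckpt_file = root / "model" / "checkpoints" / "checkpoint.pkl"  # hypothetical name
ckpt_file.parent.mkdir(parents=True)
toy_checkpoint = {
    "model": {"w": [0.1, 0.2]},
    "optimizer": {"lr": 3e-4},
    "scaler": {"scale": 65536.0},
}
with ckpt_file.open("wb") as f:
    pickle.dump(toy_checkpoint, f)

model_file = root / "model" / "babyGPT" / "model.pkl"  # hypothetical name
extract_model(ckpt_file, model_file)

with model_file.open("rb") as f:
    extracted = pickle.load(f)
print(extracted)
```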

Option 2) Download a pretrained model

You can download a pretrained babyGPT model from HERE and save it under the llm_toaster/model/babyGPT directory. To prompt the model, use the following command:

python inference.py -p="Your prompt here"

You can continue interacting with the model by providing new prompts, or type exit to quit.
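
The prompt-then-continue flow can be pictured with a small loop like the one below. It is only a sketch of the interaction pattern, not the contents of inference.py: chat_loop and echo_model are hypothetical names, echo_model is a placeholder for the real model call, and a scripted list of prompts stands in for text typed at the terminal.

```python
import argparse
from typing import Callable, Iterable, Iterator


def chat_loop(first_prompt: str,
              next_prompts: Iterable[str],
              generate: Callable[[str], str]) -> Iterator[str]:
    """Answer the first prompt, then keep answering until 'exit' is entered."""
    yield generate(first_prompt)
    for prompt in next_prompts:
        if prompt.strip().lower() == "exit":
            break
        yield generate(prompt)


def echo_model(prompt: str) -> str:
    """Placeholder for the real model call (assumption)."""
    return f"(model reply to: {prompt})"


parser = argparse.ArgumentParser()
parser.add_argument("-p", dest="prompt", required=True)
# An explicit argument list stands in for the real command line.
args = parser.parse_args(["-p", "Hello"])

# Scripted follow-up prompts; an interactive script would read these with input().
scripted = ["Tell me more", "exit", "never reached"]
replies = list(chat_loop(args.prompt, scripted, echo_model))
print(replies)
```

Because chat_loop is a generator, an interactive version can print each reply before asking for the next prompt.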

Troubleshooting

If you encounter any issues, please check the following:

Ensure all dependencies are installed.
Verify the configurations in config/config.yaml.
Make sure the dataset is downloaded and tokenized correctly.
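
Part of this checklist can be automated with a small preflight script. The sketch below only checks that a few paths mentioned in this README exist. The helper name missing_paths is hypothetical, and the paths are assumed to be relative to the llm_toaster directory. The tokenized dataset location is not listed because this README does not state it.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_paths(root: Path, expected: List[str]) -> List[str]:
    """Return the entries of expected that do not exist under root."""
    return [rel for rel in expected if not (root / rel).exists()]


# Paths taken from this README; assumed to be relative to llm_toaster.
EXPECTED = ["config/config.yaml", "model/checkpoints", "model/babyGPT"]

# Demonstration on a temporary directory that contains only the config file.
root = Path(tempfile.mkdtemp())
(root / "config").mkdir()
(root / "config" / "config.yaml").write_text("# placeholder\n")

missing = missing_paths(root, EXPECTED)
print(missing)
```

To check a real checkout, pass the path of the llm_toaster directory as root instead of the temporary directory.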

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your improvements or bug fixes.

License

LLM Toaster is released under the MIT License. See LICENSE for more details.

amjadmajid/llm_toaster