triflt/llama-mod (forked from meta-llama/llama)

Mixture-of-depths with Llama models
Mixture of Depths

A comparison of the "Mixture-of-Depths" (MoD) architecture with the default Llama architecture. Inspired by meta-llama and sramshetty/mixture-of-depths.
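The core idea of Mixture-of-Depths is that a per-token router decides which tokens a layer actually processes: only the top-k tokens (by router score) go through the transformer block, and the rest skip it via the residual path, cutting compute per layer. A minimal numpy sketch of that routing step, not this repo's code (function names, shapes, and the `capacity` parameter are illustrative):

```python
import numpy as np

def mod_layer(x, router_w, block, capacity=0.5):
    """Mixture-of-Depths routing sketch: only the top-k tokens
    (by router score) are processed by `block`; the remaining
    tokens pass through unchanged on the residual path.
    x: (seq_len, d_model) token activations."""
    seq_len, _ = x.shape
    k = max(1, int(capacity * seq_len))   # per-layer token budget
    scores = x @ router_w                 # (seq_len,) scalar router logits
    top_k = np.argsort(scores)[-k:]       # indices of the routed tokens
    out = x.copy()                        # skipped tokens are left as-is
    # Scale the block output by the router score so that, in the real
    # autograd setting, the routing decision receives gradient.
    out[top_k] = x[top_k] + scores[top_k, None] * block(x[top_k])
    return out

# Toy usage with a stand-in "block" (in a real model this would be
# attention + MLP).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
router_w = rng.normal(size=(4,))
block = lambda h: 0.1 * h
y = mod_layer(x, router_w, block, capacity=0.5)
print(y.shape)
```

With `capacity=0.5`, each MoD layer touches only half the tokens, which is where the compute savings over a dense Llama layer come from.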

Start

  1. See how to set up the environment for Llama 2 here

  2. Install the requirements:

pip install -r requirements.txt

  3. Try out the notebook training.ipynb

Result:

Comparison

The training loss is up to 40% lower for the MoD architecture than for the default Llama.
