Very Small Language Models, inspired by Andrej Karpathy's nanoGPT
This is a project by the West Coast Machine Learning group. The goal is a small language model that can be trained in a few hours. We would like to experiment with novel ideas as well as reproduce experiments that others have published.
Initially the task list will live in a Google Doc to make it easy for folks to edit; we will eventually move it into this repository: https://docs.google.com/document/d/1zCrQ8nPTi2SWVoJ5N2FFLrme8E9I49I_nl9hFt1SykY/edit?usp=sharing
- Clone the nanoGPT repository and get the baseline running.
- Use TinyStories as the initial dataset. Write a prepare.py that downloads and tokenizes it (see the sketch after this list).
- Find a model size that can reasonably be trained in a couple of hours (a starting-point config is sketched below).
- Develop a method to evaluate the model against a standard benchmark.
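
A minimal sketch of what the TinyStories prepare.py could look like, following nanoGPT's data-preparation pattern (GPT-2 BPE via tiktoken, token ids written to train.bin and val.bin as uint16 memmaps). The Hub dataset id `roneneldan/TinyStories`, its `text` column name, and the `num_proc` value are assumptions to verify, not confirmed details; the script is assumed to live under data/tinystories/ so train.py can find the .bin files.

```python
# data/tinystories/prepare.py -- sketch, not tested end to end
import os
import numpy as np
import tiktoken
from datasets import load_dataset

num_proc = 8  # tokenization workers; tune for your machine
enc = tiktoken.get_encoding("gpt2")  # same GPT-2 BPE nanoGPT uses

def process(example):
    ids = enc.encode_ordinary(example["text"])  # encode, ignoring special tokens
    ids.append(enc.eot_token)                   # delimit each story
    return {"ids": ids, "len": len(ids)}

if __name__ == "__main__":
    # assumed Hub id; adjust if the dataset is mirrored elsewhere
    dataset = load_dataset("roneneldan/TinyStories")
    tokenized = dataset.map(process, remove_columns=["text"],
                            desc="tokenizing", num_proc=num_proc)

    # concatenate each split's token ids into one .bin memmap, nanoGPT-style;
    # GPT-2's vocab (50257) fits in uint16
    for split, dset in tokenized.items():
        split_name = "val" if split.startswith("val") else "train"
        arr_len = np.sum(dset["len"], dtype=np.uint64)
        filename = os.path.join(os.path.dirname(__file__), f"{split_name}.bin")
        arr = np.memmap(filename, dtype=np.uint16, mode="w+", shape=(arr_len,))
        idx = 0
        for example in dset:
            arr[idx : idx + example["len"]] = example["ids"]
            idx += example["len"]
        arr.flush()
        print(f"{split_name}: wrote {arr_len} tokens to {filename}")
```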
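
For the model-size question, a hypothetical nanoGPT config override in the style of config/train_shakespeare_char.py could serve as a starting point. All of the sizes and iteration counts below are guesses to be tuned against actual wall-clock time on the group's hardware, not measured recommendations.

```python
# config/train_tinystories.py -- hypothetical starting config (guessed sizes)
out_dir = 'out-tinystories'
dataset = 'tinystories'   # expects data/tinystories/{train,val}.bin from prepare.py
eval_interval = 500
eval_iters = 200
always_save_checkpoint = False

# a small GPT of roughly 30M parameters (most of that the GPT-2 embedding table);
# assumed small enough for a couple of hours on a single GPU, to be verified
n_layer = 6
n_head = 6
n_embd = 384
block_size = 512
dropout = 0.0

batch_size = 32
gradient_accumulation_steps = 4
learning_rate = 1e-3
max_iters = 20000
lr_decay_iters = 20000
warmup_iters = 1000
min_lr = 1e-4
```

Once the .bin files exist, this would be run the usual nanoGPT way: `python train.py config/train_tinystories.py`.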