Below are some tips to port LLMs available on Hugging Face to MLX.
Before starting, check out the general contribution guidelines.
Next, from this directory, do an editable install:
pip install -e .
Then check if the model has weights in the safetensors format. If not, follow the instructions to convert it.
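As a quick local check, assuming the model repository has already been downloaded to a directory on disk, a minimal sketch like the following lists any safetensors files it contains. The helper name find_safetensors is hypothetical and not part of mlx_lm.

```python
from pathlib import Path


def find_safetensors(model_dir):
    """Return the sorted names of the *.safetensors files in model_dir.

    Hypothetical helper for illustration; model_dir is assumed to be a
    local copy of the Hugging Face repo.
    """
    return sorted(p.name for p in Path(model_dir).glob("*.safetensors"))
```

An empty list suggests the weights are stored in another format and need converting first.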
After that, add the model file to the mlx_lm/models directory. You can see other examples there. We recommend starting from a model that is similar to the model you are porting.
Make sure the name of the new model file is the same as the model_type in the config.json, for example starcoder2.
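The naming rule can be checked with a small sketch, assuming a local copy of the model's config.json. The helper name expected_model_file is hypothetical and only illustrates the rule.

```python
import json
from pathlib import Path


def expected_model_file(config_path):
    """Return the file name the new model file should use.

    Hypothetical helper: reads model_type from a local config.json and
    appends the .py extension.
    """
    config = json.loads(Path(config_path).read_text())
    return f"{config['model_type']}.py"
```

For a config.json whose model_type is starcoder2, this returns starcoder2.py, which is the file to create under mlx_lm/models.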
To determine the model layer names, we suggest one of the following:
- Refer to the Transformers implementation, if you are familiar with the codebase.
- Load the model weights and check the weight names, which will tell you about the model structure.
- Look at the names of the weights by inspecting model.safetensors.index.json in the Hugging Face repo.
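The last option can be sketched as follows, assuming the repo ships a sharded checkpoint and that model.safetensors.index.json has been downloaded locally. The index file stores a weight_map from weight names to shard files; the helper names below are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def weight_names(index_path):
    """Return the sorted weight names listed in a safetensors index file."""
    index = json.loads(Path(index_path).read_text())
    # weight_map maps each weight name to the shard file that stores it.
    return sorted(index["weight_map"])


def top_level_modules(names, depth=2):
    """Count weights by their first `depth` dot-separated name components."""
    return Counter(".".join(name.split(".")[:depth]) for name in names)
```

Printing the result of weight_names shows the full layer names, and top_level_modules gives a rough outline of the module hierarchy that the new model file needs to mirror.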
To add LoRA support, edit mlx_lm/tuner/utils.py.
Finally, add a test for the new model type to the model tests.
From the llms/ directory, you can run the tests with:
python -m unittest discover tests/