- set up LLaVA, with the extra libraries for training
On Unix:
a. Install Packages
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip # enable PEP 660 support
pip install -e .
b. Install the additional packages needed for training
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
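A quick sanity check can save debugging later. This sketch (not part of the original steps; the module names are my assumption of what the training extras pull in) reports which key dependencies are importable without failing if one is missing:

```shell
# Hedged sanity check: report whether the key training dependencies
# are importable. Missing modules are reported, not fatal.
python - <<'EOF' | tee env_check.txt
import importlib.util
for mod in ("torch", "deepspeed", "flash_attn", "peft"):
    found = importlib.util.find_spec(mod) is not None
    print(f"{mod}: {'OK' if found else 'MISSING'}")
EOF
```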
- also see README
- install git lfs:
chmod +x *.sh
./install-lfs__sr.sh
- get/create the data (modify the script to use your own data!)
pip install datasets
python prep_data__OK-VQA__sr.py
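The prep script's output isn't shown here; for context, LLaVA's trainer expects a JSON list of records with `image` and `conversations` fields (field names from the LLaVA repo). Here is a minimal hand-written sample so you can recognize the shape - the record contents and the filename `sample_data.json` are made up for illustration:

```shell
# Write a one-record sample in the LLaVA training-data format
# (field names from the LLaVA repo; the content itself is invented).
cat > sample_data.json <<'EOF'
[
  {
    "id": "okvqa-000001",
    "image": "okvqa/images/000001.jpg",
    "conversations": [
      {"from": "human", "value": "<image>\nWhat sport is being played?"},
      {"from": "gpt", "value": "Baseball."}
    ]
  }
]
EOF
# Confirm it parses as valid JSON
python -c "import json; d = json.load(open('sample_data.json')); print(d[0]['id'])"
```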
- download the base model:
./download_llava_weights__sr.sh
- review the script below, including its comments section.
  You need to check that the values given as xxx match your hardware.
cat ./train_qlora__wandb.sh
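As a sketch of what to look for when reviewing: these are the kinds of memory-sensitive flags such a script passes to LLaVA's trainer (flag names exist in the LLaVA codebase; the example values are placeholders to adapt, not recommendations):

```shell
# Illustrative excerpt only -- the real values live in train_qlora__wandb.sh.
#   --bits 4                          # QLoRA: 4-bit quantized base weights
#   --per_device_train_batch_size 4   # lower this if you hit CUDA OOM
#   --gradient_accumulation_steps 8   # raise to keep the effective batch size
#   --gradient_checkpointing True     # trades extra compute for less memory
```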
- execute the script:
./train_qlora__wandb.sh
- monitor the progress; if quality is already high enough, you can stop the training early
- if you run out of GPU memory, adjust the script to offload more work to the CPU
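One common way to offload, assuming the script launches training via DeepSpeed (LLaVA's training scripts take a --deepspeed config file): a ZeRO stage-3 config that moves optimizer and parameter state to the CPU. The filename below is my assumption; point the script's --deepspeed flag at whatever file you create:

```shell
# Sketch of a DeepSpeed ZeRO-3 config with CPU offload enabled.
# The "auto" values let the HF Trainer fill in its own batch settings.
cat > zero3_offload.json <<'EOF'
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {"device": "cpu"},
    "offload_param": {"device": "cpu"}
  },
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
EOF
# Confirm the config is valid JSON before pointing the script at it
python -c "import json; json.load(open('zero3_offload.json')); print('valid JSON')"
```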
- To test inference with the QLoRA layer, run this script:
./infer_example.sh
To infer with a given prompt and image:
./infer_qlora_v1.5__wandb.sh <path to image> "my prompt"
To infer WITHOUT the LoRA layer (to see the behaviour BEFORE fine-tuning):
./infer_example__no_lora.sh