Finetune Falcon 7B LLM on a Single RTX 4090 on WSL

I have a single RTX 4090 installed on a Windows host. I use Ubuntu 22.04 as a guest running in WSL2 to finetune the 7B model. This is not a trivial task, because bitsandbytes has to be recompiled.

QLoRA on WSL

I use QLoRA (https://arxiv.org/abs/2305.14314) to fit the LLM onto a single GPU. bitsandbytes has to be compiled from source to make it work on WSL.
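A rough back-of-the-envelope estimate shows why 4-bit quantization matters on a single card. The sketch below counts weight storage only and assumes a nominal 7 billion parameters and the 24 GB of VRAM on an RTX 4090. Activations, LoRA adapters, optimizer state, and quantization constants add overhead that is not modeled here.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7e9   # assumption: nominal size of a "7B" model
VRAM_GB = 24       # RTX 4090

for label, bits in [("fp16/bf16", 16), ("int8", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(NUM_PARAMS, bits)
    print(f"{label:>10}: {gb:5.1f} GB of weights vs {VRAM_GB} GB VRAM")
```

Under these assumptions the 16-bit weights alone take more than half of the card, while the 4-bit weights leave most of the VRAM free for activations and the adapter's training state.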

The WSL guest needs to use the host's GPU. The GPU driver is installed on the host, while the CUDA Development Kit has to be installed on the guest. The default CUDA Development Kit bundles a GPU driver, and installing that default package does not work.
Installing it with the commands below works:

wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run
sudo sh cuda_12.1.1_530.30.02_linux.run

Update the system path and variables in ~/.bashrc. Note that LD_LIBRARY_PATH needs to contain two library directories: one holds the WSL stub libraries that refer to the host driver, the other holds the libraries installed locally with the toolkit.

export PATH="/usr/local/cuda-12.1/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-12.1/lib64:/usr/lib/wsl/lib"
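After reloading the shell, a small check can confirm that both directories actually ended up in LD_LIBRARY_PATH. This is only a sketch: the two directory names are copied from the export lines above, and the helper `missing_library_dirs` is a hypothetical name introduced for illustration.

```python
import os

# Directories taken from the export lines above.
REQUIRED_DIRS = ["/usr/local/cuda-12.1/lib64", "/usr/lib/wsl/lib"]

def missing_library_dirs(ld_library_path: str, required=REQUIRED_DIRS) -> list:
    """Return the required directories absent from a colon-separated path string."""
    entries = {p.rstrip("/") for p in ld_library_path.split(":") if p}
    return [d for d in required if d.rstrip("/") not in entries]

missing = missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
if missing:
    print(f"LD_LIBRARY_PATH is missing: {missing}")
else:
    print("LD_LIBRARY_PATH contains both the toolkit and the WSL stub directory")
```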

Double-check that NVCC is installed and fully working.
After that, bitsandbytes can be compiled with GPU support.
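One way to do the NVCC check from Python is to locate the binary on PATH and read the release number it reports. The sketch below assumes that `nvcc --version` prints a line containing `release <major>.<minor>`; `parse_nvcc_release` is a hypothetical helper, not part of any library.

```python
import re
import shutil
import subprocess

def parse_nvcc_release(version_output: str):
    """Extract the CUDA release (e.g. '12.1') from `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", version_output)
    return match.group(1) if match else None

nvcc_path = shutil.which("nvcc")
if nvcc_path is None:
    print("nvcc not found on PATH; check the PATH export in ~/.bashrc")
else:
    result = subprocess.run([nvcc_path, "--version"], capture_output=True, text=True)
    print(f"nvcc at {nvcc_path}, CUDA release {parse_nvcc_release(result.stdout)}")
```

With the setup described here, the reported release should match the installed toolkit (12.1) before the bitsandbytes build is attempted.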

Training

Only ~0.1% of all parameters are trained, and the performance was amazing. Thanks again to the authors of LoRA and QLoRA.
The training script is in this notebook in the repo: ./fintune-falcon/qlora.ipynb
I was able to push the batch size to 8, and I think 16 may also work. Of the loss curves, the smoother one is from batch_size=8.
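The small trainable share quoted in this section can be sanity-checked with simple arithmetic. For a frozen weight of shape d_out x d_in, LoRA trains two low-rank matrices with rank * (d_in + d_out) parameters in total. The numbers below are assumptions for illustration and are not read from the notebook: the model dimensions are what I understand the Falcon-7B configuration to be, the rank is hypothetical, and only the fused attention projection is assumed to be adapted.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) beside a frozen d_out x d_in weight."""
    return rank * (d_in + d_out)

# Assumptions for illustration only; check the model config and the notebook.
HIDDEN = 4544        # assumed hidden size
QKV_OUT = 4672       # assumed width of the fused query/key/value projection
NUM_LAYERS = 32      # assumed number of decoder layers
RANK = 16            # hypothetical LoRA rank
TOTAL_PARAMS = 7e9   # nominal "7B"

trainable = NUM_LAYERS * lora_param_count(HIDDEN, QKV_OUT, RANK)
print(f"trainable LoRA params: {trainable:,}")
print(f"share of all params:   {100 * trainable / TOTAL_PARAMS:.3f}%")
```

The exact percentage depends on the rank, on which layers are adapted, and on how the total parameter count is measured, but under these assumptions it lands on the order of 0.1%.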

Appendix

Falcon 40B was at the top of the Open LLM Leaderboard at the time I attempted the finetuning, and its license permits free commercial use.
If you follow this CUDA Toolkit installation (recommended by Nvidia), it installs the toolkit but not the NVCC compiler driver, so it doesn't work: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=WSL-Ubuntu&target_version=2.0&target_type=deb_local

Credits

yueming-zhang/QLoRA-Finetune-Falcon7B
