This README provides a simple demonstration and a quick summary tutorial on how to fine-tune the LLaMA-7B model using QLoRA (Quantized Low-Rank Adaptation) with the help of the TRL (Transformer Reinforcement Learning) library. We will also leverage the 4-bit quantization provided by bitsandbytes, which is integrated into the Hugging Face Transformers library.
Before you begin, make sure you have the following prerequisites installed and resources at hand:
- TRL: the Transformer Reinforcement Learning library, available here.
- Hugging Face 4-bit quantization lesson: hugging_face
- PEFT (Parameter-Efficient Fine-Tuning): peft
- DeepLearning.AI course, Building with Instruction-Tuned LLMs: A Step-by-Step Guide: deepAI
The model is fine-tuned on two chemistry datasets available on Hugging Face. The process of combining the datasets can be found in the provided notebook. Alternatively, if you are not interested in generating the dataset yourself, you can download the combined dataset directly from my repository.
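If you want to reproduce the combining step outside the notebook, it amounts to loading both datasets with the datasets library and concatenating them. The sketch below uses placeholder dataset names and an assumed output file name, not the actual repositories:

from datasets import load_dataset, concatenate_datasets

# Placeholder names: substitute the two chemistry datasets you actually want to merge.
ds_a = load_dataset("example/chem_dataset_a", split="train")
ds_b = load_dataset("example/chem_dataset_b", split="train")

# Both datasets need matching column names (e.g. a single "text" column) before concatenation.
combined = concatenate_datasets([ds_a, ds_b]).shuffle(seed=42)
combined.to_json("combined_chem_dataset.json")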
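With the combined dataset in hand, the overall QLoRA training setup with TRL looks roughly like the sketch below. This is a minimal illustration rather than the exact notebook code: the base-model checkpoint, LoRA hyperparameters, dataset file name, and SFTTrainer argument names are assumptions and may vary between library versions.

import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, LlamaTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

# Assumed base checkpoint; use whichever LLaMA-7B weights you have access to.
base_model = "huggyllama/llama-7b"

# 4-bit NF4 quantization so the 7B model fits on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map={"": 0},
)
tokenizer = LlamaTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# Example LoRA hyperparameters; tune these for your own run.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")

# Assumed file name for the combined chemistry dataset with a "text" column.
dataset = load_dataset("json", data_files="combined_chem_dataset.json", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora_config,
    dataset_text_field="text",
    max_seq_length=512,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="qlora_chem_outputs",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=3,
        logging_steps=10,
    ),
)
trainer.train()
trainer.model.save_pretrained("qlora_chem_adapter")  # saves only the LoRA adapter weights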
The fine-tuned model has been uploaded to Hugging Face at hs_peer_support_chem. Please note that you will need a GPU for both training and inference, as running on CPU is too slow to be practical. You can use the model for inference with the following code:
# Inference can be run separately from the training process.
import torch
from IPython.display import display, Markdown
from peft import LoraConfig, PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer

# Load the LoRA adapter config to find out which base model it was trained on.
lora_config = LoraConfig.from_pretrained("supramantest/hs_peer_support_chem")

# 4-bit NF4 quantization config, matching the settings used during training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = LlamaTokenizer.from_pretrained("supramantest/hs_peer_support_chem")

# Load the quantized base model on GPU 0, then attach the trained adapter weights.
model = AutoModelForCausalLM.from_pretrained(
    lora_config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map={"": 0},
)
model = PeftModel.from_pretrained(model, "supramantest/hs_peer_support_chem")

def make_inference(prompt, context=None):
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to("cuda:0")
    outputs = model.generate(**inputs, max_new_tokens=100)
    display(Markdown(tokenizer.decode(outputs[0], skip_special_tokens=True)))
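A quick call to the helper then looks like this; the prompt is only an illustration, so adapt it to whatever prompt template was used during fine-tuning:

make_inference("Explain the difference between an acid and a base.")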