Hi,
I have fine-tuned a Llama 2 model, and I am using the merge-and-upload code below to merge the adapter into the base model:
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, LlamaTokenizer

model_name = "meta-llama/Llama-2-13b-hf"
adapters_name = 'Llama-13b_17_10'

print(f"Starting to load the model {model_name} into memory")

# Load the base model in bfloat16 on GPU 0
m = AutoModelForCausalLM.from_pretrained(
    model_name,
    # load_in_4bit=True,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
)

# Apply the LoRA adapter and merge its weights into the base model
m = PeftModel.from_pretrained(m, adapters_name)
m = m.merge_and_unload()

tok = LlamaTokenizer.from_pretrained(model_name)
tok.bos_token_id = 1
stop_token_ids = [0]
```
The merge succeeds, but when I launch text-generation-inference with the command below:

```shell
docker run --gpus all --shm-size 1g -p 8080:80 -v /datadrive:/data \
    ghcr.io/huggingface/text-generation-inference:1.0.3 \
    --model-id '/data/Azure_Backup/shrinath_merged_model_20_10' \
    --quantize bitsandbytes-nf4 --env --num-shard 1
```

it fails with this error:

```
ValueError: Non-consecutive added token '' found. Should have index 32000 but has index 0 in saved vocabulary.
```
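For context, this `ValueError` is raised when an entry in the saved tokenizer's `added_tokens.json` maps a token to an index that falls inside the base vocabulary (here index 0 rather than 32000, the Llama 2 base vocabulary size). A minimal sketch of that consistency check, where the helper name `check_added_tokens` is hypothetical and not part of any library:

```python
def check_added_tokens(added_tokens: dict, base_vocab_size: int) -> dict:
    """Return added-token entries whose index lies inside the base
    vocabulary; such entries trigger the 'Non-consecutive added token'
    ValueError when the tokenizer is reloaded."""
    return {t: i for t, i in added_tokens.items() if i < base_vocab_size}

# Mirrors the reported error: a token saved at index 0 instead of 32000.
bad = check_added_tokens({"": 0}, 32000)
```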
Can you help me?
Thanks
Try updating to the latest version of Transformers and repeating the merge. There was a recent PR that might fix this issue: huggingface/transformers#26570
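If updating Transformers does not help, one common workaround is to delete the offending entries from `added_tokens.json` in the merged model directory before launching text-generation-inference. A hedged sketch under that assumption; the function `drop_shadowed_added_tokens` and the temp-directory demo are illustrative, not part of any library:

```python
import json
import os
import tempfile

def drop_shadowed_added_tokens(model_dir: str, base_vocab_size: int) -> list:
    """Remove entries in added_tokens.json whose index lies inside the
    base vocabulary and rewrite the file; returns the removed tokens."""
    path = os.path.join(model_dir, "added_tokens.json")
    with open(path) as f:
        added = json.load(f)
    bad = [t for t, i in added.items() if i < base_vocab_size]
    for t in bad:
        del added[t]
    with open(path, "w") as f:
        json.dump(added, f)
    return bad

# Demo against a throwaway directory containing one good and one bad entry.
d = tempfile.mkdtemp()
with open(os.path.join(d, "added_tokens.json"), "w") as f:
    json.dump({"": 0, "<pad>": 32000}, f)
removed = drop_shadowed_added_tokens(d, 32000)
```

Back up the file before editing it, since dropping a token that the fine-tuned model actually uses would change tokenization.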