Hi Authors,

Thanks so much for releasing this great source code. I fine-tuned LLaMA-2-7B with QLoRA successfully on a node with 8 NVIDIA A100 GPUs and saved the merged model without problems. However, when loading the merged model with vLLM, I got the error below:
```
Loading safetensors checkpoint shards:   0% Completed | 0/3 [00:00<?, ?it/s]
model = vllm.LLM(
[rank0]:         ^^^^^^^^^
[rank0]:   File "/miniconda3/envs/open-instruct/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 177, in __init__
[rank0]:     self.llm_engine = LLMEngine.from_engine_args(
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/miniconda3/envs/open-instruct/lib/python3.11/site-packages/vllm/engine/llm_engine.py", line 573, in from_engine_args
[rank0]:     engine = cls(
[rank0]:              ^^^^
[rank0]: KeyError: 'layers.11.mlp.down_proj.weight'
```
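From the `KeyError`, vLLM seems to look up a parameter name that is not stored in the checkpoint, which suggests the saved tensor names differ from what the loader expects (for example, whether they carry the `model.` prefix, as in `model.layers.11.mlp.down_proj.weight`). One quick way to check is to print the tensor names stored in the merged shards. A minimal sketch, assuming the `safetensors` package; the shard filename below is a placeholder:

```python
# Hedged diagnostic sketch: list the parameter names stored in one shard of
# the merged checkpoint. The filename is a placeholder; the actual shard
# names are listed in model.safetensors.index.json in the merged model dir.
from safetensors import safe_open

with safe_open("merged-model/model-00001-of-00003.safetensors",
               framework="pt", device="cpu") as f:
    # The first few names are usually enough to spot a prefix mismatch.
    for name in sorted(f.keys())[:20]:
        print(name)
```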
It seems the saved model structure differs from what vLLM expects when loading. Could you please give me some suggestions on how to fix it? Many thanks for your help!
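For context, my merge-and-save step followed the usual PEFT pattern, roughly like the sketch below (a minimal sketch; the base-model and adapter paths are placeholders):

```python
# Minimal sketch of the QLoRA merge-and-save step (standard PEFT pattern;
# paths are placeholders, not the actual ones used).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "qlora-adapter")
merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights

# Save in safetensors format so vLLM can load the merged checkpoint directly.
merged.save_pretrained("merged-model", safe_serialization=True)
AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf").save_pretrained("merged-model")
```

After saving this way, the directory should be loadable with `vllm.LLM(model="merged-model")`.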