Using phi-3 and LLava but some fields of phi3 network not support #7

Closed
hellangleZ opened this issue Apr 30, 2024 · 21 comments

@hellangleZ

hellangleZ commented Apr 30, 2024

Hi team,

I followed every step in your guide: I used the same dataset and the newest version of the repo, and I updated LLaVA to the latest version.

I also copied all of the new Python files into the LLaVA folder.

However, it still reports this error:

image

image

It occurs not only in the pretraining process but also in the fine-tuning (FT) process, and after FT the model cannot be used.

My script:

#!/bin/bash

deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --model_name_or_path /data2/phi3-instruct/ \
    --version plain \
    --data_path ./playground/data/blip_laion_cc_sbu_558k.json \
    --image_folder ./playground/data/images \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --tune_mm_mlp_adapter True \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --bf16 True \
    --output_dir ./checkpoints/llava-v1.5-phi3-mini-pretrain_v2 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 24000 \
    --save_total_limit 1 \
    --learning_rate 1e-3 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to "tensorboard"

@hanoonaR
Member

Hi @hellangleZ ,

Regarding the warnings and errors you're encountering:

  1. Warnings about using a model of a different type (e.g., phi3 for llava_phi) can typically be ignored if the model is functioning as expected. ('You are using a model of type phi3 to instantiate a model of type llava_phi. This is not supported for all configurations of models and can yield errors.' and 'you should train this model on downstream task..')

  2. Concerning the error with some weights not being initialized from the checkpoint: it appears there might be a mix-up in the model class being initialized. Please ensure that you are initializing the LlavaPhiForCausalLM instead of LlavaLlamaForCausalLM with the checkpoint microsoft/Phi-3-mini-4k-instruct. The __init__.py should be set correctly to import LlavaPhiForCausalLM. You can do this by executing:

cp Phi-3-V/main__init__.py LLaVA/llava/__init__.py

Please double-check these details, and if the issue persists, we can investigate further.

@Luo-Z13

Luo-Z13 commented Apr 30, 2024

Hello @hanoonaR , there are tokenization mismatch warnings when finetuning LLaMA3-V:

WARNING: tokenization mismatch: 240 vs. 243. (ignored)
WARNING: tokenization mismatch: 452 vs. 455. (ignored)
WARNING: tokenization mismatch: 439 vs. 442. (ignored)

I am using transformers==4.41.0.dev0 and tokenizers==0.19.1. Is this issue related to the tokenizers version, as in haotian-liu/LLaVA#661?
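
One quick check on warnings like these is whether the gap between the two counts is constant. The sketch below assumes the warning text has exactly the format quoted above; the helper name `mismatch_gaps` is hypothetical and not part of LLaVA.

```python
import re
from collections import Counter

# Matches lines such as "WARNING: tokenization mismatch: 240 vs. 243. (ignored)"
PATTERN = re.compile(r"tokenization mismatch: (\d+) vs\. (\d+)\.")


def mismatch_gaps(log_text: str) -> Counter:
    """Count how often each (second - first) gap appears in the warnings."""
    gaps = Counter()
    for first, second in PATTERN.findall(log_text):
        gaps[int(second) - int(first)] += 1
    return gaps


# The three warnings quoted in this comment.
log = """\
WARNING: tokenization mismatch: 240 vs. 243. (ignored)
WARNING: tokenization mismatch: 452 vs. 455. (ignored)
WARNING: tokenization mismatch: 439 vs. 442. (ignored)
"""
print(mismatch_gaps(log))  # Counter({3: 3})
```

For the three warnings quoted here the gap is always 3. That would be consistent with a systematic cause, such as a template or special-token difference, rather than a problem with individual samples, though it does not prove it.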

@hellangleZ
Author

Hi @hanoonaR,

It is still the same:
image

And my __init__.py actually already supports LlavaPhi:

image

@hanoonaR
Member

hanoonaR commented Apr 30, 2024

Hi @Luo-Z13 ,

Based on your description, it sounds like there might be a configuration issue with the tokenizer or model.

Here are a few steps to resolve the issue:

  1. Ensure that you are using the correct conversation template, specifically conv_llama3, by setting --version=llama3, as mismatches in templates can lead to tokenization issues.
  2. Confirm that you are using the correct model, meta-llama/Meta-Llama-3-8B-Instruct, for this particular task. Using an incompatible model can also result in tokenization mismatches.
  3. Could you please verify whether pinning your libraries to tokenizers 0.19.1 and transformers 4.41.0.dev0 resolves the issue? Sometimes even minor version differences can cause unexpected behavior.

Please try these suggestions and let us know if the problem persists.
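
To confirm what is actually installed in the active environment, a small check like the following may help. It is only a sketch: the expected versions are the ones mentioned in this thread, and the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Versions named in this thread; adjust to whatever the project README pins.
expected = {"tokenizers": "0.19.1", "transformers": "4.41.0.dev0"}

for name, have in installed_versions(expected).items():
    status = "ok" if have == expected[name] else "differs"
    print(f"{name}: installed={have} expected={expected[name]} ({status})")
```

Running this inside the same environment that launches training avoids checking a different Python installation by mistake.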

@hanoonaR
Member

hanoonaR commented Apr 30, 2024

@hellangleZ

Can you please show the full error message, not just the parameter names? What does the message starting with "Some weights of ..." say?

@hanoonaR
Member

hanoonaR commented Apr 30, 2024

@hellangleZ ,

Can you try and see if this solves the issue?

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"
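
The point of this export is presumably to make Python pick up the local `LLaVA/llava` package rather than another copy, such as a pip-installed one. A minimal sketch for checking which file a module name resolves to, without importing it (the helper name `module_origin` is hypothetical):

```python
import importlib.util


def module_origin(name: str):
    """Return where a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For regular packages, origin is the path to __init__.py;
    # namespace packages have no origin, so fall back to their search locations.
    return spec.origin or list(spec.submodule_search_locations or [])


# Run from the LLaVA directory after the export; the path should point into ./llava.
print(module_origin("llava"))
```

If the printed path points into site-packages instead of the local checkout, the copied Phi-3 files would not be the ones being executed.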

@hellangleZ
Author

Hi @hanoonaR,
the screenshot shows the error message, and the log below is the full error:
image

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.09it/s]
Some weights of LlavaLlamaForCausalLM were not initialized from the model checkpoint at /data2/phi3-instruct/ and are newly initialized: ['model.layers.0.mlp.gate_proj.weight', 'model.layers.0.mlp.up_proj.weight', 'model.layers.0.self_attn.k_proj.weight', 'model.layers.0.self_attn.q_proj.weight', 'model.layers.0.self_attn.v_proj.weight', 'model.layers.1.mlp.gate_proj.weight', 'model.layers.1.mlp.up_proj.weight', 'model.layers.1.self_attn.k_proj.weight', 'model.layers.1.self_attn.q_proj.weight', 'model.layers.1.self_attn.v_proj.weight', 'model.layers.10.mlp.gate_proj.weight', 'model.layers.10.mlp.up_proj.weight', 'model.layers.10.self_attn.k_proj.weight', 'model.layers.10.self_attn.q_proj.weight', 'model.layers.10.self_attn.v_proj.weight', 'model.layers.11.mlp.gate_proj.weight', 'model.layers.11.mlp.up_proj.weight', 'model.layers.11.self_attn.k_proj.weight', 'model.layers.11.self_attn.q_proj.weight', 'model.layers.11.self_attn.v_proj.weight', 'model.layers.12.mlp.gate_proj.weight', 'model.layers.12.mlp.up_proj.weight', 'model.layers.12.self_attn.k_proj.weight', 'model.layers.12.self_attn.q_proj.weight', 'model.layers.12.self_attn.v_proj.weight', 'model.layers.13.mlp.gate_proj.weight', 'model.layers.13.mlp.up_proj.weight', 'model.layers.13.self_attn.k_proj.weight', 'model.layers.13.self_attn.q_proj.weight', 'model.layers.13.self_attn.v_proj.weight', 'model.layers.14.mlp.gate_proj.weight', 'model.layers.14.mlp.up_proj.weight', 'model.layers.14.self_attn.k_proj.weight', 'model.layers.14.self_attn.q_proj.weight', 'model.layers.14.self_attn.v_proj.weight', 'model.layers.15.mlp.gate_proj.weight', 'model.layers.15.mlp.up_proj.weight', 'model.layers.15.self_attn.k_proj.weight', 'model.layers.15.self_attn.q_proj.weight', 'model.layers.15.self_attn.v_proj.weight', 'model.layers.16.mlp.gate_proj.weight', 'model.layers.16.mlp.up_proj.weight', 'model.layers.16.self_attn.k_proj.weight', 'model.layers.16.self_attn.q_proj.weight', 'model.layers.16.self_attn.v_proj.weight', 
'model.layers.17.mlp.gate_proj.weight', 'model.layers.17.mlp.up_proj.weight', 'model.layers.17.self_attn.k_proj.weight', 'model.layers.17.self_attn.q_proj.weight', 'model.layers.17.self_attn.v_proj.weight', 'model.layers.18.mlp.gate_proj.weight', 'model.layers.18.mlp.up_proj.weight', 'model.layers.18.self_attn.k_proj.weight', 'model.layers.18.self_attn.q_proj.weight', 'model.layers.18.self_attn.v_proj.weight', 'model.layers.19.mlp.gate_proj.weight', 'model.layers.19.mlp.up_proj.weight', 'model.layers.19.self_attn.k_proj.weight', 'model.layers.19.self_attn.q_proj.weight', 'model.layers.19.self_attn.v_proj.weight', 'model.layers.2.mlp.gate_proj.weight', 'model.layers.2.mlp.up_proj.weight', 'model.layers.2.self_attn.k_proj.weight', 'model.layers.2.self_attn.q_proj.weight', 'model.layers.2.self_attn.v_proj.weight', 'model.layers.20.mlp.gate_proj.weight', 'model.layers.20.mlp.up_proj.weight', 'model.layers.20.self_attn.k_proj.weight', 'model.layers.20.self_attn.q_proj.weight', 'model.layers.20.self_attn.v_proj.weight', 'model.layers.21.mlp.gate_proj.weight', 'model.layers.21.mlp.up_proj.weight', 'model.layers.21.self_attn.k_proj.weight', 'model.layers.21.self_attn.q_proj.weight', 'model.layers.21.self_attn.v_proj.weight', 'model.layers.22.mlp.gate_proj.weight', 'model.layers.22.mlp.up_proj.weight', 'model.layers.22.self_attn.k_proj.weight', 'model.layers.22.self_attn.q_proj.weight', 'model.layers.22.self_attn.v_proj.weight', 'model.layers.23.mlp.gate_proj.weight', 'model.layers.23.mlp.up_proj.weight', 'model.layers.23.self_attn.k_proj.weight', 'model.layers.23.self_attn.q_proj.weight', 'model.layers.23.self_attn.v_proj.weight', 'model.layers.24.mlp.gate_proj.weight', 'model.layers.24.mlp.up_proj.weight', 'model.layers.24.self_attn.k_proj.weight', 'model.layers.24.self_attn.q_proj.weight', 'model.layers.24.self_attn.v_proj.weight', 'model.layers.25.mlp.gate_proj.weight', 'model.layers.25.mlp.up_proj.weight', 'model.layers.25.self_attn.k_proj.weight', 
'model.layers.25.self_attn.q_proj.weight', 'model.layers.25.self_attn.v_proj.weight', 'model.layers.26.mlp.gate_proj.weight', 'model.layers.26.mlp.up_proj.weight', 'model.layers.26.self_attn.k_proj.weight', 'model.layers.26.self_attn.q_proj.weight', 'model.layers.26.self_attn.v_proj.weight', 'model.layers.27.mlp.gate_proj.weight', 'model.layers.27.mlp.up_proj.weight', 'model.layers.27.self_attn.k_proj.weight', 'model.layers.27.self_attn.q_proj.weight', 'model.layers.27.self_attn.v_proj.weight', 'model.layers.28.mlp.gate_proj.weight', 'model.layers.28.mlp.up_proj.weight', 'model.layers.28.self_attn.k_proj.weight', 'model.layers.28.self_attn.q_proj.weight', 'model.layers.28.self_attn.v_proj.weight', 'model.layers.29.mlp.gate_proj.weight', 'model.layers.29.mlp.up_proj.weight', 'model.layers.29.self_attn.k_proj.weight', 'model.layers.29.self_attn.q_proj.weight', 'model.layers.29.self_attn.v_proj.weight', 'model.layers.3.mlp.gate_proj.weight', 'model.layers.3.mlp.up_proj.weight', 'model.layers.3.self_attn.k_proj.weight', 'model.layers.3.self_attn.q_proj.weight', 'model.layers.3.self_attn.v_proj.weight', 'model.layers.30.mlp.gate_proj.weight', 'model.layers.30.mlp.up_proj.weight', 'model.layers.30.self_attn.k_proj.weight', 'model.layers.30.self_attn.q_proj.weight', 'model.layers.30.self_attn.v_proj.weight', 'model.layers.31.mlp.gate_proj.weight', 'model.layers.31.mlp.up_proj.weight', 'model.layers.31.self_attn.k_proj.weight', 'model.layers.31.self_attn.q_proj.weight', 'model.layers.31.self_attn.v_proj.weight', 'model.layers.4.mlp.gate_proj.weight', 'model.layers.4.mlp.up_proj.weight', 'model.layers.4.self_attn.k_proj.weight', 'model.layers.4.self_attn.q_proj.weight', 'model.layers.4.self_attn.v_proj.weight', 'model.layers.5.mlp.gate_proj.weight', 'model.layers.5.mlp.up_proj.weight', 'model.layers.5.self_attn.k_proj.weight', 'model.layers.5.self_attn.q_proj.weight', 'model.layers.5.self_attn.v_proj.weight', 'model.layers.6.mlp.gate_proj.weight', 
'model.layers.6.mlp.up_proj.weight', 'model.layers.6.self_attn.k_proj.weight', 'model.layers.6.self_attn.q_proj.weight', 'model.layers.6.self_attn.v_proj.weight', 'model.layers.7.mlp.gate_proj.weight', 'model.layers.7.mlp.up_proj.weight', 'model.layers.7.self_attn.k_proj.weight', 'model.layers.7.self_attn.q_proj.weight', 'model.layers.7.self_attn.v_proj.weight', 'model.layers.8.mlp.gate_proj.weight', 'model.layers.8.mlp.up_proj.weight', 'model.layers.8.self_attn.k_proj.weight', 'model.layers.8.self_attn.q_proj.weight', 'model.layers.8.self_attn.v_proj.weight', 'model.layers.9.mlp.gate_proj.weight', 'model.layers.9.mlp.up_proj.weight', 'model.layers.9.self_attn.k_proj.weight', 'model.layers.9.self_attn.q_proj.weight', 'model.layers.9.self_attn.v_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/data22/llava/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
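
The parameter names in this log carry a hint. Assuming the Hugging Face Phi-3 implementation stores fused `qkv_proj` and `gate_up_proj` weights (the layout in recent transformers releases), a Llama-style class that expects separate `q_proj`/`k_proj`/`v_proj` and `gate_proj`/`up_proj` would find none of them in the checkpoint and initialize them from scratch. The sketch below groups the reported names so the pattern is easier to see. The helper `summarize_missing` is hypothetical, and the list is only a two-layer excerpt of the log above.

```python
import re
from collections import Counter


def summarize_missing(keys):
    """Group 'newly initialized' parameter names by sub-module, ignoring the layer index."""
    summary = Counter()
    for key in keys:
        # "model.layers.17.self_attn.q_proj.weight" -> "self_attn.q_proj.weight"
        summary[re.sub(r"^model\.layers\.\d+\.", "", key)] += 1
    return summary


# Excerpt of the names from the log (layers 0 and 1 only).
missing = [
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.mlp.up_proj.weight",
    "model.layers.0.self_attn.k_proj.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.v_proj.weight",
    "model.layers.1.mlp.gate_proj.weight",
    "model.layers.1.mlp.up_proj.weight",
    "model.layers.1.self_attn.k_proj.weight",
    "model.layers.1.self_attn.q_proj.weight",
    "model.layers.1.self_attn.v_proj.weight",
]

for name, count in sorted(summarize_missing(missing).items()):
    print(name, count)
```

Only the split projections appear in the log, never `o_proj` or `down_proj`, which fits the reading that `LlavaLlamaForCausalLM` is being instantiated on a Phi-3 checkpoint.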

@hellangleZ
Author

After running the environment variable command, nothing changed.

image

@hellangleZ
Author

I can share my environment:
python 3.10.14
torch 2.1.2
torchvision 0.16.2

ninja 1.11.1.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.535.108
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105

deepspeed 0.14.2

transformers 4.41.0.dev0
triton 2.1.0

tokenizers 0.19.1

I think the packages above are the ones relevant to the project.

@hanoonaR
Member

Hi @hellangleZ ,

Please run the following command:

cd LLaVA
export PYTHONPATH="./:$PYTHONPATH"

Then run bash LLaMA3-V_pretrain.sh or your training command.


If this does not solve the problem, please provide the detailed steps you followed to run the training (from cloning to running the script). Thank you.

@hellangleZ
Author

It seems that LLaVA is still using llava_llama and LlavaLlamaForCausalLM. I tried to find and change the source code so that it uses LlavaPhiForCausalLM instead, but it still does not work.

@hellangleZ
Author

Still the same. These are the steps I followed:

1- git clone https://github.com/mbzuai-oryx/LLaVA-pp.git
cd LLaVA-pp
git submodule update --init --recursive
2- pip install git+https://github.com/huggingface/transformers@a98c41798cf6ed99e1ff17e3792d6e06a2ff2ff3
3-

cp Phi-3-V/train.py LLaVA/llava/train/train.py
cp Phi-3-V/llava_phi3.py LLaVA/llava/model/language_model/llava_phi3.py
cp Phi-3-V/builder.py LLaVA/llava/model/builder.py
cp Phi-3-V/model__init__.py LLaVA/llava/model/__init__.py
cp Phi-3-V/main__init__.py LLaVA/llava/__init__.py
cp Phi-3-V/conversation.py LLaVA/llava/conversation.py

cp scripts/Phi3-V_pretrain.sh LLaVA/Vi-phi3_pretrain.sh
cp scripts/Phi3-V_finetune_lora.sh LLaVA/Vi-phi3_finetune_lora.sh
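
Since a copy that did not land is easy to miss, one way to double-check step 3 is to compare each source file with its destination. This is a minimal sketch, assuming it is run from the root of the LLaVA-pp checkout. The helper name `check_copies` is hypothetical, and the two `__init__.py` destinations are assumed to match the maintainers' `cp` command earlier in the thread.

```python
import filecmp
from pathlib import Path

# (source, destination) pairs from the cp commands above, relative to the LLaVA-pp checkout.
# Assumption: both __init__ files are copied to a file literally named __init__.py.
COPIES = [
    ("Phi-3-V/train.py", "LLaVA/llava/train/train.py"),
    ("Phi-3-V/llava_phi3.py", "LLaVA/llava/model/language_model/llava_phi3.py"),
    ("Phi-3-V/builder.py", "LLaVA/llava/model/builder.py"),
    ("Phi-3-V/model__init__.py", "LLaVA/llava/model/__init__.py"),
    ("Phi-3-V/main__init__.py", "LLaVA/llava/__init__.py"),
    ("Phi-3-V/conversation.py", "LLaVA/llava/conversation.py"),
]


def check_copies(root, pairs=COPIES):
    """Return the destinations that are missing or differ from their source."""
    root = Path(root)
    stale = []
    for src, dst in pairs:
        src_path, dst_path = root / src, root / dst
        same = (
            src_path.is_file()
            and dst_path.is_file()
            and filecmp.cmp(src_path, dst_path, shallow=False)
        )
        if not same:
            stale.append(dst)
    return stale


# An empty list means every destination matches its source byte for byte.
print(check_copies("."))
```

Any path that is printed either was never copied or has since been overwritten, for example by a later `git submodule update`.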

4-
image

All steps follow the step-by-step instructions provided by the project.

@Luo-Z13

Luo-Z13 commented Apr 30, 2024

Thank you for your suggestions. For the conversation template, I keep --version llama3 in the script. Do I need to change the conversations in the instruction-tuning dataset? I currently keep the LLaVA-Instruction format:

[ { "from": "human", 
"value": "<image>\n...k?" }, 
{ "from": "gpt", "value": "To ...." } ]

Do I need to change this format?

@hellangleZ
Author

@hanoonaR

Could someone help trace the new code, for example train.py?

I see that the train method still selects LlavaLlamaForCausalLM. Am I right?

image

@hanoonaR
Member

Hi @hellangleZ ,

You are right. We have just made a commit to fix the issue. Please check this. Thank you for bringing this to our attention. Apologies for the inconvenience and oversight.

@hanoonaR
Member

No, you do not need to change the format of the instructions; using --version llama3 is sufficient. Please let us know whether this, or pinning the transformers and tokenizers versions, solves the issue.
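
If it helps to rule out the data itself, a record in the format shown in the question can be sanity-checked before training. This is a sketch under the assumption that turns must alternate between "human" and "gpt", starting with "human". The helper name `check_turns` and the sample text are hypothetical.

```python
def check_turns(turns):
    """Return a list of problems found in one LLaVA-style conversation (empty if none)."""
    problems = []
    for i, turn in enumerate(turns):
        expected = "human" if i % 2 == 0 else "gpt"
        if turn.get("from") != expected:
            problems.append(
                f"turn {i}: expected from={expected!r}, got {turn.get('from')!r}"
            )
        if not isinstance(turn.get("value"), str):
            problems.append(f"turn {i}: 'value' should be a string")
    return problems


# Placeholder conversation in the same shape as the one quoted in the question.
turns = [
    {"from": "human", "value": "<image>\nWhat is shown in the picture?"},
    {"from": "gpt", "value": "A short description."},
]
print(check_turns(turns))  # []
```

An empty list only says the record has the expected shape; it does not by itself explain the tokenization mismatch warnings.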

@hellangleZ
Author

Hi, @hanoonaR

That change alone does not fix it. I just tested again; it may be that some Python file still uses the old method, but I can't find it. Please help check and fix it.

Thanks
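
One way to locate a leftover reference is to scan the package sources for the old class name. This is a minimal sketch, assuming it is run from the LLaVA directory so that `llava` is the package folder. The helper name `find_references` is hypothetical.

```python
from pathlib import Path


def find_references(root, needle="LlavaLlamaForCausalLM"):
    """Yield (path, line number, line) for every .py line under root that mentions needle."""
    root = Path(root)
    if not root.is_dir():
        return
    for path in sorted(root.rglob("*.py")):
        try:
            lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        except OSError:
            continue  # unreadable file, skip it
        for lineno, line in enumerate(lines, start=1):
            if needle in line:
                yield str(path), lineno, line.strip()


# Print hits as path:line:text, similar to grep -n output.
for hit in find_references("llava"):
    print(*hit, sep=":")
```

Hits inside `llava_llama.py` itself are expected. The ones worth a closer look are in files that the Phi-3 training path actually imports, such as `train.py` and `builder.py`.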

@Luo-Z13

Luo-Z13 commented Apr 30, 2024

OK, it seems that adjusting the versions of transformers and tokenizers doesn't solve this issue. There were also the following warnings when I installed them:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llava 1.2.2.post1 requires accelerate==0.21.0, but you have accelerate 0.27.2 which is incompatible.
llava 1.2.2.post1 requires tokenizers==0.15.1, but you have tokenizers 0.19.1 which is incompatible.
llava 1.2.2.post1 requires transformers==4.37.2, but you have transformers 4.41.0.dev0 which is incompatible.

@Luo-Z13

Luo-Z13 commented Apr 30, 2024

Would setting use_fast=True solve this issue? Like haotian-liu/LLaVA#661 (comment)

@mmaaz60
Member

mmaaz60 commented Apr 30, 2024

Hi Everyone,

Please refer to the comment at #8 (comment). This will be helpful. Thank You and Good Luck !

@hellangleZ
Author

Hi @mmaaz60 and @hanoonaR

Thanks for the support.

It's working well now.

@mmaaz60 mmaaz60 closed this as completed May 1, 2024