Using Phi-3 and LLaVA, but some fields of the Phi-3 network are not supported #7
Hi @hellangleZ, regarding the warnings and errors you're encountering:
Please double-check these details, and if the issue persists, we can investigate further.
Hello @hanoonaR, there are tokenization mismatch warnings when fine-tuning LLaMA3-V:
with
Hi @hanoonaR, and actually my init already supports LLaVA-Phi.
Hi @Luo-Z13, based on your description, it sounds like there might be a configuration issue with the tokenizer or model. Here are a few steps to resolve the issue:
Please try these suggestions and let us know if the problem persists.
Can you please show the error message, not just the parameter names? What does the error message say after "Some weights of ..."?
Can you try and see if this solves the issue?
Hi @hanoonaR, the message is: "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference."
After checking the environment-variable command, nothing happened.
I can share my environment: ninja 1.11.1.1, deepspeed 0.14.2, transformers 4.41.0.dev0, tokenizers 0.19.1. I think the packages above are the ones relevant to the project.
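For anyone comparing setups, here is a minimal, purely illustrative way to print the same package versions locally; it only assumes the packages were installed via pip and is not part of the project:

```python
# Illustrative check only: print the versions of the packages listed above
# so they can be compared against ninja 1.11.1.1, deepspeed 0.14.2,
# transformers 4.41.0.dev0, and tokenizers 0.19.1.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("ninja", "deepspeed", "transformers", "tokenizers"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```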
Hi @hellangleZ, please run the following command:
Then run
If this does not solve the problem, please provide the detailed steps you have followed to run the training (from cloning up to running the script). Thank you.
It seems that LLaVA still touches llava_llama and LlavaLlamaForCausalLM. After I tried to find and change the source code so that it uses PhiForCausalLM, it still does not work.
Still the same. My steps:
1. git clone https://github.com/mbzuai-oryx/LLaVA-pp.git
2. cp Phi-3-V/train.py LLaVA/llava/train/train.py
3. cp scripts/Phi3-V_pretrain.sh LLaVA/Vi-phi3_pretrain.sh
All steps follow the step-by-step guide provided by the project.
Thank you for your suggestions. For the correct conversation template, I keep:
Do I need to change this format?
Could someone help track down the new code, e.g., train.py? I saw that the train method still chooses LlavaLlamaForCausalLM. Am I right?
Hi @hellangleZ, you are right. We have just made a commit to fix the issue. Please check this. Thank you for bringing this to our attention. Apologies for the inconvenience and oversight.
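For readers following along, the gist of the fix being discussed is that the training entry point has to construct the Phi-3 variant of the LLaVA wrapper instead of LlavaLlamaForCausalLM. Below is a minimal sketch under that assumption; the Phi-3 class and module names are illustrative guesses based on this thread, not the repository's actual code:

```python
# Minimal sketch (NOT the repository's actual train.py): dispatch to a Phi-3
# LLaVA wrapper instead of LlavaLlamaForCausalLM when a Phi-3 checkpoint is
# given. The Phi-3 class/module names below are assumptions from this thread.
import torch

def load_llava_model(model_name_or_path: str, bf16: bool = True):
    if "phi" in model_name_or_path.lower():
        # Hypothetical Phi-3 wrapper; the real class name and path may differ.
        from llava.model.language_model.llava_phi3 import LlavaPhiForCausalLM as ModelCls
    else:
        from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM as ModelCls
    return ModelCls.from_pretrained(
        model_name_or_path,
        torch_dtype=torch.bfloat16 if bf16 else torch.float32,
    )
```

If the copied train.py still imports only LlavaLlamaForCausalLM, searching the file for that class name is a quick way to confirm which branch it actually takes.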
No, you do not need to change the format of the instructions. Using the
Hi @hanoonaR, changing only that does not fix it yet. I just tested again; maybe some Python file still uses the old method, but I can't find it. Please help to check and fix it. Thanks.
OK, it seems that adjusting the versions of transformers and tokenizers doesn't solve this issue, and there were warnings when I installed them, as follows:
Would setting
Hi everyone, please refer to the comment at #8 (comment). This will be helpful. Thank you and good luck!
Thanks for the support. It's working well now.
Hi team:
Every step was copied from your wizard:
Same dataset, the newest repo, and LLaVA updated to the newest version.
Also copied every new Python file into the LLaVA folder.
But it still reports this error alert.
It occurs not only in the pretrain process but also in the FT process, and after FT the model cannot be used.
My script:
```bash
#!/bin/bash
deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --model_name_or_path /data2/phi3-instruct/ \
    --version plain \
    --data_path ./playground/data/blip_laion_cc_sbu_558k.json \
    --image_folder ./playground/data/images \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --tune_mm_mlp_adapter True \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --bf16 True \
    --output_dir ./checkpoints/llava-v1.5-phi3-mini-pretrain_v2 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 24000 \
    --save_total_limit 1 \
    --learning_rate 1e-3 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to "tensorboard"
```