use_fast_dataset=True #2
Yes, you can follow this PR (https://github.com/OpenGVLab/InternVL/pull/506/files#diff-a6d78bf1713c7a9e7c1c701008ac8761ecf7d9d376f56658522ad6a2bda77016). For 6B + 20B training, we can reduce training time from 14.5 h to 9.5 h on 64 GPUs, using ViT 9 / LLM 4096 inputs. @fyting
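For readers skimming the thread, here is a conceptual sketch of the data-balancing idea behind OmniBal: distribute variable-length samples across ranks so per-GPU workloads stay roughly equal, instead of letting one rank receive all the long sequences. This is not the PR's actual code, and the function name is made up; the real implementation balances the ViT and LLM inputs jointly (see the diff for details).

```python
# Conceptual illustration only (not the PR's code): greedy
# longest-processing-time assignment of variable-length samples to ranks,
# so that total tokens per GPU stay roughly balanced.
import heapq

def balance_across_ranks(sample_lengths, num_ranks):
    """Assign each sample (longest first) to the currently least-loaded rank."""
    heap = [(0, rank) for rank in range(num_ranks)]  # (total_tokens, rank)
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_ranks)]
    for idx in sorted(range(len(sample_lengths)),
                      key=lambda i: sample_lengths[i], reverse=True):
        load, rank = heapq.heappop(heap)
        assignment[rank].append(idx)
        heapq.heappush(heap, (load + sample_lengths[idx], rank))
    return assignment

# Example: 8 samples of varying token length spread over 4 GPUs.
print(balance_across_ranks([4096, 512, 2048, 1024, 3072, 256, 1536, 768], 4))
```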
Thank you for your guidance. Could you please also provide the shell script used for training?
Maybe you can try inserting some breakpoints (pdb) to debug your problem, @fyting.
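For example, a minimal way to do that in a distributed run (the rank guard is an assumption about a torchrun-style launch, which exports a `RANK` variable per worker; adjust to wherever your run stalls):

```python
import os
import pdb

# Pause only on rank 0 so a multi-process (torchrun) launch does not hang
# with every worker waiting on its own interactive prompt.
if int(os.environ.get("RANK", "0")) == 0:
    pdb.set_trace()  # inspect variables with p/pp, step with n/s, continue with c
```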
Is the way to use OmniBal in the InternVL codebase simply to add the `use_fast_dataset=True` configuration in the bash script? For example, if I add `use_fast_dataset=True` in this file, will it accelerate training? https://github.com/ModelTC/InternVL/blob/OmniBal_V2.0/internvl_chat/shell/internvl1.5/hermes2_yi34b/internvl_chat_v1_5_hermes2_yi34b_dynamic_res_finetune.sh
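If it works the way the PR diff suggests, the flag would be passed like any other training argument in that script. A minimal sketch, assuming the OmniBal branch exposes `use_fast_dataset` through the training argument parser (`EXISTING_SCRIPT_ARGS` is a placeholder for the arguments already in the file; verify the exact flag name and plumbing against the PR diff linked above):

```sh
# Hypothetical: append the flag to the torchrun command already present in
# internvl_chat_v1_5_hermes2_yi34b_dynamic_res_finetune.sh.
# EXISTING_SCRIPT_ARGS stands in for the script's current argument list.
torchrun --nproc_per_node=${GPUS} \
  internvl/train/internvl_chat_finetune.py \
  "${EXISTING_SCRIPT_ARGS[@]}" \
  --use_fast_dataset True
```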