longvu_ckpt_fixes

Fixes that may help with saving and resuming training with LongVU's codebase under DeepSpeed

To use DeepSpeed instead of FSDP (see this discussion on LongVU's repo), remove all FSDP-related arguments passed to train.py and pass --deepspeed $path_to_deepspeed_config instead. A suggested DeepSpeed config (zero2.json) is included in this repo. Assuming that LongVU "inherits" some of Video-LLaVA's issues with resuming training (see their GitHub issues for related discussions), I have also modified some .py files, which can be found here as well; diffing them against the originals shows what was changed.
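As a rough illustration of the argument swap described above, here is a minimal Python sketch. It is not taken from this repo: `ZERO2_CONFIG` is a hypothetical minimal ZeRO stage 2 config (not a copy of the zero2.json shipped here), `swap_fsdp_for_deepspeed` and `write_config` are hypothetical helper names, and the FSDP flag names are assumed to be the standard Hugging Face `TrainingArguments` ones, so check them against your own launch script.

```python
import json
from pathlib import Path

# Hypothetical minimal ZeRO stage 2 config; NOT a copy of this repo's zero2.json.
# "auto" values are resolved by the Hugging Face Trainer from its own arguments.
ZERO2_CONFIG = {
    "bf16": {"enabled": "auto"},
    "fp16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}

# Assumption: these are the FSDP-related flags in your launch command.
# Each of them takes exactly one value.
FSDP_FLAGS = {"--fsdp", "--fsdp_config", "--fsdp_transformer_layer_cls_to_wrap"}


def write_config(path):
    """Write the sketch config to `path` as JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(ZERO2_CONFIG, indent=2))
    return path


def swap_fsdp_for_deepspeed(args, config_path):
    """Drop FSDP flags (and their values), then append --deepspeed <config_path>."""
    out = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This token is the value of an FSDP flag seen just before.
            skip_next = False
            continue
        if arg in FSDP_FLAGS:
            # "--flag value" form: drop the flag and its following value.
            skip_next = True
            continue
        if arg.split("=", 1)[0] in FSDP_FLAGS:
            # "--flag=value" form: drop the single token.
            continue
        out.append(arg)
    return out + ["--deepspeed", str(config_path)]
```

For example, `swap_fsdp_for_deepspeed(["--fsdp", "full_shard auto_wrap", "--bf16", "True"], "zero2.json")` keeps `--bf16 True` and appends `--deepspeed zero2.json`. In practice you would likely just edit the launch script by hand; the helper only makes the rule explicit.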

Note: these fixes worked for me, but I take no responsibility whatsoever for whether they will work in your environment as well.

Files in this repo: README.md, mm_datautils.py, mm_trainer.py, train.py, zero2.json

geomlyd/longvu_ckpt_fixes
