Ddp #32

Merged: 10 commits merged into main from ddp on Jan 25, 2024
Conversation

LauraGPT (Collaborator)

What does this PR do?

Fixes # (issue)

Feature/Issue validation/testing

Please describe the tests you ran to verify your changes and summarize the relevant results. Provide instructions so the tests can be reproduced, and list any relevant details of your test configuration.

  • Test A
    Logs for Test A

  • Test B
    Logs for Test B

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!


train_config:
model_name: "PATH/to/LLAMA/7B"
enable_fsdp: false
Collaborator:

enable_ddp: false
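
For context, here is a minimal sketch of how such a pair of flags might be consumed. The field names and call site are assumptions rather than the PR's actual code, and it presumes the process group is already initialized (e.g. via torchrun):

import torch
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model, train_config, local_rank):
    if train_config.enable_fsdp:
        # FSDP shards parameters, gradients, and optimizer state across ranks.
        return FSDP(model, device_id=local_rank)
    if train_config.enable_ddp:
        # DDP replicates the full model on each rank and all-reduces gradients.
        return DDP(model.to(local_rank), device_ids=[local_rank])
    return model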

++train_config.lr=1e-4 \
++train_config.output_dir=$output_dir \
++train_config.peft_config.peft_method=lora \
++metric=acc \
Collaborator:

The metric is to be classified to a certain class, i.e., the metric string should resolve to a defined metric implementation.
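
One reading of this note is that the metric name passed on the command line should resolve to a defined implementation rather than being matched ad hoc. A minimal registry sketch under that assumption (the accuracy function and the "acc" key are illustrative, not the repo's code):

import torch

def accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # Fraction of positions where the argmax prediction matches the label.
    return (logits.argmax(dim=-1) == labels).float().mean().item()

METRICS = {"acc": accuracy}

def get_metric(name: str):
    try:
        return METRICS[name]
    except KeyError:
        raise ValueError(f"unknown metric {name!r}; expected one of {sorted(METRICS)}")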

++train_config.enable_fsdp=true \
++train_config.enable_ddp=false \
++train_config.use_fp16=true \
++metric=acc \
Collaborator:

Same note as above: the metric string should resolve to a defined metric class.

use_fp16: false
# sharding_strategy: "FULL_SHARD" #ShardingStrategy = ShardingStrategy.FULL_SHARD
sharding_strategy: "NO_SHARD" #ShardingStrategy.NO_SHARD #MZY: set NO_SHARD to use DDP mode in FSDP
checkpoint_type: "StateDictType.SHARDED_STATE_DICT" # SHARDED_STATE_DICT saves one file per rank and allows resizing the world size.
Collaborator:

To unify, keep only the string name, e.g. "SHARDED_STATE_DICT" rather than "StateDictType.SHARDED_STATE_DICT".
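
A minimal sketch of resolving such bare names to the PyTorch enums; accepting the legacy "StateDictType.X" spelling as well is an assumption. Note that ShardingStrategy.NO_SHARD keeps the full parameters on every rank, which is why it yields DDP-like behavior inside FSDP, as the inline comment above says:

from torch.distributed.fsdp import ShardingStrategy, StateDictType

def resolve(enum_cls, name: str):
    # Accept either "NO_SHARD" or "ShardingStrategy.NO_SHARD" style values.
    return enum_cls[name.split(".")[-1]]

sharding_strategy = resolve(ShardingStrategy, "NO_SHARD")  # DDP-like mode in FSDP
checkpoint_type = resolve(StateDictType, "StateDictType.SHARDED_STATE_DICT")
assert checkpoint_type is StateDictType.SHARDED_STATE_DICT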

@@ -229,12 +233,12 @@ def train(model, train_dataloader,eval_dataloader, tokenizer, optimizer, lr_sche
 )
 else:
-if not train_config.use_peft and fsdp_config.checkpoint_type == StateDictType.FULL_STATE_DICT:
+if not train_config.use_peft and fsdp_config.checkpoint_type == "StateDictType.FULL_STATE_DICT":
Collaborator:

To modify: the check now string-matches "StateDictType.FULL_STATE_DICT" rather than comparing the enum; it should follow the bare-name convention noted above.


 save_model_checkpoint(
     model, optimizer, rank, train_config, epoch=epoch
 )
-elif not train_config.use_peft and fsdp_config.checkpoint_type == StateDictType.SHARDED_STATE_DICT:
+elif not train_config.use_peft and fsdp_config.checkpoint_type == "StateDictType.SHARDED_STATE_DICT":
Collaborator:

To modify: same string-vs-enum comparison issue as above.
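
A hedged sketch of what the two checks above might look like once the config value is resolved to the enum at parse time (the resolve() helper is the hypothetical one sketched earlier; the other names come from the diff itself):

from torch.distributed.fsdp import StateDictType

# Assumes fsdp_config.checkpoint_type was converted to a StateDictType member
# when the config was loaded, so the comparisons use enums, not string reprs.
if not train_config.use_peft and fsdp_config.checkpoint_type == StateDictType.FULL_STATE_DICT:
    save_model_checkpoint(model, optimizer, rank, train_config, epoch=epoch)
elif not train_config.use_peft and fsdp_config.checkpoint_type == StateDictType.SHARDED_STATE_DICT:
    ...  # sharded save path (elided in the hunks shown above)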

ddlBoJack merged commit 6d30313 into main on Jan 25, 2024
2 of 4 checks passed
LauraGPT deleted the ddp branch on February 4, 2024 at 03:23