
Add FSDP config for CPU RAM efficient loading through accelerate #30002

Merged

Conversation

@helloworld1 (Contributor) commented Apr 2, 2024

What does this PR do?

Currently, the environment variable FSDP_CPU_RAM_EFFICIENT_LOADING is read by accelerate but is only set inside the transformers codebase. This change adds a cpu_ram_efficient_loading FSDP option that sets FSDP_CPU_RAM_EFFICIENT_LOADING, so jobs launched via torchrun (or by other means) can take advantage of it through their configs.
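For illustration, a minimal sketch of how the new option could be enabled from a script launched with torchrun; the output directory and sharding strategy below are placeholders, and pairing it with sync_module_states reflects the validation added later in this PR:

```python
from transformers import TrainingArguments

# Sketch: setting the new option via fsdp_config so that accelerate's
# FSDP_CPU_RAM_EFFICIENT_LOADING is honored even for jobs launched with
# torchrun rather than `accelerate launch`.
args = TrainingArguments(
    output_dir="out",             # placeholder
    fsdp="full_shard auto_wrap",  # example sharding strategy
    fsdp_config={
        "cpu_ram_efficient_loading": True,  # option added by this PR
        "sync_module_states": True,         # must accompany it (see validation below)
    },
)
```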

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@pacman100
@muellerzr
@ArthurZucker

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@helloworld1 force-pushed the helloworld1/cpu_ram_efficient_loading branch from 68033fa to 8719a0f on April 4, 2024 20:52
@helloworld1 force-pushed the helloworld1/cpu_ram_efficient_loading branch 2 times, most recently from cf40b24 to 5443e5b on April 15, 2024 16:38
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker requested a review from @muellerzr on April 18, 2024 09:17
@muellerzr (Contributor) left a comment


Thanks! Overall this looks much better to me, just a slight documentation nit. cc @pacman100 so you're aware

src/transformers/training_args.py (review comment, resolved)
@helloworld1 (Contributor, Author) commented

Thanks for the suggestion. @muellerzr, would you re-approve?

@muellerzr requested a review from @amyeroberts on April 18, 2024 16:15
@muellerzr (Contributor) commented

No need for a re-approval in those cases :) In transformers, the green checkmark is the be-all and end-all of approval unless something radically changed afterwards. cc @amyeroberts for final review

@amyeroberts (Collaborator) left a comment


Thanks for adding!

Just a small nit and a request for input validation.

src/transformers/training_args.py (two resolved review comments)
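For context, the requested validation (per the squashed commit message below) ties cpu_ram_efficient_loading to sync_module_states. A rough sketch of what such a check in training_args.py could look like; the exact condition and error message are assumptions:

```python
# Rough sketch of the requested validation (wording and placement assumed):
# with cpu_ram_efficient_loading, only rank 0 loads real weights, so the
# other ranks must receive them via sync_module_states.
if self.fsdp_config.get("cpu_ram_efficient_loading", False) and not self.fsdp_config.get(
    "sync_module_states", False
):
    raise ValueError(
        "When `cpu_ram_efficient_loading` is enabled, `sync_module_states` must also be `True`."
    )
```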
@pacman100 (Contributor) left a comment


Thank you @helloworld1 for adding this!

@helloworld1 force-pushed the helloworld1/cpu_ram_efficient_loading branch from e7a1c8f to e81cd98 on April 19, 2024 04:52
@amyeroberts (Collaborator) left a comment


Thanks for adding and iterating - looks great!

src/transformers/training_args.py (review comment, resolved)
@amyeroberts (Collaborator) commented

Running make fixup should resolve the code quality checks

@helloworld1 force-pushed the helloworld1/cpu_ram_efficient_loading branch from 73d8e4a to a2978fb on April 19, 2024 18:50
@amyeroberts merged commit f16caf4 into huggingface:main on Apr 22, 2024
21 checks passed
zafstojano pushed a commit to zafstojano/transformers that referenced this pull request Apr 22, 2024
Add FSDP config for CPU RAM efficient loading through accelerate (huggingface#30002)

* Add FSDP config for CPU RAM efficient loading

* Style fix

* Update src/transformers/training_args.py

Co-authored-by: Zach Mueller <[email protected]>

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <[email protected]>

* Add sync_module_states and cpu_ram_efficient_loading validation logic

* Update src/transformers/training_args.py

Co-authored-by: amyeroberts <[email protected]>

* Style

---------

Co-authored-by: Zach Mueller <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
itazap pushed a commit that referenced this pull request May 14, 2024
Add FSDP config for CPU RAM efficient loading through accelerate (#30002)

(Same squashed commit message as above.)