FSDP + LoRA on multiple GPUs (4× A100 80GB): ValueError: Cannot flatten integer dtype tensors #2250
Comments
Can you post the stack trace so we can see what is throwing the error? I had a similar problem earlier this week, and depending on where it came from, I either had to set gradient_accumulation_steps to 1 or turn off the Liger cross-entropy kernel.
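For reference, those two mitigations would look roughly like the following in an axolotl config. This is a hedged sketch: gradient_accumulation_steps is a standard axolotl option, while the Liger-related keys are assumptions based on axolotl's Liger plugin integration and may differ by version.

```yaml
# Sketch of the two mitigations suggested above (not a complete config).

# Option 1: avoid gradient accumulation
gradient_accumulation_steps: 1

# Option 2: keep the Liger plugin but turn off its cross-entropy kernels
# (key names assumed from axolotl's Liger integration; verify against your version)
plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_cross_entropy: false
liger_fused_linear_cross_entropy: false
```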
super().__init__(*args, **kwargs)
I tried your solution, but I still get the same issue.
Please check that this issue hasn't been reported before.
Expected Behavior
The LoRA configuration should work with FSDP.
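For context, here is a minimal sketch of the kind of LoRA + FSDP combination being referred to. These are illustrative placeholder values using axolotl's documented config keys, not the reporter's actual config (which is not captured in this issue).

```yaml
# Illustrative LoRA + FSDP axolotl settings (placeholder values, not the reporter's config)
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: false
  fsdp_state_dict_type: FULL_STATE_DICT
```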
Current behaviour
[rank0]: raise ValueError("Cannot flatten integer dtype tensors")
[rank0]: ValueError: Cannot flatten integer dtype tensors
[rank1]: Traceback (most recent call last):
Steps to reproduce
Run Axolotl with LoRA + FSDP on 4 NVIDIA A100 80GB GPUs.
- torch version: 2.5.1
- axolotl version: 0.6.0
Config yaml
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.10.13
axolotl branch-commit
main/latest
Acknowledgements