I have a model whose architecture is similar to LLaMA. It is derived from LLaMA, with changes to how Q, K, and V are handled in the multi-head attention mechanism. I want to train it with FSDP (comparable to DeepSpeed ZeRO-3) and torch.compile, and I found that torchtune supports both, so I would like to add my model to the torchtune framework. Since torchtune already supports Qwen, Llama, and Phi, how should I adapt my modeling.py to work with torchtune? Is there an API or interface for adding a custom model?