Train llama 3.1 with GRIT #60
Comments
change the modeling file for llama3.1 so that it could accept is_causal arg and thus influence attention behavior?
I thought I’m still learning, so please kindly correct me if I’m mistaken. Thank you so much!
Yes, you're right; I meant to say that that is the option you have to go with. Sorry, I should have removed the ?
Thank you so much! Nah you don't need to remove "the", it's just that I don't have much confidence in that :) I've checked the code for
Really appreciate your prompt guidance, even on weekends! You are indeed one of the most helpful and fastest-responding authors I have contacted.
I'm now trying to train Llama 3.1 with the GRIT pipeline.
At first I directly changed --model_name_or_path and ran the training code (the training script I used is as follows). But there is an error: TypeError: LlamaModel.forward() got an unexpected keyword argument 'is_causal'. I looked into it and found several issues regarding this: #34, #32 and #19.
Just to confirm, if I want to train a Llama 3.1 model with GRIT, can I just copy modeling_gritlm7b.py into the llama3.1 model folder, or do I need to change the modeling file for llama3.1 so that it could accept the is_causal arg and thus influence attention behavior?
Thank you so much!
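For anyone landing here with the same error, below is a toy sketch of what is going on. It is not the GritLM or transformers code: `build_attention_mask`, `stock_forward` and `patched_forward` are hypothetical names made up for illustration. It only shows that a forward function without an `is_causal` parameter raises this kind of TypeError when the keyword is passed, and how accepting the flag would switch between a causal and a bidirectional mask.

```python
def build_attention_mask(seq_len, is_causal=True):
    """Return a seq_len x seq_len mask; True means query i may attend to key j."""
    return [
        [(j <= i) or not is_causal for j in range(seq_len)]
        for i in range(seq_len)
    ]


def stock_forward(input_ids):
    # Hypothetical stand-in for a forward() with no is_causal parameter:
    # attention is always causal.
    return build_attention_mask(len(input_ids), is_causal=True)


def patched_forward(input_ids, is_causal=True):
    # Hypothetical stand-in for a forward() changed to accept is_causal.
    return build_attention_mask(len(input_ids), is_causal=is_causal)


# Passing the keyword to the unmodified function reproduces the kind of error.
try:
    stock_forward([1, 2, 3], is_causal=False)
    stock_error = None
except TypeError as exc:
    stock_error = str(exc)

causal_mask = patched_forward([1, 2, 3], is_causal=True)
bidirectional_mask = patched_forward([1, 2, 3], is_causal=False)
```

With `is_causal=True` each position sees only itself and earlier positions; with `is_causal=False` every position sees every other one. How the real modeling file has to be changed depends on the transformers version in use, so treat this only as an illustration of the idea.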