Language Modelling Example #30446
Hi @ajherman, can you share your running environment? I'm able to run the example from the README without issue on main:
For the last two points: yes, I'm using a GPU. I should also mention that I want to run this script with SLURM, but I get the same error whether I run it through an SBATCH script or run the command directly in the terminal.
@ajherman Are you able to run the example I shared, i.e. the one with a common dataset?
Ah, yes, the example script you provided works! Or at the very least, it begins the training loop (it hasn't gotten past 0%, but at least I'm not getting the key error anymore). What does this mean...?
Are you by any chance passing openai-community/gpt2 as --model_type? There are a few places where this is wrong in our docs. I've opened #30480 to fix these.
Thank you! Yes, this appears to have solved the problem. As a note, the confusion came from the end of the examples/pytorch/language-modeling/README.md file. In the example command, openai-community/gpt2 is passed as the --model_type argument. I changed it to gpt2, and it now seems to work. The same command passes the same string to --tokenizer_name, but I did not have to modify that one. Could you give me some insight into how the naming is intended to work? What is the implied meaning of including or not including something like "openai-community" before the model name? Thanks for your help!
GPT2 is a bit special. I'll explain the general case first, and then why GPT2 is different. A checkpoint is an identifier for a set of weights on the Hub, and nowadays takes the form organization/model-name, e.g. openai-community/gpt2. Another thing to note is the difference between a checkpoint and a model type: in the script, --model_type refers to the architecture, which has a short name like gpt2, not a checkpoint id. Many years ago, when some models were added to Hugging Face, we didn't have this standard, and just the model name was used as the checkpoint, e.g. gpt2. Now, to avoid breaking existing scripts and code, we maintain backwards compatibility, and the old checkpoints (the ones without an organization prefix) still work. So you can't pass openai-community/gpt2 as --model_type, but either gpt2 or openai-community/gpt2 works wherever a checkpoint is expected, such as --model_name_or_path.
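The naming convention described above can be sketched in a few lines of plain Python (illustrative only; function and variable names here are made up for the example, not transformers API):

```python
def split_checkpoint_id(checkpoint):
    """Split a Hub checkpoint id into (organization, model name).

    Modern checkpoint ids are "organization/model-name"; some early
    checkpoints predate that convention and keep working under bare names.
    """
    org, sep, model = checkpoint.rpartition("/")
    return (org if sep else None, model)

# A modern, namespaced checkpoint:
print(split_checkpoint_id("openai-community/gpt2"))  # ('openai-community', 'gpt2')
# A legacy, un-namespaced checkpoint kept for backwards compatibility:
print(split_checkpoint_id("gpt2"))                   # (None, 'gpt2')
```

Either form resolves to the same weights; only the model type (`gpt2`) is a fixed short architecture name.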
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
I'm running the first example in examples/pytorch/language-modeling/README.md. I have installed the requirements.txt. I consistently get the following error (sorry for the screen grab, I have issues with copy/paste in this environment):
I don't understand why I am getting this key error from openai-community/gpt2. Incidentally, when I try to run a script in the same environment that simply has the line model = from_pretrained('openai-community/gpt2'), I do not get any errors. So it seems like it must be an issue in the example code, that is, in run_clm.py.
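For context, the two code paths behave differently: from_pretrained accepts any checkpoint id, while run_clm.py also consults a fixed registry of short architecture names when --model_type is supplied. A minimal sketch of that branch, with a plain dict standing in for transformers' CONFIG_MAPPING (names here are simplified illustrations, not the script's exact code):

```python
# Plain-dict stand-in for transformers' CONFIG_MAPPING; its keys are short
# architecture names such as "gpt2", never "organization/model" checkpoint ids.
MODEL_TYPE_REGISTRY = {"gpt2": "GPT2Config", "bert": "BertConfig"}

def select_config(model_name_or_path=None, model_type=None):
    """Simplified sketch of how run_clm.py picks a model config."""
    if model_name_or_path is not None:
        # Checkpoint path: works with both "gpt2" and "openai-community/gpt2".
        return f"config loaded from checkpoint {model_name_or_path!r}"
    # From-scratch path: a full checkpoint id is not a registry key,
    # so "openai-community/gpt2" raises a KeyError here.
    return MODEL_TYPE_REGISTRY[model_type]

print(select_config(model_name_or_path="openai-community/gpt2"))
print(select_config(model_type="gpt2"))  # GPT2Config
try:
    select_config(model_type="openai-community/gpt2")
except KeyError as err:
    print("KeyError:", err)
```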
Who can help?
ajherman
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
```shell
python run_clm.py \
    --model_name_or_path openai-community/gpt2 \
    --train_file path_to_train_file \
    --validation_file path_to_validation_file \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --do_train \
    --do_eval \
    --output_dir /tmp/test-clm
```
Expected behavior
Should run the example script...