[dpo_trainer] Fixed a compatibility bug with deepspeed when initializing reference_model #1123
Conversation
Nice catch @Emperorizzis, checking!
@Emperorizzis can you also share your deepspeed config to make sure I can reproduce it? Thanks!
Of course ~ Below is my deepspeed config:
{
"bfloat16": {
"enabled": true
},
"fp16": {
"enabled": false,
"loss_scale": 0,
"loss_scale_window": 1000,
"initial_scale_power": 16,
"hysteresis": 2,
"min_loss_scale": 1
},
"optimizer": {
"type": "AdamW",
"params": {
"lr": "auto",
"weight_decay": "auto",
"betas": "auto",
"eps": "auto",
"torch_adam": true,
"adam_w_mode": true
}
},
"scheduler": {
"type": "WarmupDecayLR",
"params": {
"warmup_min_lr": 1e-7,
"warmup_max_lr": 1e-6,
"warmup_num_steps": "auto",
"total_num_steps": "auto"
}
},
"zero_optimization": {
"stage": 3,
"overlap_comm": true,
"contiguous_gradients": true,
"sub_group_size": 1e12,
"reduce_bucket_size": "auto",
"stage3_prefetch_bucket_size": "auto",
"stage3_param_persistence_threshold": "auto",
"stage3_max_live_parameters": 1e9,
"stage3_max_reuse_distance": 1e9,
"stage3_gather_16bit_weights_on_model_save": true
},
"gradient_accumulation_steps": "auto",
"gradient_clipping": "auto",
"steps_per_print": 1e5,
"train_batch_size": "auto",
"train_micro_batch_size_per_gpu": "auto",
"wall_clock_breakdown": false
}
Also, to save memory, I loaded the model and reference_model after initializing the TrainingArguments (which automatically selected the DeepSpeed backend).
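A minimal sketch of the loading order described above, assuming a causal LM; the model name, output path, and batch-size values are placeholders rather than values from this thread. Creating TrainingArguments with the ZeRO-3 deepspeed config first is what lets from_pretrained shard the weights at load time:

```python
# Sketch only: load order that activates DeepSpeed ZeRO-3 weight partitioning
# during from_pretrained. Model name and paths below are placeholders.
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# 1) Build TrainingArguments first so the DeepSpeed (ZeRO-3) config is registered.
training_args = TrainingArguments(
    output_dir="outputs",
    deepspeed="ds_config.json",  # the config shown above
    bf16=True,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
)

# 2) Only now load the policy and reference models; with ZeRO-3 active,
#    their parameters are sharded across ranks at load time instead of
#    being fully materialized on every GPU.
model = AutoModelForCausalLM.from_pretrained("base-model", torch_dtype=torch.bfloat16)
reference_model = AutoModelForCausalLM.from_pretrained("base-model", torch_dtype=torch.bfloat16)
```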
Thanks @Emperorizzis. One issue is that DeepSpeed ZeRO stage 3 together with pre-computing the reference log-probs does not work, since the private HF Trainer methods initialize the dataloader before the model, and we currently raise an error: https://github.com/huggingface/trl/blob/main/trl/trainer/dpo_trainer.py#L358-L36
Thanks for replying ~ So the reference_model does need to be initialized from a different deepspeed config (at least the superfluous training config should be removed).
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Does this issue remain unsolved in the latest version, 0.7.11? I ran into the same error using the example script dpo.py.
When training a model using DeepSpeed + DPO (with Warmup), the following error occurs:
(This issue has also been mentioned in #955).
File "xxx/deepspeed/runtime/lr_schedules.py", line 661, in __init__ self.warmup_num_steps = max(2, warmup_num_steps) TypeError: '>' not supported between instances of 'str' and 'int'
The dpo_trainer initializes both the model and the reference_model using the same deepspeed config. Training hyperparameters are set through TrainingArguments & Trainer, but the reference_model only requires deepspeed for forward computation. Scheduler-related parameters not only cause issues but are also redundant; they should be removed.
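As an illustration of that suggestion (a sketch only, not the actual patch in this PR), one way to derive a forward-only config for the reference model is to copy the training DeepSpeed config and drop the optimizer and scheduler sections; the helper name below is hypothetical:

```python
import copy

def build_ref_model_config(train_ds_config: dict) -> dict:
    """Copy the training DeepSpeed config and drop training-only sections,
    since the reference model is only ever used for forward passes."""
    config = copy.deepcopy(train_ds_config)
    for key in ("optimizer", "scheduler"):
        # Dropping the scheduler also drops the unresolved "auto" values
        # that trigger the TypeError shown above.
        config.pop(key, None)
    return config

# Hypothetical usage with deepspeed.initialize, keeping ZeRO-3 sharding
# for the reference model but no optimizer or LR scheduler:
# ref_engine, *_ = deepspeed.initialize(model=reference_model,
#                                       config=build_ref_model_config(ds_config))
# ref_engine.eval()
```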