[dpo_trainer] Fixed a compatibility bug with deepspeed when initializing reference_model #1123

Closed

Conversation

Emperorizzis

When training a model using DeepSpeed + DPO (with Warmup), the following error occurs:
(This issue has also been mentioned in #955).

File "xxx/deepspeed/runtime/lr_schedules.py", line 661, in __init__ self.warmup_num_steps = max(2, warmup_num_steps) TypeError: '>' not supported between instances of 'str' and 'int'

The dpo_trainer initializes both the model and the reference_model from the same deepspeed config. Training hyperparameters are set through TrainingArguments & Trainer, and the reference_model only needs deepspeed for forward computation. The scheduler-related parameters are therefore not just the source of this error but also redundant, and they should be removed.
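A minimal sketch of that idea (a hypothetical helper of mine, not the actual patch): copy the training DeepSpeed config and drop the sections that only matter for a model being optimized, before wrapping the frozen reference_model.

import copy


def prepare_ref_model_config(ds_config: dict) -> dict:
    """Return a copy of the DeepSpeed config suitable for a forward-only model."""
    config = copy.deepcopy(ds_config)
    # The reference model is never stepped: its optimizer section is
    # unnecessary, and the scheduler section (whose "auto" placeholders are
    # still unresolved strings here) is the direct cause of the TypeError.
    for key in ("optimizer", "scheduler"):
        config.pop(key, None)
    return config


# Usage sketch, wrapping the frozen reference model for forward passes only:
# engine, *_ = deepspeed.initialize(model=ref_model, config=prepare_ref_model_config(ds_config))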

@kashif (Collaborator) commented Dec 21, 2023

Nice catch @Emperorizzis, checking!

kashif added the 🏋 DPO (Related to DPO) label Dec 21, 2023
@kashif (Collaborator) commented Dec 21, 2023

@Emperorizzis can you also share your deepspeed config to make sure I can reproduce it? thanks!


@Emperorizzis (Author)

> @Emperorizzis can you also share your deepspeed config to make sure I can reproduce it? thanks!

Of course!

Below is my deepspeed config:

{
  "bfloat16": {
      "enabled": true
  },
  "fp16": {
      "enabled": false,
      "loss_scale": 0,
      "loss_scale_window": 1000,
      "initial_scale_power": 16,
      "hysteresis": 2,
      "min_loss_scale": 1
  },
  "optimizer": {
      "type": "AdamW",
      "params": {
          "lr": "auto",
          "weight_decay": "auto",
          "betas": "auto",
          "eps": "auto",
          "torch_adam": true,
          "adam_w_mode": true
      }
  },
  "scheduler": {
      "type": "WarmupDecayLR",
      "params": {
          "warmup_min_lr": 1e-7,
          "warmup_max_lr": 1e-6,
          "warmup_num_steps": "auto",
          "total_num_steps": "auto"
      }
  },
  "zero_optimization": {
      "stage": 3,
      "overlap_comm": true,
      "contiguous_gradients": true,
      "sub_group_size": 1e12,
      "reduce_bucket_size": "auto",
      "stage3_prefetch_bucket_size": "auto",
      "stage3_param_persistence_threshold": "auto",
      "stage3_max_live_parameters": 1e9,
      "stage3_max_reuse_distance": 1e9,
      "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "steps_per_print": 1e5,
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "wall_clock_breakdown": false
}

Also, to save memory, I loaded the model and reference_model after initializing the TrainingArguments (which automatically set up the deepspeed backend).
(I noticed that the example dpo.py script loads the model before initializing the TrainingArguments, which can run out of memory when the model is very large, e.g. 70B or above.)
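A sketch of that loading order (my illustration, with placeholder paths; assuming a standard transformers + DeepSpeed ZeRO-3 setup): constructing TrainingArguments with a deepspeed config first activates the HF DeepSpeed integration, so the subsequent from_pretrained calls load under zero.Init and shard the weights across ranks instead of materializing the full model on every GPU.

from transformers import AutoModelForCausalLM, TrainingArguments

# Creating TrainingArguments first registers the DeepSpeed (ZeRO-3) config.
training_args = TrainingArguments(
    output_dir="dpo-output",     # placeholder
    deepspeed="ds_config.json",  # the config shown above
    bf16=True,
)

# Loaded *after* TrainingArguments: each rank holds only its own partition.
model = AutoModelForCausalLM.from_pretrained("path/to/base-model")      # placeholder
ref_model = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # placeholder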

@kashif (Collaborator) commented Dec 22, 2023

thanks @Emperorizzis. So one issue is that DeepSpeed ZeRO-3 together with pre-computing the reference log-probs does not work, because the private HF Trainer methods initialize the dataloader before the model, and we currently raise an error:

https://github.com/huggingface/trl/blob/main/trl/trainer/dpo_trainer.py#L358-L36
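(A paraphrase of the guard in question, from memory rather than the verbatim TRL source, shown as a standalone check with illustrative variables:)

zero_stage = 3                   # from the DeepSpeed plugin
precompute_ref_log_probs = True  # DPOTrainer argument

if zero_stage == 3 and precompute_ref_log_probs:
    raise ValueError(
        "precompute_ref_log_probs is not supported with DeepSpeed ZeRO-3; "
        "compute the reference log-probs during training instead."
    )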

@Emperorizzis (Author)

> so one issue is that DeepSpeed ZeRO-3 together with pre-computing the reference log-probs does not work, because the private HF Trainer methods initialize the dataloader before the model, and we currently raise an error

Thanks for replying! So the reference_model does need to be initialized from a different deepspeed config (at the very least, with the superfluous data-related config removed).

@github-actions (bot)

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions bot closed this Jan 28, 2024
@yiranma0 commented Mar 4, 2024

So does this issue remain unsolved in the latest version, 0.7.11? I ran into the same error using the example script dpo.py.
