When `precompute_ref_log_probs=True`, `reference_chosen_logps` and `reference_rejected_logps` are not saved to `self.eval_dataset`. When `ref_model=None`, subsequent evaluations therefore recompute the reference log-probabilities with `self.model`, so eval/acc is always zero (because the policy and the reference are the same model).
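To illustrate why the metric collapses to zero, here is a minimal sketch in plain Python. It assumes the accuracy counts pairs where the chosen reward is strictly greater than the rejected reward, with each reward defined as `beta * (policy_logp - reference_logp)`. The function name `dpo_reward_accuracy` and all the numbers are hypothetical and are not taken from the trainer code.

```python
def dpo_reward_accuracy(policy_chosen, policy_rejected,
                        ref_chosen, ref_rejected, beta=0.1):
    """Fraction of pairs whose chosen reward strictly exceeds the rejected reward.

    Hypothetical helper: rewards are assumed to be beta * (policy - reference).
    """
    hits = 0
    for pc, pr, rc, rr in zip(policy_chosen, policy_rejected,
                              ref_chosen, ref_rejected):
        chosen_reward = beta * (pc - rc)
        rejected_reward = beta * (pr - rr)
        hits += chosen_reward > rejected_reward  # strict comparison
    return hits / len(policy_chosen)


# Made-up per-example log-probabilities from the policy model.
policy_chosen = [-12.0, -8.5, -20.25]
policy_rejected = [-15.0, -9.0, -18.0]

# Case 1: the reference values are recomputed with the same model,
# so every reward is exactly 0.0 and "0.0 > 0.0" is never true.
same_model_acc = dpo_reward_accuracy(
    policy_chosen, policy_rejected, policy_chosen, policy_rejected
)

# Case 2: distinct reference values (as if precomputed once and kept).
ref_chosen = [-13.0, -9.0, -20.0]
ref_rejected = [-14.0, -8.0, -19.0]
distinct_ref_acc = dpo_reward_accuracy(
    policy_chosen, policy_rejected, ref_chosen, ref_rejected
)

print(same_model_acc)    # 0.0
print(distinct_ref_acc)  # 2 of 3 pairs have a larger chosen reward
```

With identical policy and reference values, both rewards are zero for every pair, so the strict comparison never succeeds. This matches the symptom described above.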
trl/trl/trainer/dpo_trainer.py
Line 448 in d708ec2
Perhaps it should be modified like this: