Direct Preference Optimization #504
Comments
@Reichenbachian I think a version of DPO is currently under review in the TRL library, if you want to check it out.
I wonder if there are any updates on implementing DPO features in trlx. Many thanks!
There haven't been any updates on that. AFAIK nobody is currently working on it, so feel free to pick it up if you want!
Hi, is this something that is still open to work on? I would like to pick it up if that is okay :) @CSerxy I've just forked and begun work on this feature, so let me know if this conflicts with your work.
🚀 The feature, motivation, and pitch
Hey all! Appreciate the work.
Is there any word on whether DPO (Direct Preference Optimization) will be integrated into the trlx library soon?
Alternatives
No response
Additional context
No response