Commit

add missing classes (#1479)
anakin87 authored Mar 24, 2024
1 parent 9ce7ac6 commit dc6a934
Showing 1 changed file with 41 additions and 16 deletions.
57 changes: 41 additions & 16 deletions docs/source/trainer.mdx
@@ -4,6 +4,47 @@ At TRL we support PPO (Proximal Policy Optimisation) with an implementation that
The Trainer and model classes are largely inspired from `transformers.Trainer` and `transformers.AutoModel` classes and adapted for RL.
We also support a `RewardTrainer` that can be used to train a reward model.

+
+## CPOConfig
+
+[[autodoc]] CPOConfig
+
+## CPOTrainer
+
+[[autodoc]] CPOTrainer
+
+## DDPOConfig
+
+[[autodoc]] DDPOConfig
+
+## DDPOTrainer
+
+[[autodoc]] DDPOTrainer
+
+## DPOTrainer
+
+[[autodoc]] DPOTrainer
+
+## IterativeSFTTrainer
+
+[[autodoc]] IterativeSFTTrainer
+
+## KTOConfig
+
+[[autodoc]] KTOConfig
+
+## KTOTrainer
+
+[[autodoc]] KTOTrainer
+
+## ORPOConfig
+
+[[autodoc]] ORPOConfig
+
+## ORPOTrainer
+
+[[autodoc]] ORPOTrainer
+
## PPOConfig

[[autodoc]] PPOConfig
@@ -24,22 +65,6 @@ We also support a `RewardTrainer` that can be used to train a reward model.

[[autodoc]] SFTTrainer

-## DPOTrainer
-
-[[autodoc]] DPOTrainer
-
-## DDPOConfig
-
-[[autodoc]] DDPOConfig
-
-## DDPOTrainer
-
-[[autodoc]] DDPOTrainer
-
-## IterativeSFTTrainer
-
-[[autodoc]] IterativeSFTTrainer
-
## set_seed

[[autodoc]] set_seed
