chat cli (#1431)
* first draft

* move chat to cli

* fix makefile

* make script less verbose

* fix parsing

* fix style

* add more examples

* fix setup.py

* add copyright

* fix verbose init

* attribute FastChat

* add docs
lvwerra authored Mar 19, 2024
1 parent eb2d5b2 commit 4e622a9
Showing 7 changed files with 456 additions and 17 deletions.
1 change: 1 addition & 0 deletions Makefile
@@ -9,6 +9,7 @@ COMMAND_FILES_PATH = `pwd`/commands
dev:
[ -L "$(pwd)/trl/commands/scripts" ] && unlink "$(pwd)/trl/commands/scripts" || true
pip install -e ".[dev]"
ln -s `pwd`/examples/scripts/ `pwd`/trl/commands

test:
python -m pytest -n auto --dist=loadfile -s -v ./tests/
36 changes: 29 additions & 7 deletions docs/source/clis.mdx
@@ -1,17 +1,18 @@
# Command Line Interfaces (CLIs)

You can use TRL to fine-tune your Language Model on Supervised Fine-Tuning (SFT) or Direct Policy Optimization (DPO) using the TRL CLIs.
You can use TRL to fine-tune your Language Model with Supervised Fine-Tuning (SFT) or Direct Preference Optimization (DPO), or even chat with your model, using the TRL CLIs.

Currently supported CLIs are:

- `trl sft`
- `trl dpo`
- `trl sft`: fine-tune an LLM on a text/instruction dataset
- `trl dpo`: fine-tune an LLM with DPO on a preference dataset
- `trl chat`: quickly spin up an LLM fine-tuned for chatting

## Get started
## Fine-tuning with the CLI

Before getting started, pick a language model from the Hugging Face Hub. Supported models can be found with the "text-generation" filter on the models page. Also make sure to pick a relevant dataset for your task.

Also make sure to run:
Before using the `sft` or `dpo` commands, make sure to run:
```bash
accelerate config
```
@@ -42,7 +43,7 @@ trl sft --config example_config.yaml --output_dir test-trl-cli --lr_scheduler_ty

This will force `lr_scheduler_type` to `cosine_with_restarts`.
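The precedence rule can be sketched with a toy shell snippet (the variable names here are illustrative, not TRL internals): a value passed on the command line wins over the one coming from the YAML config.

```shell
# toy illustration of the override rule; these variables are hypothetical
CONFIG_LR_SCHEDULER="linear"             # value from example_config.yaml
CLI_LR_SCHEDULER="cosine_with_restarts"  # value from the command line
# prefer the CLI value when it is set, otherwise fall back to the config
LR_SCHEDULER="${CLI_LR_SCHEDULER:-$CONFIG_LR_SCHEDULER}"
echo "$LR_SCHEDULER"
```

Running this prints `cosine_with_restarts`, since the CLI value is set.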

## Supported Arguments
### Supported Arguments

We support all arguments from `transformers.TrainingArguments`; for loading your model, we support all arguments from `~trl.ModelConfig`:

@@ -84,4 +85,25 @@ Once your dataset has been pushed, run the dpo CLI as follows:
trl dpo --config config.yaml --output_dir your-output-dir
```

The DPO CLI is based on the `examples/scripts/dpo.py` script.

## Chat interface

The chat CLI lets you quickly load a model and talk to it. Simply run the following:

```bash
trl chat --model Qwen/Qwen1.5-0.5B-Chat
```

Note that the chat interface relies on the chat template of the tokenizer to format the inputs for the model. Make sure your tokenizer has a chat template defined.

Besides talking to the model, there are a few commands you can use:

- **clear**: clears the current conversation and starts a new one
- **example {NAME}**: loads the example named `{NAME}` from the config and uses it as the user input
- **set {SETTING_NAME}={SETTING_VALUE};**: changes the system prompt or generation settings (multiple settings are separated by a ';')
- **reset**: same as **clear**, but also resets the generation config to its defaults if it has been changed with **set**
- **save {SAVE_NAME} (optional)**: saves the current chat and settings to a file, by default to `./chat_history/{MODEL_NAME}/chat_{DATETIME}.yaml`, or to `{SAVE_NAME}` if provided
- **exit**: closes the interface
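For reference, the default save location described above expands to a path like the one built in this sketch (the model name and the exact timestamp format are illustrative assumptions, not taken from the implementation):

```shell
# sketch of the default save path layout; MODEL_NAME and the
# timestamp format are illustrative assumptions
MODEL_NAME="Qwen1.5-0.5B-Chat"
DATETIME=$(date +%Y-%m-%d_%H-%M-%S)
SAVE_PATH="./chat_history/${MODEL_NAME}/chat_${DATETIME}.yaml"
mkdir -p "$(dirname "$SAVE_PATH")"   # the CLI needs the directory to exist
echo "$SAVE_PATH"
```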

The default examples are defined in `examples/scripts/config/default_chat_config.yaml`, but you can pass your own with `--config CONFIG_FILE`, where you can also specify the default generation parameters.
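As an illustration only, a custom chat config might look like the following sketch; the field names here are guesses, so check `examples/scripts/config/default_chat_config.yaml` for the actual schema:

```yaml
# hypothetical schema -- verify against default_chat_config.yaml
examples:
  summary:
    text: "Summarize the plot of Hamlet in two sentences."
generate:
  max_new_tokens: 256
  temperature: 0.7
```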
