chat cli (#1431)
* first draft

* move chat to cli

* fix makefile

* make script less verbose

* fix parsing

* fix style

* add more examples

* fix setup.py

* add copyright

* fix verbose init

* attribute FastChat

* add docs
lvwerra authored Mar 19, 2024
1 parent eb2d5b2 commit 4e622a9
Showing 7 changed files with 456 additions and 17 deletions.
1 change: 1 addition & 0 deletions Makefile
@@ -9,6 +9,7 @@ COMMAND_FILES_PATH = `pwd`/commands
dev:
[ -L "$(pwd)/trl/commands/scripts" ] && unlink "$(pwd)/trl/commands/scripts" || true
pip install -e ".[dev]"
ln -s `pwd`/examples/scripts/ `pwd`/trl/commands

test:
python -m pytest -n auto --dist=loadfile -s -v ./tests/
36 changes: 29 additions & 7 deletions docs/source/clis.mdx
@@ -1,17 +1,18 @@
# Command Line Interfaces (CLIs)

You can use TRL to fine-tune your Language Model on Supervised Fine-Tuning (SFT) or Direct Policy Optimization (DPO) using the TRL CLIs.
You can use TRL to fine-tune your Language Model with Supervised Fine-Tuning (SFT) or Direct Preference Optimization (DPO), or even chat with your model, using the TRL CLIs.

Currently supported CLIs are:

- `trl sft`
- `trl dpo`
- `trl sft`: fine-tune an LLM on a text/instruction dataset
- `trl dpo`: fine-tune an LLM with DPO on a preference dataset
- `trl chat`: quickly spin up an LLM fine-tuned for chatting

## Get started
## Fine-tuning with the CLI

Before getting started, pick a language model from the Hugging Face Hub. Supported models can be found with the "text-generation" filter on the models page. Also make sure to pick a relevant dataset for your task.

Also make sure to run:
Before using the `sft` or `dpo` commands, make sure to run:
```bash
accelerate config
```
@@ -42,7 +43,7 @@ trl sft --config example_config.yaml --output_dir test-trl-cli --lr_scheduler_ty

This will force `lr_scheduler_type` to `cosine_with_restarts`.
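The precedence rule can be sketched with a toy shell snippet (the variable names here are illustrative, not TRL internals): a value passed on the command line wins over the one coming from the YAML config.

```shell
# toy illustration of the override rule; these variables are hypothetical
CONFIG_LR_SCHEDULER="linear"             # value from example_config.yaml
CLI_LR_SCHEDULER="cosine_with_restarts"  # value from the command line
# prefer the CLI value when it is set, otherwise fall back to the config
LR_SCHEDULER="${CLI_LR_SCHEDULER:-$CONFIG_LR_SCHEDULER}"
echo "$LR_SCHEDULER"
```

Running this prints `cosine_with_restarts`, since the CLI value is set.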

## Supported Arguments
### Supported Arguments

We support all arguments from `transformers.TrainingArguments`; for loading your model, we support all arguments from `~trl.ModelConfig`:

@@ -84,4 +85,25 @@ Once your dataset has been pushed, run the dpo CLI as follows:
trl dpo --config config.yaml --output_dir your-output-dir
```

The DPO CLI is based on the `examples/scripts/dpo.py` script.

## Chat interface

The chat CLI lets you quickly load a model and talk to it. Simply run the following:

```bash
trl chat --model Qwen/Qwen1.5-0.5B-Chat
```

Note that the chat interface relies on the chat template of the tokenizer to format the inputs for the model. Make sure your tokenizer has a chat template defined.

Besides talking to the model, there are a few commands you can use:

- **clear**: clears the current conversation and starts a new one
- **example {NAME}**: loads the example named `{NAME}` from the config and uses it as the user input
- **set {SETTING_NAME}={SETTING_VALUE};**: changes the system prompt or generation settings (multiple settings are separated by a ';')
- **reset**: same as **clear**, but also resets the generation config to its defaults if it has been changed with **set**
- **save {SAVE_NAME} (optional)**: saves the current chat and settings to a file, by default to `./chat_history/{MODEL_NAME}/chat_{DATETIME}.yaml`, or to `{SAVE_NAME}` if provided
- **exit**: closes the interface
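For reference, the default save location described above expands to a path like the one built in this sketch (the model name and the exact timestamp format are illustrative assumptions, not taken from the implementation):

```shell
# sketch of the default save path layout; MODEL_NAME and the
# timestamp format are illustrative assumptions
MODEL_NAME="Qwen1.5-0.5B-Chat"
DATETIME=$(date +%Y-%m-%d_%H-%M-%S)
SAVE_PATH="./chat_history/${MODEL_NAME}/chat_${DATETIME}.yaml"
mkdir -p "$(dirname "$SAVE_PATH")"   # the CLI needs the directory to exist
echo "$SAVE_PATH"
```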

The default examples are defined in `examples/scripts/config/default_chat_config.yaml`, but you can pass your own with `--config CONFIG_FILE`, where you can also specify the default generation parameters.
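As an illustration only, a custom chat config might look like the following sketch; the field names here are guesses, so check `examples/scripts/config/default_chat_config.yaml` for the actual schema:

```yaml
# hypothetical schema -- verify against default_chat_config.yaml
examples:
  summary:
    text: "Summarize the plot of Hamlet in two sentences."
generate:
  max_new_tokens: 256
  temperature: 0.7
```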
