Commit: fix docs

Jintao-Huang committed Aug 5, 2024
1 parent 2ea9b8e commit 341c067
Showing 34 changed files with 4 additions and 35 deletions.
2 changes: 1 addition & 1 deletion docs/source/LLM/命令行参数.md
@@ -62,7 +62,7 @@
- `--bnb_4bit_quant_type`: Quantization type used for 4-bit quantization, default `'nf4'`. Options: 'nf4', 'fp4'. Has no effect when quantization_bit is 0.
- `--bnb_4bit_use_double_quant`: Whether to enable double quantization for 4-bit quantization, default `True`. Has no effect when quantization_bit is 0.
- `--bnb_4bit_quant_storage`: Default `None`. Storage type for the quantized parameters. Has no effect when quantization_bit is 0.
-- `--target_modules`: Specifies the LoRA modules, default `['DEFAULT']`. If `'DEFAULT'` or `'AUTO'` is passed, `target_modules` is looked up in `MODEL_MAPPING` based on `model_type` (defaults to qkv). If `'ALL'` is passed, all Linear layers (excluding the head) are used as LoRA modules. If `'EMBEDDING'` is passed, the Embedding layer is used as a LoRA module. If memory allows, 'ALL' is recommended. You can also set `['ALL', 'EMBEDDING']` to target all Linear and Embedding layers. This argument takes effect when using lora/vera/boft/ia3/adalora/fourierft.
+- `--target_modules`: Specifies the LoRA modules, default `['DEFAULT']`. If `'DEFAULT'` or `'AUTO'` is passed, `target_modules` is looked up in `MODEL_MAPPING` based on `model_type` (for LLMs the default is qkv; for MLLMs, all Linear layers in the llm and projector). If `'ALL'` is passed, all Linear layers (excluding the head) are used as LoRA modules. If `'EMBEDDING'` is passed, the Embedding layer is used as a LoRA module. If memory allows, 'ALL' is recommended. You can also set `['ALL', 'EMBEDDING']` to target all Linear and Embedding layers. This argument takes effect when using lora/vera/boft/ia3/adalora/fourierft.
- `--target_regex`: Regex for selecting the LoRA modules, of type `Optional[str]`. Default `None`; if set, `target_modules` has no effect. This argument takes effect when using lora/vera/boft/ia3/adalora/fourierft.
- `--lora_rank`: Default `8`. Only takes effect when `sft_type` is 'lora'.
- `--lora_alpha`: Default `32`. Only takes effect when `sft_type` is 'lora'.
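
To make the `target_modules` values concrete, here is a minimal, hypothetical invocation; the model and dataset names are illustrative placeholders, not taken from this commit:

```shell
# Sketch: LoRA over all Linear layers plus the Embedding layer.
# model_type/dataset are placeholders; substitute your own.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type qwen-7b-chat \
    --dataset alpaca-zh \
    --sft_type lora \
    --target_modules ALL EMBEDDING
```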
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/cogvlm2-video最佳实践.md
@@ -104,7 +104,6 @@ response: The video shows a person lighting a fire in a backyard setting. The pe
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied to the qkv of the LLM. If you want to fine-tune all linear layers, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 40GB GPU memory
...
```
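
The removed note above described a one-flag change; a hedged sketch of what it referred to (model and dataset names are illustrative placeholders):

```shell
# Sketch: extend LoRA from the default qkv to all Linear layers.
# model_type/dataset are placeholders, not taken from this commit.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type cogvlm2-video-13b-chat \
    --dataset video-chatgpt \
    --sft_type lora \
    --lora_target_modules ALL
```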
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/cogvlm2最佳实践.md
@@ -174,7 +174,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied to the qkv of the language and vision models. If you want to fine-tune all linear layers, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 70GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/cogvlm最佳实践.md
@@ -136,7 +136,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied to the qkv of the language and vision models. If you want to fine-tune all linear layers, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 50GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/deepseek-vl最佳实践.md
@@ -165,7 +165,6 @@ road:

LoRA fine-tuning:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A10, 3090, V100
# 20GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/glm4v最佳实践.md
@@ -161,7 +161,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied to the qkv of the language and vision models. If you want to fine-tune all linear layers, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 40GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/internlm-xcomposer2最佳实践.md
@@ -135,7 +135,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. `--lora_target_modules ALL` is not supported. Full-parameter fine-tuning is supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 21GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/internvl最佳实践.md
@@ -296,7 +296,6 @@ road:
LoRA fine-tuning:

**Note**
-- By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`.
- If your GPU does not support flash attention, use the argument `--use_flash_attn false`.

```shell
...
```
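
A hedged sketch combining the two notes above (model and dataset names are illustrative placeholders):

```shell
# Sketch: LoRA over all Linear layers with flash attention disabled.
# model_type/dataset are placeholders, not taken from this commit.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type internvl-chat-v1_5 \
    --dataset coco-en-2-mini \
    --sft_type lora \
    --lora_target_modules ALL \
    --use_flash_attn false
```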
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/llava-video最佳实践.md
@@ -105,7 +105,6 @@ response: In this image, there are four sheep.
LoRA fine-tuning:
-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`.)
```shell
# Experimental environment: A10, 3090, V100...
# 21GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/llava最佳实践.md
@@ -195,7 +195,6 @@ road:

LoRA fine-tuning:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`.)
```shell
# Experimental environment: A10, 3090, V100...
# 21GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/minicpm-v-2.5最佳实践.md
@@ -158,7 +158,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: 3090
# 20GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/minicpm-v-2最佳实践.md
@@ -135,7 +135,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 10GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/minicpm-v最佳实践.md
@@ -139,7 +139,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 10GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/mplug-owl2最佳实践.md
@@ -138,7 +138,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100...
# 24GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/phi3-vision最佳实践.md
@@ -151,7 +151,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 16GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/qwen-audio最佳实践.md
@@ -99,7 +99,6 @@ history: [['Audio 1:<audio>http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/i

LoRA fine-tuning:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the audio model, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A10, 3090, V100...
# 22GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/qwen-vl最佳实践.md
@@ -141,7 +141,6 @@ road:

LoRA fine-tuning:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: 3090
# 23GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source/Multi-Modal/yi-vl最佳实践.md
@@ -156,7 +156,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, LoRA fine-tuning is applied only to the qkv of the LLM. If you want to fine-tune all linear layers, including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100...
# 19GB GPU memory
...
```
4 changes: 2 additions & 2 deletions docs/source_en/LLM/Command-line-parameters.md
@@ -63,7 +63,7 @@
- `--bnb_4bit_quant_type`: Quantization method for 4-bit quantization, default is `'nf4'`. Options: 'nf4', 'fp4'. Has no effect when quantization_bit is 0.
- `--bnb_4bit_use_double_quant`: Whether to enable double quantization for 4-bit quantization, default is `True`. Has no effect when quantization_bit is 0.
- `--bnb_4bit_quant_storage`: Default value `None`. This sets the storage type to pack the quantized 4-bit params. Has no effect when quantization_bit is 0.
-- `--target_modules`: Specify lora modules, default is `['DEFAULT']`. If target_modules is passed `'DEFAULT'` or `'AUTO'`, look up `target_modules` in `MODEL_MAPPING` based on `model_type` (default specifies qkv). If passed `'ALL'`, all Linear layers (excluding head) will be specified as lora modules. If passed `'EMBEDDING'`, the Embedding layer will be specified as a lora module. If memory allows, setting it to 'ALL' is recommended. You can also set `['ALL', 'EMBEDDING']` to specify all Linear and embedding layers as lora modules. This argument takes effect when sft_type is one of lora/vera/boft/ia3/adalora/fourierft.
+- `--target_modules`: Specify lora modules, default is `['DEFAULT']`. If target_modules is passed `'DEFAULT'` or `'AUTO'`, look up `target_modules` in `MODEL_MAPPING` based on `model_type` (LLMs default to qkv, while MLLMs default to all Linear layers in the llm and projector). If passed `'ALL'`, all Linear layers (excluding head) will be specified as lora modules. If passed `'EMBEDDING'`, the Embedding layer will be specified as a lora module. If memory allows, setting it to 'ALL' is recommended. You can also set `['ALL', 'EMBEDDING']` to specify all Linear and embedding layers as lora modules. This argument takes effect when sft_type is one of lora/vera/boft/ia3/adalora/fourierft.
- `--target_regex`: The lora target regex, of type `Optional[str]`. Default is `None`. If this argument is specified, `target_modules` has no effect. This argument takes effect when sft_type is one of lora/vera/boft/ia3/adalora/fourierft.
- `--lora_rank`: Default is `8`. Only takes effect when `sft_type` is 'lora'.
- `--lora_alpha`: Default is `32`. Only takes effect when `sft_type` is 'lora'.
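
To illustrate `--target_regex`, a minimal sketch; the regex assumes LLaMA-style module names, and the model and dataset are illustrative placeholders:

```shell
# Sketch: select LoRA modules by regex instead of target_modules.
# The pattern and model_type are assumptions, not from this commit.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type llama3-8b-instruct \
    --dataset alpaca-en \
    --sft_type lora \
    --target_regex '.*(q_proj|k_proj|v_proj)$'
```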
@@ -250,7 +250,7 @@ The following parameters take effect when `sft_type` is set to `ia3`.
PT parameters inherit from the SFT parameters with some modifications to the default values:

- `--sft_type`: Default value is `'full'`.
-- `--lora_target_modules`: Default value is `'ALL'`.
+- `--target_modules`: Default value is `'ALL'`.
- `--lazy_tokenize`: Default value is `True`.
- `--eval_steps`: Default value is `500`.

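
Given these defaults, a minimal pre-training invocation might look like the sketch below; since `sft_type` defaults to 'full' and `target_modules` to 'ALL', neither needs to be passed (model and dataset are placeholders):

```shell
# Sketch: swift pt with its inherited defaults left in place.
# model_type/dataset are placeholders, not taken from this commit.
CUDA_VISIBLE_DEVICES=0 swift pt \
    --model_type qwen-7b \
    --dataset alpaca-zh
```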
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/cogvlm-best-practice.md
@@ -125,7 +125,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, lora fine-tuning is performed on the qkv of the language and vision models. If you want to fine-tune all linears, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 50GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/cogvlm2-best-practice.md
@@ -156,7 +156,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, lora fine-tuning is performed on the qkv of the language and vision models. If you want to fine-tune all linears, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 70GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/cogvlm2-video-best-practice.md
@@ -103,7 +103,6 @@ response: The video shows a person lighting a fire in a backyard setting. The pe
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, lora fine-tuning is performed on the qkv of the LLM. If you want to fine-tune all linears, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 40GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/deepseek-vl-best-practice.md
@@ -158,7 +158,6 @@ Multi-modal large model fine-tuning usually uses **custom datasets**. Here is a

LoRA fine-tuning:

-(By default, lora fine-tuning is performed only on the qkv part of the LLM. If you want to fine-tune all linear parts including the vision model, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A10, 3090, V100
# 20GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/glm4v-best-practice.md
@@ -152,7 +152,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, lora fine-tuning is performed on the qkv of the language and vision models. If you want to fine-tune all linears, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A100
# 40GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/internlm-xcomposer2-best-practice.md
@@ -133,7 +133,6 @@ road:
## Fine-tuning
Fine-tuning of multimodal large models usually uses **custom datasets**. Here's a demo that can be run directly:

-(By default, only the qkv part of the LLM is fine-tuned using LoRA. `--lora_target_modules ALL` is not supported. Full-parameter fine-tuning is supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 21GB GPU memory
...
```
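
Since `--lora_target_modules ALL` is not supported for this model, full-parameter fine-tuning is the stated alternative; a hedged sketch (model and dataset names are illustrative placeholders):

```shell
# Sketch: full-parameter fine-tuning instead of LoRA over all Linears.
# model_type/dataset are placeholders, not taken from this commit.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type internlm-xcomposer2-7b-chat \
    --dataset coco-en-2-mini \
    --sft_type full
```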
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/internvl-best-practice.md
@@ -262,7 +262,6 @@ LoRA fine-tuning:

**Note**
- If your GPU does not support flash attention, use the argument `--use_flash_attn false`.
-- By default, only the qkv of the LLM part is fine-tuned using LoRA. If you want to fine-tune all linear layers including the vision model part, you can specify `--lora_target_modules ALL`.

```shell
# Experimental environment: A100
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/llava-best-practice.md
@@ -186,7 +186,6 @@ Multimodal large model fine-tuning usually uses **custom datasets** for fine-tun

LoRA fine-tuning:

-(By default, only the qkv of the LLM part is fine-tuned using LoRA. If you want to fine-tune all linear layers including the vision model part, you can specify `--lora_target_modules ALL`.)
```shell
# Experimental environment: A10, 3090, V100...
# 21GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/llava-video-best-practice.md
@@ -103,7 +103,6 @@ Multimodal large model fine-tuning usually uses **custom datasets** for fine-tun

LoRA fine-tuning:

-(By default, only the qkv of the LLM part is fine-tuned using LoRA. If you want to fine-tune all linear layers including the vision model part, you can specify `--lora_target_modules ALL`.)
```shell
# Experimental environment: A10, 3090, V100...
# 21GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/minicpm-v-best-practice.md
@@ -127,7 +127,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, only the qkv part of the LLM is fine-tuned using LoRA. If you want to fine-tune all linear parts including the vision model, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 10GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/phi3-vision-best-practice.md
@@ -145,7 +145,6 @@ road:
## Fine-tuning
Multimodal large model fine-tuning usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, only the qkv of the LLM part is fine-tuned with LoRA. If you want to fine-tune all linear modules including the vision model part, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100, ...
# 16GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/qwen-audio-best-practice.md
@@ -98,7 +98,6 @@ Multimodal large model fine-tuning usually uses **custom datasets** for fine-tun

LoRA fine-tuning:

-(By default, only the qkv of the LLM part is lora fine-tuned. If you want to fine-tune all linear layers including the audio model part, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: A10, 3090, V100...
# 22GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/qwen-vl-best-practice.md
@@ -130,7 +130,6 @@ Multimodal large model fine-tuning usually uses **custom datasets**. Here is a d

LoRA fine-tuning:

-(By default, only the qkv part of the LLM is lora fine-tuned. If you want to fine-tune all linear modules including the vision model, you can specify `--lora_target_modules ALL`)
```shell
# Experimental environment: 3090
# 23GB GPU memory
...
```
1 change: 0 additions & 1 deletion docs/source_en/Multi-Modal/yi-vl-best-practice.md
@@ -141,7 +141,6 @@ road:
## Fine-tuning
Fine-tuning multimodal large models usually uses **custom datasets**. Here is a demo that can be run directly:

-(By default, only the qkv of the LLM part is lora fine-tuned. If you want to fine-tune all linears including the vision model part, you can specify `--lora_target_modules ALL`. Full-parameter fine-tuning is also supported.)
```shell
# Experimental environment: A10, 3090, V100...
# 19GB GPU memory
...
```
2 changes: 1 addition & 1 deletion swift/llm/utils/argument.py
@@ -1582,7 +1582,7 @@ def __post_init__(self):
@dataclass
class PtArguments(SftArguments):
    sft_type: Literal['lora', 'full', 'longlora', 'adalora', 'ia3', 'llamapro', 'vera', 'boft'] = 'full'
-    lora_target_modules: List[str] = field(default_factory=lambda: ['ALL'])
+    target_modules: List[str] = field(default_factory=lambda: ['ALL'])
    lazy_tokenize: Optional[bool] = True
    eval_steps: int = 500

