
[doc] Update adapters.md (#2621)
xyang16 authored Dec 4, 2024
1 parent 9f7bd54 commit c38d659
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion serving/docs/adapters.md
@@ -32,7 +32,7 @@ Here are the settings that are available when using LoRA Adapter.
| Item | Environment Variable | LMI Version | Configuration Type | Description | Example value |
|----------------------------------|----------------------------------|-------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------|
| option.enable_lora | OPTION_ENABLE_LORA | \>= 0.27.0 | Pass Through | This config enables support for LoRA adapters. | Default: `false` |
-| option.max_loras | OPTION_MAX_LORAS | \>= 0.27.0 | Pass Through | This config determines the maximum number of LoRA adapters that can be run at once. Allocates GPU memory for those number adapters. | Default: `4` |
+| option.max_loras | OPTION_MAX_LORAS | \>= 0.27.0 | Pass Through | This config determines the maximum number of unique LoRA adapters that can be run in a single batch. | Default: `4` |
| option.max_lora_rank | OPTION_MAX_LORA_RANK | \>= 0.27.0 | Pass Through | This config determines the maximum rank allowed for a LoRA adapter. Set this value to maximum rank of your adapters. Setting a larger value will enable more adapters at a greater memory usage cost. | Default: `16` |
| option.max_cpu_loras | OPTION_MAX_CPU_LORAS | \>= 0.27.0 | Pass Through | Maximum number of LoRAs to store in CPU memory. Must be >= than max_loras. Defaults to max_loras. | Default: `None` |
| option.fully_sharded_loras | OPTION_FULLY_SHARDED_LORAS | \>= 0.31.0 | Pass Through | By default, only half of the LoRA computation is sharded with tensor parallelism. Enabling this will use the fully sharded layers. At high sequence length, max rank or tensor parallel size, this is likely faster. | Default: `true` |
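For context, the settings in the table above are normally supplied through a `serving.properties` file (or the matching `OPTION_*` environment variables listed in the table). The snippet below is a minimal illustrative sketch, not part of this commit: the model id, the vLLM rolling-batch setup, and the specific values are placeholder assumptions.

```
# Illustrative serving.properties sketch (not part of this commit).
# model_id and the values below are placeholder assumptions.
engine=Python
option.model_id=meta-llama/Llama-2-7b-hf
option.rolling_batch=vllm
option.tensor_parallel_degree=1
# LoRA settings documented in the table above; each can also be set
# via its environment-variable form, e.g. OPTION_ENABLE_LORA=true.
option.enable_lora=true
option.max_loras=4
option.max_lora_rank=16
option.max_cpu_loras=8
option.fully_sharded_loras=true
```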
