Updating launcher + docs.
Narsil committed Dec 4, 2023
1 parent 9af85b6 commit 020145b
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/source/basic_tutorials/launcher.md
@@ -67,6 +67,14 @@ Options:
- bitsandbytes-nf4: Bitsandbytes 4bit. Can be applied on any model, will cut the memory requirement by 4x, but it is known that the model will be much slower to run than the native f16
- bitsandbytes-fp4: Bitsandbytes 4bit. nf4 should be preferred in most cases, but this one may give better perplexity for your model

```
## SPECULATE
```shell
--speculate <SPECULATE>
The number of input_ids to speculate on. If using a medusa model, the heads will be picked up automatically. Otherwise, it will use n-gram speculation, which is relatively free in terms of compute, but the speedup heavily depends on the task

[env: SPECULATE=]

```
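
To make the n-gram fallback more concrete, here is a minimal Python sketch of the general idea: look for an earlier occurrence of the most recent tokens in `input_ids` and propose whatever followed them as draft tokens. This is an illustration only, not the launcher's actual implementation. The function names `propose_ngram_draft` and `count_accepted` and the `ngram_size` parameter are hypothetical and do not appear in the docs above.

```python
from typing import List


def propose_ngram_draft(
    input_ids: List[int], speculate: int, ngram_size: int = 1
) -> List[int]:
    """Hypothetical helper: propose up to `speculate` draft tokens.

    Finds the most recent earlier occurrence of the trailing n-gram and
    copies the tokens that followed it. Returns [] when nothing matches.
    """
    if speculate <= 0 or len(input_ids) <= ngram_size:
        return []
    suffix = input_ids[-ngram_size:]
    # Scan backwards over earlier positions, skipping the suffix itself.
    for start in range(len(input_ids) - ngram_size - 1, -1, -1):
        if input_ids[start:start + ngram_size] == suffix:
            follow_from = start + ngram_size
            return input_ids[follow_from:follow_from + speculate]
    return []


def count_accepted(draft: List[int], verified: List[int]) -> int:
    """Hypothetical helper: count leading draft tokens the model agrees with."""
    accepted = 0
    for drafted, actual in zip(draft, verified):
        if drafted != actual:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    history = [5, 6, 7, 8, 5, 6]
    draft = propose_ngram_draft(history, speculate=2)
    print("draft tokens:", draft)
    # Pretend the model's own next tokens were [7, 9]: only the first matches.
    print("accepted:", count_accepted(draft, [7, 9]))
```

Building the draft is only a list scan, which is why this kind of speculation is cheap. The benefit depends on how often the text repeats itself (for example code or extraction tasks), which matches the note above that the speedup heavily depends on the task.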
## DTYPE
```shell