Skip to content

Commit

Permalink
Speculative (#1308)
Browse files Browse the repository at this point in the history
  • Loading branch information
Narsil authored Dec 11, 2023
1 parent 3238c49 commit 9ecfa16
Show file tree
Hide file tree
Showing 34 changed files with 1,349 additions and 282 deletions.
66 changes: 33 additions & 33 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions docs/source/basic_tutorials/launcher.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,14 @@ Options:
- bitsandbytes-nf4: Bitsandbytes 4bit. Can be applied on any model, will cut the memory requirement by 4x, but it is known that the model will be much slower to run than the native f16
- bitsandbytes-fp4: Bitsandbytes 4bit. nf4 should be preferred in most cases but maybe this one has better perplexity performance for you model

```
## SPECULATE
```shell
--speculate <SPECULATE>
The number of input_ids to speculate on If using a medusa model, the heads will be picked up automatically Other wise, it will use n-gram speculation which is relatively free in terms of compute, but the speedup heavily depends on the task

[env: SPECULATE=]

```
## DTYPE
```shell
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
{
"details": {
"best_of_sequences": null,
"finish_reason": "length",
"generated_tokens": 10,
"prefill": [
{
"id": 1,
"logprob": null,
"text": "<s>"
},
{
"id": 338,
"logprob": -10.0078125,
"text": "is"
},
{
"id": 21784,
"logprob": -15.515625,
"text": "Deep"
},
{
"id": 29257,
"logprob": -2.8847656,
"text": "Learning"
},
{
"id": 29973,
"logprob": -4.140625,
"text": "?"
}
],
"seed": 0,
"tokens": [
{
"id": 13,
"logprob": -1.1582031,
"special": false,
"text": "\n"
},
{
"id": 2772,
"logprob": -0.23083496,
"special": false,
"text": "De"
},
{
"id": 1022,
"logprob": 0.0,
"special": false,
"text": "ep"
},
{
"id": 6509,
"logprob": 0.0,
"special": false,
"text": " learning"
},
{
"id": 29892,
"logprob": -0.61816406,
"special": false,
"text": ","
},
{
"id": 607,
"logprob": -0.7089844,
"special": false,
"text": " which"
},
{
"id": 508,
"logprob": -1.7724609,
"special": false,
"text": " can"
},
{
"id": 367,
"logprob": 0.0,
"special": false,
"text": " be"
},
{
"id": 5545,
"logprob": 0.0,
"special": false,
"text": " considered"
},
{
"id": 408,
"logprob": -0.3869629,
"special": false,
"text": " as"
}
]
},
"generated_text": "What is Deep Learning?\nDeep learning, which can be considered as"
}
Loading

0 comments on commit 9ecfa16

Please sign in to comment.