Commit
docs nits
gante committed May 29, 2024
1 parent dd5587b commit c95c07a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/en/llm_optims.md
@@ -106,7 +106,7 @@ tokenizer.batch_decode(outputs, skip_special_tokens=True)
Be mindful that full [`~GenerationMixin.generate`] compilation has severe feature limitations, and is still under development. It can, however, be compiled without graph breaks.


- Taking advantage of `transformers`' modularity, a [`StaticCache`] object can also be passed to the model's forward pass under the same `past_key_values` argument. Using this strategy, you can write your own function to decode the next token given the current token, along with the position and cache position of previously generated tokens.
+ If you want to go down a level further, the [`StaticCache`] object can also be passed to the model's forward pass under the same `past_key_values` argument. Using this strategy, you can write your own function to decode the next token given the current token, along with the position and cache position of previously generated tokens.

```py
from transformers import LlamaTokenizer, LlamaForCausalLM, StaticCache, logging
# ...
```
