From 4b8b0b749aca0fac3474cd1d8976bcc5487a7ad1 Mon Sep 17 00:00:00 2001
From: Joao Gante
Date: Mon, 29 Apr 2024 16:38:59 +0000
Subject: [PATCH] update links

---
 docs/source/en/llm_tutorial.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/docs/source/en/llm_tutorial.md b/docs/source/en/llm_tutorial.md
index 8d6372e129cc47..ae0c42f4848ef0 100644
--- a/docs/source/en/llm_tutorial.md
+++ b/docs/source/en/llm_tutorial.md
@@ -247,10 +247,11 @@ While the autoregressive generation process is relatively straightforward, makin
 
 ### Advanced generate usage
 
-1. [Guide](generation_strategies) on how to control different generation methods, how to set up the generation configuration file, and how to stream the output;
-2. [Guide](chat_templating) on the prompt template for chat LLMs;
-3. [Guide](tasks/prompting) on to get the most of prompt design;
-4. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!
+1. Guide on how to [control different generation methods](generation_strategies), how to set up the generation configuration file, and how to stream the output;
+2. [Accelerating text generation](llm_optims);
+3. [Prompt templates for chat LLMs](chat_templating);
+4. [Prompt design guide](tasks/prompting);
+5. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!
 
 ### LLM leaderboards
 
@@ -259,10 +260,12 @@ While the autoregressive generation process is relatively straightforward, makin
 
 ### Latency, throughput and memory utilization
 
-1. [Guide](llm_tutorial_optimization) on how to optimize LLMs for speed and memory;
-2. [Guide](main_classes/quantization) on quantization such as bitsandbytes and autogptq, which shows you how to drastically reduce your memory requirements.
+1. Guide on how to [optimize LLMs for speed and memory](llm_tutorial_optimization);
+2. Guide on [quantization](main_classes/quantization) with libraries such as bitsandbytes and autogptq, which shows you how to drastically reduce your memory requirements.
 
 ### Related libraries
 
-1. [`text-generation-inference`](https://github.com/huggingface/text-generation-inference), a production-ready server for LLMs;
-2. [`optimum`](https://github.com/huggingface/optimum), an extension of 🤗 Transformers that optimizes for specific hardware devices.
+1. [`optimum`](https://github.com/huggingface/optimum), an extension of 🤗 Transformers that optimizes for specific hardware devices;
+2. [`outlines`](https://github.com/outlines-dev/outlines), a library where you can constrain text generation (e.g. to generate JSON files);
+3. [`text-generation-inference`](https://github.com/huggingface/text-generation-inference), a production-ready server for LLMs;
+4. [`text-generation-webui`](https://github.com/oobabooga/text-generation-webui), a UI for text generation.