A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Its goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation.
The tags of the Docker image match the releases of the official repository, e.g. "v1.3.1" or "v1.4".
- 3 interface modes: default (two columns), notebook, and chat
- Multiple model backends: transformers, llama.cpp, ExLlama, ExLlamaV2, AutoGPTQ, GPTQ-for-LLaMa, CTransformers, AutoAWQ
- Dropdown menu for quickly switching between different models
- LoRA: load and unload LoRAs on the fly, train a new LoRA using QLoRA
- Precise instruction templates for chat mode, including Llama-2-chat, Alpaca, Vicuna, WizardLM, StableLM, and many others
- 4-bit, 8-bit, and CPU inference through the transformers library
- Use llama.cpp models with transformers samplers (`llamacpp_HF` loader)
- Multimodal pipelines, including LLaVA and MiniGPT-4
- Extensions framework
- Custom chat characters
- Very efficient text streaming
- Markdown output with LaTeX rendering, to use for instance with GALACTICA
- OpenAI-compatible API server
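As a sketch of how the OpenAI-compatible API server from the list above can be queried once a container is running (the port 5000 and the endpoint path follow upstream defaults, but verify them against your deployment):

```shell
# Query the OpenAI-compatible chat completions endpoint.
# Host, port, and endpoint path are assumptions based on upstream defaults.
curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64
  }'
```

Because the API mimics the OpenAI schema, existing OpenAI client libraries can usually be pointed at this endpoint by overriding their base URL.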
To learn how to use the various features, check out the Documentation: https://github.com/oobabooga/text-generation-webui/wiki
- oobabooga/text-generation-webui
- Prebuilt Docker images by zjuuu
To build the image yourself, clone the original repository (oobabooga/text-generation-webui), copy the Dockerfile into it, and run `docker build -t yourimagename .`
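The build steps above can be sketched as the following commands (the image name, port mapping, and run flags are illustrative assumptions; 7860 is Gradio's default port):

```shell
# Clone the upstream repository
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui

# Copy your Dockerfile into this directory first, then build and tag the image
# (the tag "yourimagename" is a placeholder)
docker build -t yourimagename .

# Run the container, exposing the default Gradio UI port (assumed to be 7860)
docker run -it --rm -p 7860:7860 yourimagename
```

GPU inference typically also requires the NVIDIA Container Toolkit and a `--gpus all` flag on `docker run`; consult the upstream documentation for the exact setup.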