From 38e8945a818d5db80ab3da3315657fe924d8a32d Mon Sep 17 00:00:00 2001
From: lvliang-intel
Date: Thu, 28 Mar 2024 16:28:52 +0800
Subject: [PATCH] Update ChatQnA readme (#25)

* Update ChatQnA readme

Signed-off-by: lvliang-intel

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: lvliang-intel
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
---
 ChatQnA/README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/ChatQnA/README.md b/ChatQnA/README.md
index 5896a987..99ddc49c 100644
--- a/ChatQnA/README.md
+++ b/ChatQnA/README.md
@@ -1,3 +1,13 @@
+# ChatQnA Application
+
+Chatbots are the most widely adopted use case for leveraging the powerful chat and reasoning capabilities of large language models (LLMs). The retrieval augmented generation (RAG) architecture is quickly becoming the industry standard for developing chatbots because it combines the benefits of a knowledge base (via a vector store) and generative models to reduce hallucinations, maintain up-to-date information, and leverage domain-specific knowledge.
+
+RAG bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that responses generated remain factual and current. At the heart of this architecture are vector databases, instrumental in enabling efficient and semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
+
+The ChatQnA architecture is shown below:
+
+![architecture](https://i.imgur.com/lLOnQio.png)
+
 This ChatQnA use case performs RAG using LangChain, Redis vectordb and Text Generation Inference on Intel Gaudi2. The Intel Gaudi2 accelerator supports both training and inference for deep learning models in particular for LLMs. Please visit [Habana AI products](https://habana.ai/products) for more details.
 
 # Environment Setup