Merge branch 'opea-project:main' into opea-aws-bedrock

opea-project · Dec 16, 2024 · 866d767
2 parents 929e6a9 + c955e5e
commit 866d767

Showing 63 changed files with 567 additions and 3,957 deletions.
diff --git a/README.md b/README.md
@@ -125,20 +125,6 @@ class ExampleService:
         self.megaservice.flow_to(embedding, llm)
 ```
 
-## Gateway
-
-The `Gateway` serves as the interface for users to access the `Megaservice`, providing customized access based on user requirements. It acts as the entry point for incoming requests, routing them to the appropriate `Microservices` within the `Megaservice` architecture.
-
-`Gateways` support API definition, API versioning, rate limiting, and request transformation, allowing for fine-grained control over how users interact with the underlying `Microservices`. By abstracting the complexity of the underlying infrastructure, `Gateways` provide a seamless and user-friendly experience for interacting with the `Megaservice`.
-
-For example, the `Gateway` for `ChatQnA` can be built like this:
-
-```python
-from comps import ChatQnAGateway
-
-self.gateway = ChatQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)
-```
-
 ## Contributing to OPEA
 
 Welcome to the OPEA open-source community! We are thrilled to have you here and excited about the potential contributions you can bring to the OPEA platform. Whether you are fixing bugs, adding new GenAI components, improving documentation, or sharing your unique use cases, your contributions are invaluable.

diff --git a/comps/__init__.py b/comps/__init__.py
@@ -47,23 +47,6 @@
 from comps.cores.mega.orchestrator import ServiceOrchestrator
 from comps.cores.mega.orchestrator_with_yaml import ServiceOrchestratorWithYaml
 from comps.cores.mega.micro_service import MicroService, register_microservice, opea_microservices
-from comps.cores.mega.gateway import (
-    Gateway,
-    ChatQnAGateway,
-    CodeGenGateway,
-    CodeTransGateway,
-    DocSumGateway,
-    TranslationGateway,
-    SearchQnAGateway,
-    AudioQnAGateway,
-    RetrievalToolGateway,
-    FaqGenGateway,
-    VideoQnAGateway,
-    VisualQnAGateway,
-    MultimodalQnAGateway,
-    GraphragGateway,
-    AvatarChatbotGateway,
-)
 
 # Telemetry
 from comps.cores.telemetry.opea_telemetry import opea_telemetry

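Reviewer note: the hunk above drops the gateway re-exports from `comps/__init__.py`, so downstream code written as `from comps import ChatQnAGateway` should be expected to fail with `ImportError` on the merged tree. Whether the classes still exist under `comps.cores.mega.gateway` is not visible in this excerpt. The following is a minimal sketch of a defensive lookup; the helper name `optional_attr` is hypothetical and not part of `comps`.

```python
import importlib
from typing import Any, Optional


def optional_attr(module_name: str, attr: str) -> Optional[Any]:
    """Return `module_name.attr` if the module imports and has it, else None.

    Hypothetical helper for illustration; not provided by `comps`.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is missing entirely (or one of its own imports failed).
        return None
    return getattr(module, attr, None)


if __name__ == "__main__":
    # Assumption: on the merged tree `comps` no longer exports this name,
    # so the lookup returns None instead of raising at import time.
    gateway_cls = optional_attr("comps", "ChatQnAGateway")
    if gateway_cls is None:
        print("ChatQnAGateway is not exported by comps; skipping gateway setup")
```

A lookup like this only postpones the decision; callers still need a replacement for whatever the gateway did in their service.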
diff --git a/comps/agent/langchain/README.md b/comps/agent/langchain/README.md
@@ -11,11 +11,10 @@ We currently support the following types of agents:
 1. ReAct: use `react_langchain` or `react_langgraph` or `react_llama` as strategy. First introduced in this seminal [paper](https://arxiv.org/abs/2210.03629). The ReAct agent engages in "reason-act-observe" cycles to solve problems. Please refer to this [doc](https://python.langchain.com/v0.2/docs/how_to/migrate_agent/) to understand the differences between the langchain and langgraph versions of react agents. See table below to understand the validated LLMs for each react strategy.
 2. RAG agent: use `rag_agent` or `rag_agent_llama` strategy. This agent is specifically designed for improving RAG performance. It has the capability to rephrase query, check relevancy of retrieved context, and iterate if context is not relevant. See table below to understand the validated LLMs for each rag agent strategy.
 3. Plan and execute: `plan_execute` strategy. This type of agent first makes a step-by-step plan given a user request, and then execute the plan sequentially (or in parallel, to be implemented in future). If the execution results can solve the problem, then the agent will output an answer; otherwise, it will replan and execute again.
-4. SQL agent: use `sql_agent_llama` or `sql_agent` strategy. This agent is specifically designed and optimized for answering questions aabout data in SQL databases. For more technical details read descriptions [here](src/strategy/sqlagent/README.md).
 
 **Note**:
 
-1. Due to the limitations in support for tool calling by TGI and vllm, we have developed subcategories of agent strategies (`rag_agent_llama`, `react_llama` and `sql_agent_llama`) specifically designed for open-source LLMs served with TGI and vllm.
+1. Due to the limitations in support for tool calling by TGI and vllm, we have developed subcategories of agent strategies (`rag_agent_llama` and `react_llama`) specifically designed for open-source LLMs served with TGI and vllm.
 2. For advanced developers who want to implement their own agent strategies, please refer to [Section 5](#5-customize-agent-strategy) below.
 
 ### 1.2 LLM engine
@@ -26,16 +25,14 @@ Agents use LLM for reasoning and planning. We support 3 options of LLM engine:
 2. Open-source LLMs served with vllm. Follow the instructions in [Section 2.2.2](#222-start-agent-microservices-with-vllm).
 3. OpenAI LLMs via API calls. To use OpenAI llms, specify `llm_engine=openai` and `export OPENAI_API_KEY=<your-openai-key>`
 
-| Agent type       | `strategy` arg    | Validated LLMs (serving SW)                                                                                                                                                                                    | Notes                                                                                                                                                                                                                                                                                                                                           |
-| ---------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| ReAct            | `react_langchain` | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (tgi-gaudi)                                                                                                                  | Only allows tools with one input variable                                                                                                                                                                                                                                                                                                       |
-| ReAct            | `react_langgraph` | GPT-4o-mini, [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) (vllm-gaudi),                                                                                               | if using vllm, need to specify `--enable-auto-tool-choice --tool-call-parser ${model_parser}`, refer to vllm docs for more info                                                                                                                                                                                                                 |
-| ReAct            | `react_llama`     | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (tgi-gaudi) (vllm-gaudi)                                                                                                     | Recommended for open-source LLMs                                                                                                                                                                                                                                                                                                                |
-| RAG agent        | `rag_agent`       | GPT-4o-mini                                                                                                                                                                                                    |                                                                                                                                                                                                                                                                                                                                                 |
-| RAG agent        | `rag_agent_llama` | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (tgi-gaudi) (vllm-gaudi)                                                                                                     | Recommended for open-source LLMs, only allows 1 tool with input variable to be "query"                                                                                                                                                                                                                                                          |
-| Plan and execute | `plan_execute`    | GPT-4o-mini, [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) (vllm-gaudi), [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) (vllm-gaudi) | Currently, due to some issues with guaided decoding of vllm-gaudi, this strategy does not work properly with vllm-gaudi. We are actively debugging. Stay tuned. In the meanwhile, you can use OpenAI's models with this strategy.                                                                                                               |
-| SQL agent        | `sql_agent_llama` | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (vllm-gaudi)                                                                                                                 | database query tool is natively integrated using Langchain's [QuerySQLDataBaseTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.sql_database.tool.QuerySQLDataBaseTool.html#langchain_community.tools.sql_database.tool.QuerySQLDataBaseTool). User can also register their own tools with this agent. |
-| SQL agent        | `sql_agent`       | GPT-4o-mini                                                                                                                                                                                                    | database query tool is natively integrated using Langchain's [QuerySQLDataBaseTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.sql_database.tool.QuerySQLDataBaseTool.html#langchain_community.tools.sql_database.tool.QuerySQLDataBaseTool). User can also register their own tools with this agent. |
+| Agent type       | `strategy` arg    | Validated LLMs (serving SW)                                                                                                                                                                                    | Notes                                                                                                                           |
+| ---------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
+| ReAct            | `react_langchain` | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (tgi-gaudi)                                                                                                                  | Only allows tools with one input variable                                                                                       |
+| ReAct            | `react_langgraph` | GPT-4o-mini, [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) (vllm-gaudi),                                                                                               | if using vllm, need to specify `--enable-auto-tool-choice --tool-call-parser ${model_parser}`, refer to vllm docs for more info |
+| ReAct            | `react_llama`     | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (tgi-gaudi)                                                                                                                  | Recommended for open-source LLMs                                                                                                |
+| RAG agent        | `rag_agent`       | GPT-4o-mini                                                                                                                                                                                                    |                                                                                                                                 |
+| RAG agent        | `rag_agent_llama` | [llama3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) (tgi-gaudi)                                                                                                                  | Recommended for open-source LLMs, only allows 1 tool with input variable to be "query"                                          |
+| Plan and execute | `plan_execute`    | GPT-4o-mini, [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) (vllm-gaudi), [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) (vllm-gaudi) |                                                                                                                                 |
 
 ### 1.3 Tools
 
@@ -123,12 +120,12 @@ Once microservice starts, user can use below script to invoke.
 
 ```bash
 curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
-     "query": "What is OPEA project?"
+     "query": "What is the weather today in Austin?"
     }'
 
 # expected output
 
-data: 'The OPEA project is .....</s>' # just showing partial example here.
+data: 'The temperature in Austin today is 78°F.</s>'
 
 data: [DONE]
 
@@ -213,4 +210,4 @@ data: [DONE]
 ## 5. Customize agent strategy
 
 For advanced developers who want to implement their own agent strategies, you can add a separate folder in `src\strategy`, implement your agent by inherit the `BaseAgent` class, and add your strategy into the `src\agent.py`. The architecture of this agent microservice is shown in the diagram below as a reference.
-![Architecture Overview](assets/agent_arch.jpg)
+![Architecture Overview](agent_arch.jpg)
diff --git a/comps/agent/langchain/assets/agent_arch.jpg → comps/agent/langchain/agent_arch.jpg b/comps/agent/langchain/assets/agent_arch.jpg → comps/agent/langchain/agent_arch.jpg
diff --git a/comps/agent/langchain/assets/sql_agent.png b/comps/agent/langchain/assets/sql_agent.png
diff --git a/comps/agent/langchain/assets/sql_agent_llama.png b/comps/agent/langchain/assets/sql_agent_llama.png
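
Reviewer note: the `curl` example in the agent README hunk posts a JSON body with a `query` field to port 9090 and shows `data:` lines ending with `data: [DONE]`. A minimal Python sketch of the same call using only the standard library follows. The function names are hypothetical, and the parsing assumes the response keeps the line format shown in the README's expected output; it does not strip the quotes or the `</s>` marker.

```python
import json
import urllib.request
from typing import Iterable, Iterator, List


def build_request(ip_address: str, query: str, port: int = 9090) -> urllib.request.Request:
    """Build the POST request that mirrors the README's curl command."""
    url = f"http://{ip_address}:{port}/v1/chat/completions"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_data_chunks(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each `data:` line, stopping at `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield payload


def ask_agent(ip_address: str, query: str) -> List[str]:
    """Send the query and collect all `data:` payloads (requires a running service)."""
    with urllib.request.urlopen(build_request(ip_address, query)) as resp:
        decoded = (chunk.decode("utf-8") for chunk in resp)
        return list(iter_data_chunks(decoded))
```

With the agent microservice running, `ask_agent(ip_address, "What is the weather today in Austin?")` would return the payload strings in arrival order.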