diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
old mode 100644
new mode 100755
index 5853274a1..3a6070efd
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -3,10 +3,10 @@
/ChatQnA/ liang1.lv@intel.com
/CodeGen/ liang1.lv@intel.com
/CodeTrans/ sihan.chen@intel.com
-/DocSum/ sihan.chen@intel.com
+/DocSum/ letong.han@intel.com
/DocIndexRetriever/ xuhui.ren@intel.com chendi.xue@intel.com
/FaqGen/ xinyao.wang@intel.com
-/SearchQnA/ letong.han@intel.com
+/SearchQnA/ sihan.chen@intel.com
/Translation/ liang1.lv@intel.com
/VisualQnA/ liang1.lv@intel.com
/ProductivitySuite/ hoong.tee.yeoh@intel.com
diff --git a/.github/workflows/pr-path-detection.yml b/.github/workflows/pr-path-detection.yml
index 6ad53c0aa..e45aca0df 100644
--- a/.github/workflows/pr-path-detection.yml
+++ b/.github/workflows/pr-path-detection.yml
@@ -94,14 +94,23 @@ jobs:
run: |
cd ${{github.workspace}}
fail="FALSE"
- link_head="https://github.com/opea-project/GenAIExamples/blob/main/"
- png_lines=$(grep -Eo '\]\([^)]+\)' -r -I .|grep -Ev 'http')
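+        # Build the link prefix for the PR head branch (handles forks) so files added in this PR can also be validated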
+ repo_name=${{ github.event.pull_request.head.repo.full_name }}
+ if [ "$(echo "$repo_name"|cut -d'/' -f1)" != "opea-project" ]; then
+          owner=$(echo "$repo_name" |cut -d'/' -f1)
+ branch="https://github.com/$owner/GenAIExamples/tree/${{ github.event.pull_request.head.ref }}"
+ else
+ branch="https://github.com/opea-project/GenAIExamples/blob/${{ github.event.pull_request.head.ref }}"
+ fi
+ link_head="https://github.com/opea-project/GenAIExamples/blob/main"
+ png_lines=$(grep -Eo '\]\([^)]+\)' --include='*.md' -r .|grep -Ev 'http')
if [ -n "$png_lines" ]; then
for png_line in $png_lines; do
refer_path=$(echo "$png_line"|cut -d':' -f1 | cut -d'/' -f2-)
png_path=$(echo "$png_line"|cut -d '(' -f2 | cut -d ')' -f1)
if [[ "${png_path:0:1}" == "/" ]]; then
check_path=${{github.workspace}}$png_path
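+              # Anchor-only links ("#section") are checked against the file that contains them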
+ elif [[ "${png_path:0:1}" == "#" ]]; then
+ check_path=${{github.workspace}}/$refer_path$png_path
else
check_path=${{github.workspace}}/$(dirname "$refer_path")/$png_path
fi
@@ -110,7 +119,7 @@ jobs:
echo "Path $png_path in file ${{github.workspace}}/$refer_path does not exist"
fail="TRUE"
else
- url=$link_head$(echo "$real_path" | sed 's|.*/GenAIExamples/||')
+ url=$link_head$(echo "$real_path" | sed 's|.*/GenAIExamples||')
response=$(curl -I -L -s -o /dev/null -w "%{http_code}" "$url")
if [ "$response" -ne 200 ]; then
echo "**********Validation failed, try again**********"
@@ -118,8 +127,21 @@ jobs:
if [ "$response_retry" -eq 200 ]; then
echo "*****Retry successfully*****"
else
- echo "Invalid link from $check_path: $url"
- fail="TRUE"
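+                  # The link may target files added in this PR; re-check against the PR head branch before failing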
+ echo "Retry failed. Check branch ${{ github.event.pull_request.head.ref }}"
+ url_dev=$branch$(echo "$real_path" | sed 's|.*/GenAIExamples||')
+ response=$(curl -I -L -s -o /dev/null -w "%{http_code}" "$url_dev")
+ if [ "$response" -ne 200 ]; then
+ echo "**********Validation failed, try again**********"
+ response_retry=$(curl -s -o /dev/null -w "%{http_code}" "$url_dev")
+ if [ "$response_retry" -eq 200 ]; then
+ echo "*****Retry successfully*****"
+ else
+ echo "Invalid link from $real_path: $url_dev"
+ fail="TRUE"
+ fi
+ else
+ echo "Check branch ${{ github.event.pull_request.head.ref }} successfully."
+ fi
fi
fi
fi
diff --git a/ChatQnA/README.md b/ChatQnA/README.md
index 0efd64725..fa7156ad0 100644
--- a/ChatQnA/README.md
+++ b/ChatQnA/README.md
@@ -4,8 +4,88 @@ Chatbots are the most widely adopted use case for leveraging the powerful chat a
RAG bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that responses generated remain factual and current. The core of this architecture are vector databases, which are instrumental in enabling efficient and semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
-ChatQnA architecture shows below:
+## Deploy ChatQnA Service
+
+The ChatQnA service can be effortlessly deployed on Intel Gaudi2, Intel Xeon Scalable Processors, and NVIDIA GPUs.
+
+Two types of ChatQnA pipelines are supported now: `ChatQnA with/without Rerank`. The `ChatQnA without Rerank` pipeline (including Embedding, Retrieval, and LLM) is offered for Xeon customers who cannot run the rerank service on HPU yet still require high performance and accuracy.
+
+Quick Start Deployment Steps:
+
+1. Set up the environment variables.
+2. Run Docker Compose.
+3. Consume the ChatQnA Service.
+
+### Quick Start: 1.Setup Environment Variable
+
+To set up environment variables for deploying ChatQnA services, follow these steps:
+
+1. Set the required environment variables:
+
+ ```bash
+ # Example: host_ip="192.168.1.1"
+ export host_ip="External_Public_IP"
+ # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+ export no_proxy="Your_No_Proxy"
+ export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+ ```
+
+2. If you are in a proxy environment, also set the proxy-related environment variables:
+
+ ```bash
+ export http_proxy="Your_HTTP_Proxy"
+ export https_proxy="Your_HTTPs_Proxy"
+ ```
+
+3. Set up other environment variables:
+
+   > Notice that you should choose only **one** of the commands below to set up the environment variables, according to your hardware. Otherwise, the port numbers may be set incorrectly.
+
+ ```bash
+ # on Gaudi
+ source ./docker_compose/intel/hpu/gaudi/set_env.sh
+ # on Xeon
+ source ./docker_compose/intel/cpu/xeon/set_env.sh
+ # on Nvidia GPU
+ source ./docker_compose/nvidia/gpu/set_env.sh
+ ```
+
+### Quick Start: 2.Run Docker Compose
+
+Select the `compose.yaml` file that matches your hardware.
+CPU example:
+
+```bash
+cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
+# cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/
+# cd GenAIExamples/ChatQnA/docker_compose/nvidia/gpu/
+docker compose up -d
+```
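+
+To deploy the `ChatQnA without Rerank` pipeline instead, pass the dedicated compose file from the same folder (a sketch; the exact file name for your platform is listed in the corresponding `docker_compose` folder):
+
+```bash
+# Assumed file name of the no-rerank variant; check the platform folder for the exact name
+docker compose -f compose_without_rerank.yaml up -d
+```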
+
+It will automatically pull the following Docker images from Docker Hub:
+
+```bash
+docker pull opea/chatqna:latest
+docker pull opea/chatqna-ui:latest
+```
+
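+Once the containers are up, a quick sanity check before sending requests is (a minimal sketch using standard Docker Compose commands):
+
+```bash
+# List the ChatQnA containers and their current status
+docker compose ps
+# Follow the service logs while the models are downloaded on first startup
+docker compose logs -f
+```
+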
+If you want to build the Docker images yourself, refer to the build-from-source [Guide](docker_compose/intel/cpu/xeon/README.md).
+
+> Note: The optional Docker image **opea/chatqna-without-rerank:latest** has not been published yet; users need to build this Docker image from source.
+
+### Quick Start: 3.Consume the ChatQnA Service
+
+```bash
+curl http://${host_ip}:8888/v1/chatqna \
+ -H "Content-Type: application/json" \
+ -d '{
+ "messages": "What is the revenue of Nike in 2023?"
+ }'
+```
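+
+How useful the answer is depends on what has been ingested into the vector database. For reference, a document can be uploaded through the dataprep microservice before querying (a sketch based on the dataprep endpoint listed in the table below; `nke-10k-2023.pdf` is a placeholder file name, and the exact form fields are described in the platform READMEs):
+
+```bash
+curl -X POST "http://${host_ip}:6007/v1/dataprep" \
+  -H "Content-Type: multipart/form-data" \
+  -F "files=@./nke-10k-2023.pdf"
+```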
+
+## Architecture and Deploy Details
+
+The ChatQnA architecture is shown below:
![architecture](./assets/img/chatqna_architecture.png)
The ChatQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.
@@ -79,59 +159,22 @@ flowchart LR
direction TB
%% Vector DB interaction
- R_RET <-.->VDB
- DP <-.->VDB
-
-
-
+ R_RET <-.->|d|VDB
+ DP <-.->|d|VDB
```
This ChatQnA use case performs RAG using LangChain, Redis VectorDB and Text Generation Inference on [Intel Gaudi2](https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi-overview.html) or [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon.html).
In the below, we provide a table that describes for each microservice component in the ChatQnA architecture, the default configuration of the open source project, hardware, port, and endpoint.
-
-Gaudi default compose.yaml
-
-| MicroService | Open Source Project | HW | Port | Endpoint |
+Gaudi default compose.yaml
+
+| MicroService | Open Source Project | HW | Port | Endpoint |
| ------------ | ------------------- | ----- | ---- | -------------------- |
-| Embedding | Langchain | Xeon | 6000 | /v1/embaddings |
-| Retriever | Langchain, Redis | Xeon | 7000 | /v1/retrieval |
-| Reranking | Langchain, TEI | Gaudi | 8000 | /v1/reranking |
-| LLM | Langchain, TGI | Gaudi | 9000 | /v1/chat/completions |
-| Dataprep | Redis, Langchain | Xeon | 6007 | /v1/dataprep |
-
-
-
-## Deploy ChatQnA Service
-
-The ChatQnA service can be effortlessly deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.
-
-Two types of ChatQnA pipeline are supported now: `ChatQnA with/without Rerank`. And the `ChatQnA without Rerank` pipeline (including Embedding, Retrieval, and LLM) is offered for Xeon customers who can not run rerank service on HPU yet require high performance and accuracy.
-
-### Prepare Docker Image
-
-Currently we support two ways of deploying ChatQnA services with docker compose:
-
-1. Using the docker image on `docker hub`:
-
- ```bash
- docker pull opea/chatqna:latest
- ```
-
- Two type of UI are supported now, choose one you like and pull the referred docker image.
-
- If you choose conversational UI, follow the [instruction](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/docker_compose/intel/hpu/gaudi#-launch-the-conversational-ui-optional) and modify the [compose.yaml](./docker_compose/intel/cpu/xeon/compose.yaml).
-
- ```bash
- docker pull opea/chatqna-ui:latest
- # or
- docker pull opea/chatqna-conversation-ui:latest
- ```
-
-2. Using the docker images `built from source`: [Guide](docker_compose/intel/cpu/xeon/README.md)
-
- > Note: The **opea/chatqna-without-rerank:latest** docker image has not been published yet, users need to build this docker image from source.
+| Embedding    | Langchain           | Xeon  | 6000 | /v1/embeddings       |
+| Retriever | Langchain, Redis | Xeon | 7000 | /v1/retrieval |
+| Reranking | Langchain, TEI | Gaudi | 8000 | /v1/reranking |
+| LLM | Langchain, TGI | Gaudi | 9000 | /v1/chat/completions |
+| Dataprep | Redis, Langchain | Xeon | 6007 | /v1/dataprep |
### Required Models
@@ -147,40 +190,6 @@ Change the `xxx_MODEL_ID` in `docker_compose/xxx/set_env.sh` for your needs.
For customers with proxy issues, the models from [ModelScope](https://www.modelscope.cn/models) are also supported in ChatQnA. Refer to [this readme](docker_compose/intel/cpu/xeon/README.md) for details.
-### Setup Environment Variable
-
-To set up environment variables for deploying ChatQnA services, follow these steps:
-
-1. Set the required environment variables:
-
- ```bash
- # Example: host_ip="192.168.1.1"
- export host_ip="External_Public_IP"
- # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
- export no_proxy="Your_No_Proxy"
- export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
- ```
-
-2. If you are in a proxy environment, also set the proxy-related environment variables:
-
- ```bash
- export http_proxy="Your_HTTP_Proxy"
- export https_proxy="Your_HTTPs_Proxy"
- ```
-
-3. Set up other environment variables:
-
- > Notice that you can only choose **one** command below to set up envs according to your hardware. Other that the port numbers may be set incorrectly.
-
- ```bash
- # on Gaudi
- source ./docker_compose/intel/hpu/gaudi/set_env.sh
- # on Xeon
- source ./docker_compose/intel/cpu/xeon/set_env.sh
- # on Nvidia GPU
- source ./docker_compose/nvidia/gpu/set_env.sh
- ```
-
### Deploy ChatQnA on Gaudi
Find the corresponding [compose.yaml](./docker_compose/intel/hpu/gaudi/compose.yaml).
diff --git a/ChatQnA/docker_compose/intel/cpu/xeon/README.md b/ChatQnA/docker_compose/intel/cpu/xeon/README.md
index e317d55c6..5eca0d284 100644
--- a/ChatQnA/docker_compose/intel/cpu/xeon/README.md
+++ b/ChatQnA/docker_compose/intel/cpu/xeon/README.md
@@ -2,6 +2,65 @@
This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `llm`. We will publish the Docker images to Docker Hub soon, it will simplify the deployment process for this service.
+Quick Start:
+
+1. Set up the environment variables.
+2. Run Docker Compose.
+3. Consume the ChatQnA Service.
+
+## Quick Start: 1.Setup Environment Variable
+
+To set up environment variables for deploying ChatQnA services, follow these steps:
+
+1. Set the required environment variables:
+
+ ```bash
+ # Example: host_ip="192.168.1.1"
+ export host_ip="External_Public_IP"
+ # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+ export no_proxy="Your_No_Proxy"
+ export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+ ```
+
+2. If you are in a proxy environment, also set the proxy-related environment variables:
+
+ ```bash
+ export http_proxy="Your_HTTP_Proxy"
+ export https_proxy="Your_HTTPs_Proxy"
+ ```
+
+3. Set up other environment variables:
+ ```bash
+ source ./set_env.sh
+ ```
+
+## Quick Start: 2.Run Docker Compose
+
+```bash
+docker compose up -d
+```
+
+It will automatically pull the following Docker images from Docker Hub:
+
+```bash
+docker pull opea/chatqna:latest
+docker pull opea/chatqna-ui:latest
+```
+
+If you want to build the Docker images yourself, refer to the 'Build Docker Images' section below.
+
+> Note: The optional Docker image **opea/chatqna-without-rerank:latest** has not been published yet; users need to build this Docker image from source.
+
+## Quick Start: 3.Consume the ChatQnA Service
+
+```bash
+curl http://${host_ip}:8888/v1/chatqna \
+ -H "Content-Type: application/json" \
+ -d '{
+ "messages": "What is the revenue of Nike in 2023?"
+ }'
+```
+
## 🚀 Apply Xeon Server on AWS
To apply a Xeon server on AWS, start by creating an AWS account if you don't have one already. Then, head to the [EC2 Console](https://console.aws.amazon.com/ec2/v2/home) to begin the process. Within the EC2 service, select the Amazon EC2 M7i or M7i-flex instance type to leverage 4th Generation Intel Xeon Scalable processors that are optimized for demanding workloads.
diff --git a/ChatQnA/docker_compose/intel/hpu/gaudi/README.md b/ChatQnA/docker_compose/intel/hpu/gaudi/README.md
index 65c6115bb..ec8e3ad09 100644
--- a/ChatQnA/docker_compose/intel/hpu/gaudi/README.md
+++ b/ChatQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -2,6 +2,66 @@
This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Gaudi server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as embedding, retriever, rerank, and llm. We will publish the Docker images to Docker Hub, it will simplify the deployment process for this service.
+Quick Start:
+
+1. Set up the environment variables.
+2. Run Docker Compose.
+3. Consume the ChatQnA Service.
+
+## Quick Start: 1.Setup Environment Variable
+
+To set up environment variables for deploying ChatQnA services, follow these steps:
+
+1. Set the required environment variables:
+
+ ```bash
+ # Example: host_ip="192.168.1.1"
+ export host_ip="External_Public_IP"
+ # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+ export no_proxy="Your_No_Proxy"
+ export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+ ```
+
+2. If you are in a proxy environment, also set the proxy-related environment variables:
+
+ ```bash
+ export http_proxy="Your_HTTP_Proxy"
+ export https_proxy="Your_HTTPs_Proxy"
+ ```
+
+3. Set up other environment variables:
+
+ ```bash
+ source ./set_env.sh
+ ```
+
+## Quick Start: 2.Run Docker Compose
+
+```bash
+docker compose up -d
+```
+
+It will automatically pull the following Docker images from Docker Hub:
+
+```bash
+docker pull opea/chatqna:latest
+docker pull opea/chatqna-ui:latest
+```
+
+If you want to build the Docker images yourself, refer to the 'Build Docker Images' section below.
+
+> Note: The optional Docker image **opea/chatqna-without-rerank:latest** has not been published yet; users need to build this Docker image from source.
+
+## Quick Start: 3.Consume the ChatQnA Service
+
+```bash
+curl http://${host_ip}:8888/v1/chatqna \
+ -H "Content-Type: application/json" \
+ -d '{
+ "messages": "What is the revenue of Nike in 2023?"
+ }'
+```
+
## 🚀 Build Docker Images
First of all, you need to build Docker Images locally. This step can be ignored after the Docker images published to Docker hub.
diff --git a/ChatQnA/docker_compose/nvidia/gpu/README.md b/ChatQnA/docker_compose/nvidia/gpu/README.md
index cd064795f..7e3966a7f 100644
--- a/ChatQnA/docker_compose/nvidia/gpu/README.md
+++ b/ChatQnA/docker_compose/nvidia/gpu/README.md
@@ -2,6 +2,66 @@
This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on NVIDIA GPU platform. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as embedding, retriever, rerank, and llm. We will publish the Docker images to Docker Hub, it will simplify the deployment process for this service.
+Quick Start Deployment Steps:
+
+1. Set up the environment variables.
+2. Run Docker Compose.
+3. Consume the ChatQnA Service.
+
+## Quick Start: 1.Setup Environment Variable
+
+To set up environment variables for deploying ChatQnA services, follow these steps:
+
+1. Set the required environment variables:
+
+ ```bash
+ # Example: host_ip="192.168.1.1"
+ export host_ip="External_Public_IP"
+ # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+ export no_proxy="Your_No_Proxy"
+ export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+ ```
+
+2. If you are in a proxy environment, also set the proxy-related environment variables:
+
+ ```bash
+ export http_proxy="Your_HTTP_Proxy"
+ export https_proxy="Your_HTTPs_Proxy"
+ ```
+
+3. Set up other environment variables:
+
+ ```bash
+ source ./set_env.sh
+ ```
+
+## Quick Start: 2.Run Docker Compose
+
+```bash
+docker compose up -d
+```
+
+It will automatically pull the following Docker images from Docker Hub:
+
+```bash
+docker pull opea/chatqna:latest
+docker pull opea/chatqna-ui:latest
+```
+
+If you want to build the Docker images yourself, refer to the 'Build Docker Images' section below.
+
+> Note: The optional Docker image **opea/chatqna-without-rerank:latest** has not been published yet; users need to build this Docker image from source.
+
+## Quick Start: 3.Consume the ChatQnA Service
+
+```bash
+curl http://${host_ip}:8888/v1/chatqna \
+ -H "Content-Type: application/json" \
+ -d '{
+ "messages": "What is the revenue of Nike in 2023?"
+ }'
+```
+
## 🚀 Build Docker Images
First of all, you need to build Docker Images locally. This step can be ignored after the Docker images published to Docker hub.
diff --git a/GenAIExamples b/GenAIExamples
new file mode 160000
index 000000000..de397d104
--- /dev/null
+++ b/GenAIExamples
@@ -0,0 +1 @@
+Subproject commit de397d104153c2f5538a7d2eefa32b43b2918e43
diff --git a/MultimodalQnA/README.md b/MultimodalQnA/README.md
index fe6d1fd9a..022455042 100644
--- a/MultimodalQnA/README.md
+++ b/MultimodalQnA/README.md
@@ -91,7 +91,9 @@ flowchart LR
```
-This MultimodalQnA use case performs Multimodal-RAG using LangChain, Redis VectorDB and Text Generation Inference on Intel Gaudi2 or Intel Xeon Scalable Processors. The Intel Gaudi2 accelerator supports both training and inference for deep learning models in particular for LLMs. Visit [Habana AI products](https://habana.ai/products) for more details.
+This MultimodalQnA use case performs Multimodal-RAG using LangChain, Redis VectorDB, and Text Generation Inference on [Intel Gaudi2](https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi-overview.html) and [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon.html). We invite contributions from other hardware vendors to expand the example.
+
+The Intel Gaudi2 accelerator supports both training and inference for deep learning models in particular for LLMs. Visit [Habana AI products](https://habana.ai/products) for more details.
In the below, we provide a table that describes for each microservice component in the MultimodalQnA architecture, the default configuration of the open source project, hardware, port, and endpoint.
diff --git a/README.md b/README.md
index 5a168648b..40aa39e1a 100644
--- a/README.md
+++ b/README.md
@@ -45,7 +45,7 @@ Deployment are based on released docker images by default, check [docker image l
| DocSum | [Xeon Instructions](DocSum/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](DocSum/docker_compose/intel/hpu/gaudi/README.md) | [DocSum with Manifests](DocSum/kubernetes/intel/README.md) | [DocSum with Helm Charts](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts/docsum/README.md) | [DocSum with GMC](DocSum/kubernetes/intel/README_gmc.md) |
| SearchQnA | [Xeon Instructions](SearchQnA/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](SearchQnA/docker_compose/intel/hpu/gaudi/README.md) | Not Supported | Not Supported | [SearchQnA with GMC](SearchQnA/kubernetes/intel/README_gmc.md) |
| FaqGen | [Xeon Instructions](FaqGen/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](FaqGen/docker_compose/intel/hpu/gaudi/README.md) | [FaqGen with Manifests](FaqGen/kubernetes/intel/README.md) | Not Supported | [FaqGen with GMC](FaqGen/kubernetes/intel/README_gmc.md) |
-| Translation | [Xeon Instructions](Translation/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](Translation/docker_compose/intel/hpu/gaudi/README.md) | Not Supported | Not Supported | [Translation with GMC](Translation/kubernetes/intel/README_gmc.md) |
+| Translation | [Xeon Instructions](Translation/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](Translation/docker_compose/intel/hpu/gaudi/README.md) | [Translation with Manifests](Translation/kubernetes/intel/README.md) | Not Supported | [Translation with GMC](Translation/kubernetes/intel/README_gmc.md) |
| AudioQnA | [Xeon Instructions](AudioQnA/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](AudioQnA/docker_compose/intel/hpu/gaudi/README.md) | [AudioQnA with Manifests](AudioQnA/kubernetes/intel/README.md) | Not Supported | [AudioQnA with GMC](AudioQnA/kubernetes/intel/README_gmc.md) |
| VisualQnA | [Xeon Instructions](VisualQnA/docker_compose/intel/cpu/xeon/README.md) | [Gaudi Instructions](VisualQnA/docker_compose/intel/hpu/gaudi/README.md) | [VisualQnA with Manifests](VisualQnA/kubernetes/intel/README.md) | Not Supported | [VisualQnA with GMC](VisualQnA/kubernetes/intel/README_gmc.md) |
| ProductivitySuite | [Xeon Instructions](ProductivitySuite/docker_compose/intel/cpu/xeon/README.md) | Not Supported | [ProductivitySuite with Manifests](ProductivitySuite/kubernetes/intel/README.md) | Not Supported | Not Supported |
@@ -54,9 +54,18 @@ Deployment are based on released docker images by default, check [docker image l
Check [here](./supported_examples.md) for detailed information of supported examples, models, hardwares, etc.
+## Contributing to OPEA
+
+Welcome to the OPEA open-source community! We are thrilled to have you here and excited about the potential contributions you can bring to the OPEA platform. Whether you are fixing bugs, adding new GenAI components, improving documentation, or sharing your unique use cases, your contributions are invaluable.
+
+Together, we can make OPEA the go-to platform for enterprise AI solutions. Let's work together to push the boundaries of what's possible and create a future where AI is accessible, efficient, and impactful for everyone.
+
+Please check the [Contributing guidelines](https://github.com/opea-project/docs/tree/main/community/CONTRIBUTING.md) for a detailed guide on how to contribute a GenAI component and all the ways you can contribute!
+
+Thank you for being a part of this journey. We can't wait to see what we can achieve together!
+
## Additional Content
- [Code of Conduct](https://github.com/opea-project/docs/tree/main/community/CODE_OF_CONDUCT.md)
-- [Contribution](https://github.com/opea-project/docs/tree/main/community/CONTRIBUTING.md)
- [Security Policy](https://github.com/opea-project/docs/tree/main/community/SECURITY.md)
- [Legal Information](/LEGAL_INFORMATION.md)
diff --git a/Translation/README.md b/Translation/README.md
index 37bfdd902..2df513baa 100644
--- a/Translation/README.md
+++ b/Translation/README.md
@@ -6,7 +6,7 @@ Translation architecture shows below:
![architecture](./assets/img/translation_architecture.png)
-This Translation use case performs Language Translation Inference on Intel Gaudi2 or Intel Xeon Scalable Processors. The Intel Gaudi2 accelerator supports both training and inference for deep learning models in particular for LLMs. Visit [Habana AI products](https://habana.ai/products/) for more details.
+This Translation use case performs Language Translation Inference across multiple platforms. Currently, we provide examples for [Intel Gaudi2](https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi-overview.html) and [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon.html), and we invite contributions from other hardware vendors to expand the OPEA ecosystem.
## Deploy Translation Service
diff --git a/Translation/docker_compose/intel/cpu/xeon/README.md b/Translation/docker_compose/intel/cpu/xeon/README.md
index 31e6e9654..306f8e35d 100644
--- a/Translation/docker_compose/intel/cpu/xeon/README.md
+++ b/Translation/docker_compose/intel/cpu/xeon/README.md
@@ -41,30 +41,59 @@ cd GenAIExamples/Translation/ui
docker build -t opea/translation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f docker/Dockerfile .
```
+### 4. Build Nginx Docker Image
+
+```bash
+cd GenAIComps
+docker build -t opea/translation-nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/nginx/Dockerfile .
+```
+
Then run the command `docker images`, you will have the following Docker Images:
1. `opea/llm-tgi:latest`
2. `opea/translation:latest`
3. `opea/translation-ui:latest`
+4. `opea/translation-nginx:latest`
## 🚀 Start Microservices
+### Required Models
+
+By default, the following LLM model is used:
+
+| Service | Model |
+| ------- | ----------------- |
+| LLM | haoranxu/ALMA-13B |
+
+Change the `LLM_MODEL_ID` in `set_env.sh` for your needs.
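+
+For example, to try the smaller `haoranxu/ALMA-7B` checkpoint (an illustrative substitution; any TGI-served translation model can be used), either edit `set_env.sh` or override the variable after sourcing it:
+
+```bash
+# Run after `source set_env.sh`, which otherwise resets LLM_MODEL_ID to the default
+export LLM_MODEL_ID="haoranxu/ALMA-7B"
+```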
+
### Setup Environment Variables
-Since the `compose.yaml` will consume some environment variables, you need to set up them in advance as below.
+1. Set the required environment variables:
-```bash
-export http_proxy=${your_http_proxy}
-export https_proxy=${your_http_proxy}
-export LLM_MODEL_ID="haoranxu/ALMA-13B"
-export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
-export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
-export MEGA_SERVICE_HOST_IP=${host_ip}
-export LLM_SERVICE_HOST_IP=${host_ip}
-export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
-```
+ ```bash
+ # Example: host_ip="192.168.1.1"
+ export host_ip="External_Public_IP"
+ # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+ export no_proxy="Your_No_Proxy"
+ export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+ # Example: NGINX_PORT=80
+ export NGINX_PORT=${your_nginx_port}
+ ```
+
+2. If you are in a proxy environment, also set the proxy-related environment variables:
+
+ ```bash
+ export http_proxy="Your_HTTP_Proxy"
+ export https_proxy="Your_HTTPs_Proxy"
+ ```
+
+3. Set up other environment variables:
-Note: Please replace with `host_ip` with you external IP address, do not use localhost.
+ ```bash
+ cd ../../../
+ source set_env.sh
+ ```
### Start Microservice Docker Containers
@@ -99,6 +128,14 @@ docker compose up -d
"language_from": "Chinese","language_to": "English","source_language": "ζη±ζΊε¨ηΏ»θ―γ"}'
```
+4. Nginx Service
+
+ ```bash
+ curl http://${host_ip}:${NGINX_PORT}/v1/translation \
+ -H "Content-Type: application/json" \
+     -d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
+ ```
+
Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service.
## 🚀 Launch the UI
diff --git a/Translation/docker_compose/intel/cpu/xeon/compose.yaml b/Translation/docker_compose/intel/cpu/xeon/compose.yaml
index 4ba224bf3..e8eafca4f 100644
--- a/Translation/docker_compose/intel/cpu/xeon/compose.yaml
+++ b/Translation/docker_compose/intel/cpu/xeon/compose.yaml
@@ -8,10 +8,12 @@ services:
ports:
- "8008:80"
environment:
+ no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
- TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
- HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+ HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+ HF_HUB_DISABLE_PROGRESS_BARS: 1
+ HF_HUB_ENABLE_HF_TRANSFER: 0
volumes:
- "./data:/data"
shm_size: 1g
@@ -25,10 +27,13 @@ services:
- "9000:9000"
ipc: host
environment:
+ no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+ HF_HUB_DISABLE_PROGRESS_BARS: 1
+ HF_HUB_ENABLE_HF_TRANSFER: 0
restart: unless-stopped
translation-xeon-backend-server:
image: ${REGISTRY:-opea}/translation:${TAG:-latest}
@@ -39,6 +44,7 @@ services:
ports:
- "8888:8888"
environment:
+ - no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- MEGA_SERVICE_HOST_IP=${MEGA_SERVICE_HOST_IP}
@@ -53,11 +59,31 @@ services:
ports:
- "5173:5173"
environment:
+ - no_proxy=${no_proxy}
- https_proxy=${https_proxy}
- http_proxy=${http_proxy}
- BASE_URL=${BACKEND_SERVICE_ENDPOINT}
ipc: host
restart: always
+ translation-xeon-nginx-server:
+ image: ${REGISTRY:-opea}/translation-nginx:${TAG:-latest}
+ container_name: translation-xeon-nginx-server
+ depends_on:
+ - translation-xeon-backend-server
+ - translation-xeon-ui-server
+ ports:
+ - "${NGINX_PORT:-80}:80"
+ environment:
+ - no_proxy=${no_proxy}
+ - https_proxy=${https_proxy}
+ - http_proxy=${http_proxy}
+ - FRONTEND_SERVICE_IP=${FRONTEND_SERVICE_IP}
+ - FRONTEND_SERVICE_PORT=${FRONTEND_SERVICE_PORT}
+ - BACKEND_SERVICE_NAME=${BACKEND_SERVICE_NAME}
+ - BACKEND_SERVICE_IP=${BACKEND_SERVICE_IP}
+ - BACKEND_SERVICE_PORT=${BACKEND_SERVICE_PORT}
+ ipc: host
+ restart: always
networks:
default:
driver: bridge
diff --git a/Translation/docker_compose/intel/hpu/gaudi/README.md b/Translation/docker_compose/intel/hpu/gaudi/README.md
index 1f8f82837..9f234496c 100644
--- a/Translation/docker_compose/intel/hpu/gaudi/README.md
+++ b/Translation/docker_compose/intel/hpu/gaudi/README.md
@@ -29,34 +29,63 @@ docker build -t opea/translation:latest --build-arg https_proxy=$https_proxy --b
Construct the frontend Docker image using the command below:
```bash
-cd GenAIExamples/Translation
+cd GenAIExamples/Translation/ui/
docker build -t opea/translation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```
+### 4. Build Nginx Docker Image
+
+```bash
+cd GenAIComps
+docker build -t opea/translation-nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/nginx/Dockerfile .
+```
+
Then run the command `docker images`, you will have the following four Docker Images:
1. `opea/llm-tgi:latest`
2. `opea/translation:latest`
3. `opea/translation-ui:latest`
+4. `opea/translation-nginx:latest`
## 🚀 Start Microservices
+### Required Models
+
+By default, the following LLM model is used:
+
+| Service | Model |
+| ------- | ----------------- |
+| LLM | haoranxu/ALMA-13B |
+
+Change the `LLM_MODEL_ID` in `set_env.sh` for your needs.
+
### Setup Environment Variables
-Since the `compose.yaml` will consume some environment variables, you need to setup them in advance as below.
+1. Set the required environment variables:
-```bash
-export http_proxy=${your_http_proxy}
-export https_proxy=${your_http_proxy}
-export LLM_MODEL_ID="haoranxu/ALMA-13B"
-export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
-export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
-export MEGA_SERVICE_HOST_IP=${host_ip}
-export LLM_SERVICE_HOST_IP=${host_ip}
-export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
-```
+ ```bash
+ # Example: host_ip="192.168.1.1"
+ export host_ip="External_Public_IP"
+ # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+ export no_proxy="Your_No_Proxy"
+ export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+ # Example: NGINX_PORT=80
+ export NGINX_PORT=${your_nginx_port}
+ ```
+
+2. If you are in a proxy environment, also set the proxy-related environment variables:
+
+ ```bash
+ export http_proxy="Your_HTTP_Proxy"
+ export https_proxy="Your_HTTPs_Proxy"
+ ```
+
+3. Set up other environment variables:
-Note: Please replace with `host_ip` with you external IP address, do not use localhost.
+ ```bash
+ cd ../../../
+ source set_env.sh
+ ```
### Start Microservice Docker Containers
@@ -91,6 +120,14 @@ docker compose up -d
"language_from": "Chinese","language_to": "English","source_language": "ζη±ζΊε¨ηΏ»θ―γ"}'
```
+4. Nginx Service
+
+ ```bash
+ curl http://${host_ip}:${NGINX_PORT}/v1/translation \
+ -H "Content-Type: application/json" \
+     -d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
+ ```
+
Following the validation of all aforementioned microservices, we are now prepared to construct a mega-service.
## 🚀 Launch the UI
diff --git a/Translation/docker_compose/intel/hpu/gaudi/compose.yaml b/Translation/docker_compose/intel/hpu/gaudi/compose.yaml
index 32dbfdc3e..6eefd6492 100644
--- a/Translation/docker_compose/intel/hpu/gaudi/compose.yaml
+++ b/Translation/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -10,7 +10,6 @@ services:
environment:
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
- TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
HF_HUB_DISABLE_PROGRESS_BARS: 1
HF_HUB_ENABLE_HF_TRANSFER: 0
@@ -36,6 +35,8 @@ services:
https_proxy: ${https_proxy}
TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+ HF_HUB_DISABLE_PROGRESS_BARS: 1
+ HF_HUB_ENABLE_HF_TRANSFER: 0
restart: unless-stopped
translation-gaudi-backend-server:
image: ${REGISTRY:-opea}/translation:${TAG:-latest}
@@ -65,6 +66,25 @@ services:
- BASE_URL=${BACKEND_SERVICE_ENDPOINT}
ipc: host
restart: always
+ translation-gaudi-nginx-server:
+ image: ${REGISTRY:-opea}/translation-nginx:${TAG:-latest}
+ container_name: translation-gaudi-nginx-server
+ depends_on:
+ - translation-gaudi-backend-server
+ - translation-gaudi-ui-server
+ ports:
+ - "${NGINX_PORT:-80}:80"
+ environment:
+ - no_proxy=${no_proxy}
+ - https_proxy=${https_proxy}
+ - http_proxy=${http_proxy}
+ - FRONTEND_SERVICE_IP=${FRONTEND_SERVICE_IP}
+ - FRONTEND_SERVICE_PORT=${FRONTEND_SERVICE_PORT}
+ - BACKEND_SERVICE_NAME=${BACKEND_SERVICE_NAME}
+ - BACKEND_SERVICE_IP=${BACKEND_SERVICE_IP}
+ - BACKEND_SERVICE_PORT=${BACKEND_SERVICE_PORT}
+ ipc: host
+ restart: always
networks:
default:
diff --git a/Translation/docker_compose/set_env.sh b/Translation/docker_compose/set_env.sh
new file mode 100644
index 000000000..c82c8d360
--- /dev/null
+++ b/Translation/docker_compose/set_env.sh
@@ -0,0 +1,18 @@
+#!/usr/bin/env bash
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+
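+# NOTE: export host_ip and HUGGINGFACEHUB_API_TOKEN in your shell before sourcing this script (see the README).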
+export LLM_MODEL_ID="haoranxu/ALMA-13B"
+export TGI_LLM_ENDPOINT="http://${host_ip}:8008"
+export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
+export MEGA_SERVICE_HOST_IP=${host_ip}
+export LLM_SERVICE_HOST_IP=${host_ip}
+export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/translation"
+export NGINX_PORT=80
+export FRONTEND_SERVICE_IP=${host_ip}
+export FRONTEND_SERVICE_PORT=5173
+export BACKEND_SERVICE_NAME=translation
+export BACKEND_SERVICE_IP=${host_ip}
+export BACKEND_SERVICE_PORT=8888
diff --git a/Translation/docker_image_build/build.yaml b/Translation/docker_image_build/build.yaml
index b326b125b..a1562060b 100644
--- a/Translation/docker_image_build/build.yaml
+++ b/Translation/docker_image_build/build.yaml
@@ -23,3 +23,9 @@ services:
dockerfile: comps/llms/text-generation/tgi/Dockerfile
extends: translation
image: ${REGISTRY:-opea}/llm-tgi:${TAG:-latest}
+ nginx:
+ build:
+ context: GenAIComps
+ dockerfile: comps/nginx/Dockerfile
+ extends: translation
+ image: ${REGISTRY:-opea}/translation-nginx:${TAG:-latest}
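+    # NOTE: run `docker compose -f build.yaml build nginx` from Translation/docker_image_build
+    # with the GenAIComps repo cloned into that directory (it is the build context above).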
diff --git a/Translation/kubernetes/intel/README.md b/Translation/kubernetes/intel/README.md
new file mode 100644
index 000000000..7ca89d372
--- /dev/null
+++ b/Translation/kubernetes/intel/README.md
@@ -0,0 +1,41 @@
+# Deploy Translation in Kubernetes Cluster
+
+> [!NOTE]
+> The following values must be set before you can deploy:
+> HUGGINGFACEHUB_API_TOKEN
+>
+> You can also customize the "MODEL_ID" if needed.
+>
+> You need to make sure you have created the directory `/mnt/opea-models` to save the cached model on the node where the Translation workload is running. Otherwise, you need to modify the `translation.yaml` file to change the `model-volume` to a directory that exists on the node.
+
+## Deploy On Xeon
+
+```
+cd GenAIExamples/Translation/kubernetes/intel/cpu/xeon/manifests
+export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
+sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" translation.yaml
+kubectl apply -f translation.yaml
+```
+
+## Deploy On Gaudi
+
+```
+cd GenAIExamples/Translation/kubernetes/intel/hpu/gaudi/manifests
+export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
+sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" translation.yaml
+kubectl apply -f translation.yaml
+```
+
+## Verify Services
+
+To verify the installation, run the command `kubectl get pod` to make sure all pods are running.
+
+Then run the command `kubectl port-forward svc/translation 8888:8888` to expose the Translation service for access.
+
+Open another terminal and run the following command to verify that the service is working:
+
+```console
+curl http://localhost:8888/v1/translation \
+ -H 'Content-Type: application/json' \
+  -d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
+```
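+
+The Translation UI is exposed through the `translation-nginx` NodePort service. A quick way to find the assigned node port (a minimal sketch; combine it with any node IP to reach the UI through Nginx):
+
+```bash
+kubectl get svc translation-nginx -o jsonpath='{.spec.ports[0].nodePort}'
+```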
diff --git a/Translation/kubernetes/intel/cpu/xeon/manifest/translation.yaml b/Translation/kubernetes/intel/cpu/xeon/manifest/translation.yaml
new file mode 100644
index 000000000..e30fee338
--- /dev/null
+++ b/Translation/kubernetes/intel/cpu/xeon/manifest/translation.yaml
@@ -0,0 +1,495 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: translation-tgi-config
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "2.1.0"
+data:
+ LLM_MODEL_ID: "haoranxu/ALMA-13B"
+ PORT: "2080"
+ HF_TOKEN: "insert-your-huggingface-token-here"
+ http_proxy: ""
+ https_proxy: ""
+ no_proxy: ""
+ HABANA_LOGS: "/tmp/habana_logs"
+ NUMBA_CACHE_DIR: "/tmp"
+ HF_HOME: "/tmp/.cache/huggingface"
+ CUDA_GRAPHS: "0"
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: translation-llm-uservice-config
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+data:
+ TGI_LLM_ENDPOINT: "http://translation-tgi"
+ HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
+ http_proxy: ""
+ https_proxy: ""
+ no_proxy: ""
+ LOGFLAG: ""
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: translation-ui-config
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+data:
+ BASE_URL: "/v1/translation"
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+data:
+ default.conf: |+
+ # Copyright (C) 2024 Intel Corporation
+ # SPDX-License-Identifier: Apache-2.0
+
+
+ server {
+ listen 80;
+ listen [::]:80;
+
+ location /home {
+ alias /usr/share/nginx/html/index.html;
+ }
+
+ location / {
+ proxy_pass http://translation-ui:5173;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+ proxy_set_header X-Forwarded-Proto $scheme;
+ }
+
+ location /v1/translation {
+ proxy_pass http://translation:8888;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+ proxy_set_header X-Forwarded-Proto $scheme;
+ }
+ }
+
+kind: ConfigMap
+metadata:
+ name: translation-nginx-config
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-ui
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 5173
+ targetPort: ui
+ protocol: TCP
+ name: ui
+ selector:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-llm-uservice
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 9000
+ targetPort: 9000
+ protocol: TCP
+ name: llm-uservice
+ selector:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-tgi
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "2.1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 80
+ targetPort: 2080
+ protocol: TCP
+ name: tgi
+ selector:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-nginx
+spec:
+ ports:
+ - port: 80
+ protocol: TCP
+ targetPort: 80
+ selector:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation-nginx
+ type: NodePort
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 8888
+ targetPort: 8888
+ protocol: TCP
+ name: translation
+ selector:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-ui
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+ spec:
+ securityContext:
+ {}
+ containers:
+ - name: translation-ui
+ envFrom:
+ - configMapRef:
+ name: translation-ui-config
+ securityContext:
+ {}
+ image: "opea/translation-ui:latest"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: ui
+ containerPort: 80
+ protocol: TCP
+ resources:
+ {}
+ volumeMounts:
+ - mountPath: /tmp
+ name: tmp
+ volumes:
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-llm-uservice
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ spec:
+ securityContext:
+ {}
+ containers:
+ - name: translation
+ envFrom:
+ - configMapRef:
+ name: translation-llm-uservice-config
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ readOnlyRootFilesystem: false
+ runAsNonRoot: true
+ runAsUser: 1000
+ seccompProfile:
+ type: RuntimeDefault
+ image: "opea/llm-tgi:latest"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: llm-uservice
+ containerPort: 9000
+ protocol: TCP
+ volumeMounts:
+ - mountPath: /tmp
+ name: tmp
+ livenessProbe:
+ failureThreshold: 24
+ httpGet:
+ path: v1/health_check
+ port: llm-uservice
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ readinessProbe:
+ httpGet:
+ path: v1/health_check
+ port: llm-uservice
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ startupProbe:
+ failureThreshold: 120
+ httpGet:
+ path: v1/health_check
+ port: llm-uservice
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ resources:
+ {}
+ volumes:
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-tgi
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "2.1.0"
+spec:
+  # use explicit replica counts only if HorizontalPodAutoscaler is disabled
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ spec:
+ securityContext:
+ {}
+ containers:
+ - name: tgi
+ envFrom:
+ - configMapRef:
+ name: translation-tgi-config
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ readOnlyRootFilesystem: true
+ runAsNonRoot: true
+ runAsUser: 1000
+ seccompProfile:
+ type: RuntimeDefault
+ image: "ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu"
+ imagePullPolicy: IfNotPresent
+ volumeMounts:
+ - mountPath: /data
+ name: model-volume
+ - mountPath: /tmp
+ name: tmp
+ ports:
+ - name: http
+ containerPort: 2080
+ protocol: TCP
+ livenessProbe:
+ failureThreshold: 24
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ tcpSocket:
+ port: http
+ readinessProbe:
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ tcpSocket:
+ port: http
+ startupProbe:
+ failureThreshold: 120
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ tcpSocket:
+ port: http
+ resources:
+ {}
+ volumes:
+ - name: model-volume
+ emptyDir: {}
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+ app: translation
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation
+ spec:
+ securityContext:
+ null
+ containers:
+ - name: translation
+ env:
+ - name: LLM_SERVICE_HOST_IP
+ value: translation-llm-uservice
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ readOnlyRootFilesystem: true
+ runAsNonRoot: true
+ runAsUser: 1000
+ seccompProfile:
+ type: RuntimeDefault
+ image: "opea/translation:latest"
+ imagePullPolicy: IfNotPresent
+ volumeMounts:
+ - mountPath: /tmp
+ name: tmp
+ ports:
+ - name: translation
+ containerPort: 8888
+ protocol: TCP
+ resources:
+ null
+ volumes:
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-nginx
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+ app: translation-nginx
+spec:
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation-nginx
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation-nginx
+ spec:
+ containers:
+ - image: nginx:1.27.1
+ imagePullPolicy: IfNotPresent
+ name: nginx
+ volumeMounts:
+ - mountPath: /etc/nginx/conf.d
+ name: nginx-config-volume
+ securityContext: {}
+ volumes:
+ - configMap:
+ defaultMode: 420
+ name: translation-nginx-config
+ name: nginx-config-volume
diff --git a/Translation/kubernetes/intel/hpu/gaudi/manifest/translation.yaml b/Translation/kubernetes/intel/hpu/gaudi/manifest/translation.yaml
new file mode 100644
index 000000000..52d6c9b10
--- /dev/null
+++ b/Translation/kubernetes/intel/hpu/gaudi/manifest/translation.yaml
@@ -0,0 +1,497 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: translation-tgi-config
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "2.1.0"
+data:
+ LLM_MODEL_ID: "haoranxu/ALMA-13B"
+ PORT: "2080"
+ HF_TOKEN: "insert-your-huggingface-token-here"
+ http_proxy: ""
+ https_proxy: ""
+ no_proxy: ""
+ HABANA_LOGS: "/tmp/habana_logs"
+ NUMBA_CACHE_DIR: "/tmp"
+ HF_HOME: "/tmp/.cache/huggingface"
+ MAX_INPUT_LENGTH: "1024"
+ MAX_TOTAL_TOKENS: "2048"
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: translation-llm-uservice-config
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+data:
+ TGI_LLM_ENDPOINT: "http://translation-tgi"
+ HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
+ http_proxy: ""
+ https_proxy: ""
+ no_proxy: ""
+ LOGFLAG: ""
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: translation-ui-config
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+data:
+ BASE_URL: "/v1/translation"
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+data:
+ default.conf: |+
+ # Copyright (C) 2024 Intel Corporation
+ # SPDX-License-Identifier: Apache-2.0
+
+
+ server {
+ listen 80;
+ listen [::]:80;
+
+ location /home {
+ alias /usr/share/nginx/html/index.html;
+ }
+
+ location / {
+ proxy_pass http://translation-ui:5173;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+ proxy_set_header X-Forwarded-Proto $scheme;
+ }
+
+ location /v1/translation {
+        proxy_pass http://translation:8888;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+ proxy_set_header X-Forwarded-Proto $scheme;
+ }
+ }
+
+kind: ConfigMap
+metadata:
+ name: translation-nginx-config
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-ui
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 5173
+ targetPort: ui
+ protocol: TCP
+ name: ui
+ selector:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-llm-uservice
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 9000
+ targetPort: 9000
+ protocol: TCP
+ name: llm-uservice
+ selector:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-tgi
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "2.1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 80
+ targetPort: 2080
+ protocol: TCP
+ name: tgi
+ selector:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation-nginx
+spec:
+ ports:
+ - port: 80
+ protocol: TCP
+ targetPort: 80
+ selector:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation-nginx
+ type: NodePort
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: translation
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ type: ClusterIP
+ ports:
+ - port: 8888
+ targetPort: 8888
+ protocol: TCP
+ name: translation
+ selector:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-ui
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: translation-ui
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+ spec:
+ securityContext:
+ {}
+ containers:
+ - name: translation-ui
+ envFrom:
+ - configMapRef:
+ name: translation-ui-config
+ securityContext:
+ {}
+ image: "opea/translation-ui:latest"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: ui
+ containerPort: 80
+ protocol: TCP
+ resources:
+ {}
+ volumeMounts:
+ - mountPath: /tmp
+ name: tmp
+ volumes:
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-llm-uservice
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: translation
+ spec:
+ securityContext:
+ {}
+ containers:
+ - name: translation
+ envFrom:
+ - configMapRef:
+ name: translation-llm-uservice-config
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ readOnlyRootFilesystem: false
+ runAsNonRoot: true
+ runAsUser: 1000
+ seccompProfile:
+ type: RuntimeDefault
+ image: "opea/llm-tgi:latest"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: llm-uservice
+ containerPort: 9000
+ protocol: TCP
+ volumeMounts:
+ - mountPath: /tmp
+ name: tmp
+ livenessProbe:
+ failureThreshold: 24
+ httpGet:
+ path: v1/health_check
+ port: llm-uservice
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ readinessProbe:
+ httpGet:
+ path: v1/health_check
+ port: llm-uservice
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ startupProbe:
+ failureThreshold: 120
+ httpGet:
+ path: v1/health_check
+ port: llm-uservice
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ resources:
+ {}
+ volumes:
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-tgi
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "2.1.0"
+spec:
+  # use explicit replica counts only if HorizontalPodAutoscaler is disabled
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: translation
+ spec:
+ securityContext:
+ {}
+ containers:
+ - name: tgi
+ envFrom:
+ - configMapRef:
+ name: translation-tgi-config
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ readOnlyRootFilesystem: true
+ runAsNonRoot: true
+ runAsUser: 1000
+ seccompProfile:
+ type: RuntimeDefault
+ image: "ghcr.io/huggingface/tgi-gaudi:2.0.1"
+ imagePullPolicy: IfNotPresent
+ volumeMounts:
+ - mountPath: /data
+ name: model-volume
+ - mountPath: /tmp
+ name: tmp
+ ports:
+ - name: http
+ containerPort: 2080
+ protocol: TCP
+ livenessProbe:
+ failureThreshold: 24
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ tcpSocket:
+ port: http
+ readinessProbe:
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ tcpSocket:
+ port: http
+ startupProbe:
+ failureThreshold: 120
+ initialDelaySeconds: 20
+ periodSeconds: 5
+ tcpSocket:
+ port: http
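+          # each TGI replica claims one Gaudi accelerator; adjust this limit and the
+          # replica count above to match the devices available on the node.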
+ resources:
+ limits:
+ habana.ai/gaudi: 1
+ volumes:
+ - name: model-volume
+ emptyDir: {}
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+ app: translation
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation
+ spec:
+ securityContext:
+        {}
+ containers:
+ - name: translation
+ env:
+ - name: LLM_SERVICE_HOST_IP
+ value: translation-llm-uservice
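+            # the megaservice reaches the LLM microservice through its in-cluster
+            # Service name rather than a hard-coded IP address.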
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ readOnlyRootFilesystem: true
+ runAsNonRoot: true
+ runAsUser: 1000
+ seccompProfile:
+ type: RuntimeDefault
+ image: "opea/translation:latest"
+ imagePullPolicy: IfNotPresent
+ volumeMounts:
+ - mountPath: /tmp
+ name: tmp
+ ports:
+ - name: translation
+ containerPort: 8888
+ protocol: TCP
+ resources:
+            {}
+ volumes:
+ - name: tmp
+ emptyDir: {}
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: translation-nginx
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app.kubernetes.io/version: "v1.0"
+ app: translation-nginx
+spec:
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation-nginx
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: translation
+ app.kubernetes.io/instance: translation
+ app: translation-nginx
+ spec:
+ containers:
+ - image: nginx:1.27.1
+ imagePullPolicy: IfNotPresent
+ name: nginx
+ volumeMounts:
+ - mountPath: /etc/nginx/conf.d
+ name: nginx-config-volume
+ securityContext: {}
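+      # nginx picks up its reverse-proxy rules from the translation-nginx-config
+      # ConfigMap mounted at /etc/nginx/conf.d.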
+ volumes:
+ - configMap:
+ defaultMode: 420
+ name: translation-nginx-config
+ name: nginx-config-volume
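+# Deployment sketch (namespace name is illustrative):
+#   kubectl apply -f translation.yaml -n opea-translation
+# then reach the pipeline through the nginx front end, or port-forward the megaservice:
+#   kubectl port-forward svc/translation 8888:8888 -n opea-translation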
diff --git a/Translation/tests/test_compose_on_gaudi.sh b/Translation/tests/test_compose_on_gaudi.sh
index f66af96cb..558ec9e28 100644
--- a/Translation/tests/test_compose_on_gaudi.sh
+++ b/Translation/tests/test_compose_on_gaudi.sh
@@ -19,7 +19,7 @@ function build_docker_images() {
git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
- service_list="translation translation-ui llm-tgi"
+ service_list="translation translation-ui llm-tgi nginx"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
@@ -35,6 +35,12 @@ function start_services() {
export MEGA_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:8888/v1/translation"
+ export NGINX_PORT=80
+ export FRONTEND_SERVICE_IP=${ip_address}
+ export FRONTEND_SERVICE_PORT=5173
+ export BACKEND_SERVICE_NAME=translation
+ export BACKEND_SERVICE_IP=${ip_address}
+ export BACKEND_SERVICE_PORT=8888
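+    # nginx listens on ${NGINX_PORT} and proxies the UI (port ${FRONTEND_SERVICE_PORT}) and the translation backend (port ${BACKEND_SERVICE_PORT})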
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env
@@ -80,8 +86,6 @@ function validate_services() {
sleep 1s
}
-
-
function validate_microservices() {
# Check if the microservices are running correctly.
@@ -110,6 +114,14 @@ function validate_megaservice() {
"mega-translation" \
"translation-gaudi-backend-server" \
'{"language_from": "Chinese","language_to": "English","source_language": "ζη±ζΊε¨ηΏ»θ―γ"}'
+
+    # test the megaservice via nginx
+ validate_services \
+ "${ip_address}:80/v1/translation" \
+ "translation" \
+ "mega-translation-nginx" \
+ "translation-gaudi-nginx-server" \
+        '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
}
function validate_frontend() {
diff --git a/Translation/tests/test_compose_on_xeon.sh b/Translation/tests/test_compose_on_xeon.sh
index a648ba832..2d0c5306d 100644
--- a/Translation/tests/test_compose_on_xeon.sh
+++ b/Translation/tests/test_compose_on_xeon.sh
@@ -19,10 +19,10 @@ function build_docker_images() {
git clone https://github.com/opea-project/GenAIComps.git && cd GenAIComps && git checkout "${opea_branch:-"main"}" && cd ../
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
- service_list="translation translation-ui llm-tgi"
+ service_list="translation translation-ui llm-tgi nginx"
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
- docker pull ghcr.io/huggingface/text-generation-inference:1.4
+ docker pull ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu
docker images && sleep 1s
}
@@ -35,6 +35,12 @@ function start_services() {
export MEGA_SERVICE_HOST_IP=${ip_address}
export LLM_SERVICE_HOST_IP=${ip_address}
export BACKEND_SERVICE_ENDPOINT="http://${ip_address}:8888/v1/translation"
+ export NGINX_PORT=80
+ export FRONTEND_SERVICE_IP=${ip_address}
+ export FRONTEND_SERVICE_PORT=5173
+ export BACKEND_SERVICE_NAME=translation
+ export BACKEND_SERVICE_IP=${ip_address}
+ export BACKEND_SERVICE_PORT=8888
sed -i "s/backend_address/$ip_address/g" $WORKPATH/ui/svelte/.env
@@ -42,7 +48,8 @@ function start_services() {
docker compose up -d > ${LOG_PATH}/start_services_with_compose.log
n=0
- until [[ "$n" -ge 100 ]]; do
+    # wait longer for the LLM model to download
+ until [[ "$n" -ge 500 ]]; do
docker logs tgi-service > ${LOG_PATH}/tgi_service_start.log
if grep -q Connected ${LOG_PATH}/tgi_service_start.log; then
break
@@ -108,6 +115,14 @@ function validate_megaservice() {
"mega-translation" \
"translation-xeon-backend-server" \
'{"language_from": "Chinese","language_to": "English","source_language": "ζη±ζΊε¨ηΏ»θ―γ"}'
+
+    # test the megaservice via nginx
+ validate_services \
+ "${ip_address}:80/v1/translation" \
+ "translation" \
+ "mega-translation-nginx" \
+ "translation-xeon-nginx-server" \
+        '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}'
}
function validate_frontend() {
diff --git a/Translation/tests/test_manifest_on_gaudi.sh b/Translation/tests/test_manifest_on_gaudi.sh
new file mode 100755
index 000000000..6e4edbeb4
--- /dev/null
+++ b/Translation/tests/test_manifest_on_gaudi.sh
@@ -0,0 +1,91 @@
+#!/bin/bash
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+set -xe
+USER_ID=$(whoami)
+LOG_PATH=/home/$(whoami)/logs
+MOUNT_DIR=/home/$USER_ID/.cache/huggingface/hub
+IMAGE_REPO=${IMAGE_REPO:-}
+IMAGE_TAG=${IMAGE_TAG:-latest}
+
+function init_translation() {
+    # executed under Translation/kubernetes/intel/hpu/gaudi/manifest
+    # replace the mount dir "path: /mnt/opea-models" with "path: $MOUNT_DIR"
+ find . -name '*.yaml' -type f -exec sed -i "s#path: /mnt/opea-models#path: $MOUNT_DIR#g" {} \;
+    if [ "$CONTEXT" == "CI" ]; then
+ # replace megaservice image tag
+ find . -name '*.yaml' -type f -exec sed -i "s#image: \"opea/translation:latest#image: \"opea/translation:${IMAGE_TAG}#g" {} \;
+ else
+ # replace microservice image tag
+ find . -name '*.yaml' -type f -exec sed -i "s#image: \"opea/\(.*\):latest#image: \"opea/\1:${IMAGE_TAG}#g" {} \;
+ fi
+ # replace the repository "image: opea/*" with "image: $IMAGE_REPO/opea/"
+ find . -name '*.yaml' -type f -exec sed -i "s#image: \"opea/*#image: \"${IMAGE_REPO}opea/#g" {} \;
+ # set huggingface token
+ find . -name '*.yaml' -type f -exec sed -i "s#insert-your-huggingface-token-here#$(cat /home/$USER_ID/.cache/huggingface/token)#g" {} \;
+}
+
+function install_translation {
+ echo "namespace is $NAMESPACE"
+ kubectl apply -f translation.yaml -n $NAMESPACE
+ sleep 50s
+}
+
+function validate_translation() {
+ ip_address=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.clusterIP}')
+ port=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.ports[0].port}')
+ echo "try to curl http://${ip_address}:${port}/v1/translation..."
+
+    # use a namespace-specific logfile name to avoid conflicts among multiple runners
+ LOGFILE=$LOG_PATH/curlmega_$NAMESPACE.log
+ # Curl the Mega Service
+ curl http://${ip_address}:${port}/v1/translation \
+ -H 'Content-Type: application/json' \
+    -d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}' > $LOGFILE
+ exit_code=$?
+ if [ $exit_code -ne 0 ]; then
+ echo "Megaservice translation failed, please check the logs in $LOGFILE!"
+ exit 1
+ fi
+
+    echo "Checking response results; make sure the output is reasonable."
+ local status=false
+ if [[ -f $LOGFILE ]] && \
+ [[ $(grep -c "translation" $LOGFILE) != 0 ]]; then
+ status=true
+ fi
+
+ if [ $status == false ]; then
+ echo "Response check failed, please check the logs in artifacts!"
+ else
+        echo "Response check succeeded!"
+ fi
+}
+
+if [ $# -eq 0 ]; then
+    echo "Usage: $0 <init_Translation|install_Translation|validate_Translation> [namespace]"
+ exit 1
+fi
+
+case "$1" in
+ init_Translation)
+ pushd Translation/kubernetes/intel/hpu/gaudi/manifest
+ init_translation
+ popd
+ ;;
+ install_Translation)
+ pushd Translation/kubernetes/intel/hpu/gaudi/manifest
+ NAMESPACE=$2
+ install_translation
+ popd
+ ;;
+ validate_Translation)
+ NAMESPACE=$2
+ SERVICE_NAME=translation
+ validate_translation
+ ;;
+ *)
+ echo "Unknown function: $1"
+ ;;
+esac
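+
+# Example invocation from the repository root (namespace and CONTEXT values are illustrative):
+#   CONTEXT=CI ./Translation/tests/test_manifest_on_gaudi.sh init_Translation
+#   ./Translation/tests/test_manifest_on_gaudi.sh install_Translation opea-translation
+#   ./Translation/tests/test_manifest_on_gaudi.sh validate_Translation opea-translation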
diff --git a/Translation/tests/test_manifest_on_xeon.sh b/Translation/tests/test_manifest_on_xeon.sh
new file mode 100755
index 000000000..34f04f5ab
--- /dev/null
+++ b/Translation/tests/test_manifest_on_xeon.sh
@@ -0,0 +1,90 @@
+#!/bin/bash
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+set -xe
+USER_ID=$(whoami)
+LOG_PATH=/home/$(whoami)/logs
+MOUNT_DIR=/home/$USER_ID/.cache/huggingface/hub
+IMAGE_REPO=${IMAGE_REPO:-}
+IMAGE_TAG=${IMAGE_TAG:-latest}
+
+function init_translation() {
+    # executed under Translation/kubernetes/intel/cpu/xeon/manifest
+    # replace the mount dir "path: /mnt/opea-models" with "path: $MOUNT_DIR"
+ find . -name '*.yaml' -type f -exec sed -i "s#path: /mnt/opea-models#path: $MOUNT_DIR#g" {} \;
+    if [ "$CONTEXT" == "CI" ]; then
+ # replace megaservice image tag
+ find . -name '*.yaml' -type f -exec sed -i "s#image: \"opea/translation:latest#image: \"opea/translation:${IMAGE_TAG}#g" {} \;
+ else
+ # replace microservice image tag
+ find . -name '*.yaml' -type f -exec sed -i "s#image: \"opea/\(.*\):latest#image: \"opea/\1:${IMAGE_TAG}#g" {} \;
+ fi
+ # replace the repository "image: opea/*" with "image: $IMAGE_REPO/opea/"
+ find . -name '*.yaml' -type f -exec sed -i "s#image: \"opea/*#image: \"${IMAGE_REPO}opea/#g" {} \;
+ # set huggingface token
+ find . -name '*.yaml' -type f -exec sed -i "s#insert-your-huggingface-token-here#$(cat /home/$USER_ID/.cache/huggingface/token)#g" {} \;
+}
+
+function install_translation {
+ echo "namespace is $NAMESPACE"
+ kubectl apply -f translation.yaml -n $NAMESPACE
+}
+
+function validate_translation() {
+ ip_address=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.clusterIP}')
+ port=$(kubectl get svc $SERVICE_NAME -n $NAMESPACE -o jsonpath='{.spec.ports[0].port}')
+ echo "try to curl http://${ip_address}:${port}/v1/translation..."
+
+    # use a namespace-specific logfile name to avoid conflicts among multiple runners
+ LOGFILE=$LOG_PATH/curlmega_$NAMESPACE.log
+ # Curl the Mega Service
+ curl http://${ip_address}:${port}/v1/translation \
+ -H 'Content-Type: application/json' \
+    -d '{"language_from": "Chinese","language_to": "English","source_language": "我爱机器翻译。"}' > $LOGFILE
+ exit_code=$?
+ if [ $exit_code -ne 0 ]; then
+ echo "Megaservice translation failed, please check the logs in $LOGFILE!"
+ exit 1
+ fi
+
+    echo "Checking response results; make sure the output is reasonable."
+ local status=false
+ if [[ -f $LOGFILE ]] && \
+ [[ $(grep -c "translation" $LOGFILE) != 0 ]]; then
+ status=true
+ fi
+
+ if [ $status == false ]; then
+ echo "Response check failed, please check the logs in artifacts!"
+ else
+        echo "Response check succeeded!"
+ fi
+}
+
+if [ $# -eq 0 ]; then
+    echo "Usage: $0 <init_Translation|install_Translation|validate_Translation> [namespace]"
+ exit 1
+fi
+
+case "$1" in
+ init_Translation)
+ pushd Translation/kubernetes/intel/cpu/xeon/manifest
+ init_translation
+ popd
+ ;;
+ install_Translation)
+ pushd Translation/kubernetes/intel/cpu/xeon/manifest
+ NAMESPACE=$2
+ install_translation
+ popd
+ ;;
+ validate_Translation)
+ NAMESPACE=$2
+ SERVICE_NAME=translation
+ validate_translation
+ ;;
+ *)
+ echo "Unknown function: $1"
+ ;;
+esac
diff --git a/VisualQnA/README.md b/VisualQnA/README.md
index 3fe738754..99cdb26a2 100644
--- a/VisualQnA/README.md
+++ b/VisualQnA/README.md
@@ -13,7 +13,7 @@ General architecture of VQA shows below:
![VQA](./assets/img/vqa.png)
-This example guides you through how to deploy a [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT) (Open Large Multimodal Models) model on Intel Gaudi2 to do visual question and answering task. The Intel Gaudi2 accelerator supports both training and inference for deep learning models in particular for LLMs. Please visit [Habana AI products](https://habana.ai/products/) for more details.
+This example shows how to deploy a [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT) (Open Large Multimodal Models) model on [Intel Gaudi2](https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi-overview.html) and [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon.html) for visual question answering. We invite contributions from other hardware vendors to expand the OPEA ecosystem.
![llava screenshot](./assets/img/llava_screenshot1.png)
![llava-screenshot](./assets/img/llava_screenshot2.png)
diff --git a/supported_examples.md b/supported_examples.md
index fe2965bdf..42a0a60e2 100644
--- a/supported_examples.md
+++ b/supported_examples.md
@@ -6,13 +6,58 @@ This document introduces the supported examples of GenAIExamples. The supported
[ChatQnA](./ChatQnA/README.md) is an example of chatbot for question and answering through retrieval augmented generation (RAG).
-| Framework | LLM | Embedding | Vector Database | Serving | HW | Description |
-| ------------------------------------------------------------------------------ | ----------------------------------------------------------------- | --------------------------------------------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------- | --------------- | ----------- |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [NeuralChat-7B](https://huggingface.co/Intel/neural-chat-7b-v3-3) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Redis](https://redis.io/) | [TGI](https://github.com/huggingface/text-generation-inference) [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2/GPU | Chatbot |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [NeuralChat-7B](https://huggingface.co/Intel/neural-chat-7b-v3-3) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Chroma](https://www.trychroma.com/) | [TGI](https://github.com/huggingface/text-generation-inference) [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Chatbot |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Redis](https://redis.io/) | [TGI](https://github.com/huggingface/text-generation-inference) [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Chatbot |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Qdrant](https://qdrant.tech/) | [TGI](https://github.com/huggingface/text-generation-inference) [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Chatbot |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Redis](https://redis.io/) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Chatbot |
+
### CodeGen
@@ -101,7 +146,7 @@ The DocRetriever example demonstrates how to match user queries with free-text r
| Framework | Embedding | Vector Database | Serving | HW | Description |
| ------------------------------------------------------------------------------ | --------------------------------------------------- | -------------------------- | --------------------------------------------------------------- | ----------- | -------------------------- |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Redis](https://redis.io/) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Document Retrieval Service |
+| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [BGE-Base](https://huggingface.co/BAAI/bge-base-en) | [Redis](https://redis.io/) | [TEI](https://github.com/huggingface/text-embeddings-inference) | Xeon/Gaudi2 | Document Retrieval service |
### AgentQnA
@@ -110,3 +155,39 @@ The AgentQnA example demonstrates a hierarchical, multi-agent system designed fo
Worker agent uses open-source websearch tool (duckduckgo), agents use OpenAI GPT-4o-mini as llm backend.
> **_NOTE:_** This example is in active development. The code structure of these use cases is subject to change.
+
+### AudioQnA
+
+The AudioQnA example integrates Generative AI (GenAI) models to answer questions about audio input and return spoken responses: Automatic Speech Recognition (ASR) converts the audio to text, a language model generates an answer to the transcribed query, and Text-to-Speech (TTS) converts that answer back into speech.
+
+
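+A minimal request sketch is shown below; the port, path, and payload shape are assumptions based on the default AudioQnA compose setup, so check the AudioQnA README for the exact schema:
+
+```bash
+# hypothetical example: send base64-encoded audio to the AudioQnA megaservice
+curl http://${host_ip}:3008/v1/audioqna \
+  -X POST \
+  -H 'Content-Type: application/json' \
+  -d '{"audio": "<base64-encoded WAV>", "max_tokens": 64}'
+```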
+
+### FaqGen
+
+The FAQ Generation Application leverages large language models (LLMs) to change the way you interact with and comprehend complex textual data. Using natural language processing techniques, it automatically generates comprehensive, natural-sounding frequently asked questions (FAQs) from documents, legal texts, customer queries, and other sources. In this example use case, we use LangChain to implement FAQ generation and serve LLM inference with Text Generation Inference on Intel Xeon and Gaudi2 processors.
+
+| Framework | LLM | Serving | HW | Description |
+| ------------------------------------------------------------------------------ | ----------------------------------------------------------------- | --------------------------------------------------------------- | ----------- | ----------- |
+| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | [TGI](https://github.com/huggingface/text-generation-inference) | Xeon/Gaudi2 | Chatbot |
+
+### MultimodalQnA
+
+[MultimodalQnA](./MultimodalQnA/README.md) addresses your questions by dynamically fetching the most pertinent multimodal information (frames, transcripts, and/or captions) from your collection of videos.
+
+### ProductivitySuite
+
+[Productivity Suite](./ProductivitySuite/README.md) streamlines your workflow to boost productivity. It leverages the OPEA microservices to provide a comprehensive suite of features to cater to the diverse needs of modern enterprises.