Pinecone update to Readme and docker compose for ChatQnA #540

Merged: 61 commits, Nov 13, 2024
Changes from 51 commits
c20b821
Added the pinecone support documentation and docker compose file
pallavijaini0525 Aug 6, 2024
4c60074
Updated the readme for pinecone
pallavijaini0525 Aug 7, 2024
6b8cbe8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 7, 2024
afcb18c
Added port 6009 to pinecone
pallavijaini0525 Aug 8, 2024
66ea870
Added the tests and renamed the files
pallavijaini0525 Aug 13, 2024
1e5e933
Updated the pinecone tests as per the xeon
Sep 11, 2024
098a4ce
Updated the Dockerfile pathf or rerank
Sep 11, 2024
c87911c
Updated the file name for the tests
Sep 13, 2024
2698d7c
Corrected the docker compose for pinecone
Sep 13, 2024
63ed279
Corrected the docker compose for pinecone
Sep 13, 2024
5723ccf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 13, 2024
1622246
Corrected the test script issue
Sep 14, 2024
5730581
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 14, 2024
4753a8d
Fixed the issues with tests
Sep 17, 2024
7ba78b1
Fixed the links
Sep 17, 2024
e37b384
Corrected the tag
Sep 18, 2024
48c52c2
Corrected the ports
Sep 18, 2024
a44f173
Commented out dataprep as not able to connect to pineconeserver
pallavijaini0525 Sep 24, 2024
7091270
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 24, 2024
f07f0d7
Added more comments
pallavijaini0525 Sep 25, 2024
8c6326f
Validated in IDC and enabled all the tests in pinecone
Oct 15, 2024
93e3780
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 15, 2024
9c30df4
Added the pinecone support documentation and docker compose file
pallavijaini0525 Aug 6, 2024
22e918f
Updated the readme for pinecone
pallavijaini0525 Aug 7, 2024
20adebe
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 7, 2024
6bae689
Added port 6009 to pinecone
pallavijaini0525 Aug 8, 2024
aa21352
Added the tests and renamed the files
pallavijaini0525 Aug 13, 2024
d0b1ba3
Updated the pinecone tests as per the xeon
Sep 11, 2024
1345ea8
Updated the Dockerfile pathf or rerank
Sep 11, 2024
4133cfa
Updated the file name for the tests
Sep 13, 2024
1a3247c
Corrected the docker compose for pinecone
Sep 13, 2024
49b8708
Corrected the docker compose for pinecone
Sep 13, 2024
75135b0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 13, 2024
d5da891
Corrected the test script issue
Sep 14, 2024
468957d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 14, 2024
17fd96b
Fixed the issues with tests
Sep 17, 2024
f32236f
Fixed the links
Sep 17, 2024
429be00
Corrected the tag
Sep 18, 2024
1ab8626
Corrected the ports
Sep 18, 2024
05a6b65
Commented out dataprep as not able to connect to pineconeserver
pallavijaini0525 Sep 24, 2024
70b7159
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 24, 2024
7628f33
Added more comments
pallavijaini0525 Sep 25, 2024
98d8eb4
Validated in IDC and enabled all the tests in pinecone
Oct 15, 2024
19f95bf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 15, 2024
1a59f91
Merge remote-tracking branch 'origin/pinecone_update' into pinecone_u…
Oct 16, 2024
6aa1720
Implemented all review comments
Oct 16, 2024
822d109
Updated the pinecone key
Oct 16, 2024
a83b36f
Merge branch 'main' into pinecone_update
chensuyue Oct 16, 2024
9b14a97
Merge branch 'main' into pinecone_update
chensuyue Oct 17, 2024
15362dc
Made changes to reflect the new framework
pallavijaini0525 Oct 18, 2024
0b559c6
Fixed the typos
pallavijaini0525 Oct 18, 2024
a722ba1
added more logs
pallavijaini0525 Oct 29, 2024
700671c
Updated the log path
pallavijaini0525 Oct 29, 2024
810a70b
Updated the log path
pallavijaini0525 Oct 29, 2024
9b5a7e6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 29, 2024
f44b9ed
Merge branch 'main' into pinecone_update
pallavijaini0525 Nov 12, 2024
53e5013
Updated the llm image
pallavijaini0525 Nov 12, 2024
2dbdffa
Updated the llm image
pallavijaini0525 Nov 12, 2024
2dd0339
uncommented building the images
pallavijaini0525 Nov 12, 2024
de13ffe
uncommented building the images
pallavijaini0525 Nov 12, 2024
d7b040a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 12, 2024
382 changes: 382 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/xeon/README_pinecone.md
@@ -0,0 +1,382 @@
# Build Mega Service of ChatQnA (with Pinecone) on Xeon

This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on an Intel Xeon server. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `llm`. We will publish the Docker images to Docker Hub soon, which will simplify the deployment process for this service.

## 🚀 Provision a Xeon Server on AWS

To provision a Xeon server on AWS, start by creating an AWS account if you don't have one already. Then, head to the [EC2 Console](https://console.aws.amazon.com/ec2/v2/home) to begin the process. Within the EC2 service, select the Amazon EC2 M7i or M7i-flex instance type to leverage the power of 4th Generation Intel Xeon Scalable processors. These instances are optimized for high-performance computing and demanding workloads.

For detailed information about these instance types, you can refer to this [link](https://aws.amazon.com/ec2/instance-types/m7i/). Once you've chosen the appropriate instance type, proceed with configuring your instance settings, including network configurations, security groups, and storage options.

After launching your instance, you can connect to it using SSH (for Linux instances) or Remote Desktop Protocol (RDP) (for Windows instances). From there, you'll have full access to your Xeon server, allowing you to install, configure, and manage your applications as needed.

**Certain ports on the EC2 instance need to be opened in the security group for the microservices to respond to the curl commands**

> See one example below. Please open up these ports in the EC2 instance based on the IP addresses you want to allow

```

data_prep_service
=====================
Port 6007 - Open to 0.0.0.0/0
Port 6008 - Open to 0.0.0.0/0

tei_embedding_service
=====================
Port 6006 - Open to 0.0.0.0/0

embedding
=========
Port 6000 - Open to 0.0.0.0/0

retriever
=========
Port 7000 - Open to 0.0.0.0/0

tei_xeon_service
================
Port 8808 - Open to 0.0.0.0/0

reranking
=========
Port 8000 - Open to 0.0.0.0/0

tgi-service
===========
Port 9009 - Open to 0.0.0.0/0

llm
===
Port 9000 - Open to 0.0.0.0/0

chaqna-xeon-backend-server
==========================
Port 8888 - Open to 0.0.0.0/0

chaqna-xeon-ui-server
=====================
Port 5173 - Open to 0.0.0.0/0
```
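
Once the containers are running, the port list above can be checked from a machine outside the instance. The following is a minimal sketch, not part of the project: the helper names `port_is_open` and `report` are hypothetical, and the port numbers are copied from the list above. A port only reports as reachable when the security group allows it and a container is listening on it.

```python
import socket

# Ports copied from the security-group list above; labels are kept verbatim.
SERVICE_PORTS = {
    "data_prep_service": [6007, 6008],
    "tei_embedding_service": [6006],
    "embedding": [6000],
    "retriever": [7000],
    "tei_xeon_service": [8808],
    "reranking": [8000],
    "tgi-service": [9009],
    "llm": [9000],
    "chaqna-xeon-backend-server": [8888],
    "chaqna-xeon-ui-server": [5173],
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def report(host: str) -> dict:
    """Map each "service:port" label to True (reachable) or False."""
    return {
        f"{name}:{port}": port_is_open(host, port)
        for name, ports in SERVICE_PORTS.items()
        for port in ports
    }
```

For example, `report("External_Public_IP")` (with the real address substituted) returns a dictionary that can be printed to spot ports that are still blocked.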

## 🚀 Build Docker Images

First, you need to build the Docker images locally and install the corresponding Python package.

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 1. Build Embedding Image

```bash
docker build --no-cache -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/tei/langchain/Dockerfile .
```

### 2. Build Retriever Image

```bash
docker build --no-cache -t opea/retriever-pinecone:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/pinecone/langchain/Dockerfile .
```

### 3. Build Rerank Image

```bash
docker build --no-cache -t opea/reranking-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/tei/Dockerfile .
```

### 4. Build LLM Image

```bash
docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```

### 5. Build Dataprep Image

```bash
docker build --no-cache -t opea/dataprep-pinecone:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/pinecone/langchain/Dockerfile .
cd ..
```
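
The five GenAIComps builds above differ only in the image tag and the Dockerfile path, so they can be driven from a small table. This is a sketch under assumptions: `build_command` and `build_all` are hypothetical helper names, and the tags and paths are copied from steps 1 to 5. It only assembles the same `docker build` arguments shown above; the caller decides how to execute them.

```python
import os

# (image tag, Dockerfile path) pairs copied from steps 1-5 above.
GENAICOMPS_IMAGES = [
    ("opea/embedding-tei:latest", "comps/embeddings/tei/langchain/Dockerfile"),
    ("opea/retriever-pinecone:latest", "comps/retrievers/pinecone/langchain/Dockerfile"),
    ("opea/reranking-tei:latest", "comps/reranks/tei/Dockerfile"),
    ("opea/llm-tgi:latest", "comps/llms/text-generation/tgi/Dockerfile"),
    ("opea/dataprep-pinecone:latest", "comps/dataprep/pinecone/langchain/Dockerfile"),
]


def build_command(tag, dockerfile, env=None):
    """Return the docker build argument list for one image."""
    env = os.environ if env is None else env
    return [
        "docker", "build", "--no-cache", "-t", tag,
        "--build-arg", f"https_proxy={env.get('https_proxy', '')}",
        "--build-arg", f"http_proxy={env.get('http_proxy', '')}",
        "-f", dockerfile, ".",
    ]


def build_all(run):
    """Call run(cmd) once per image; `run` is supplied by the caller."""
    for tag, dockerfile in GENAICOMPS_IMAGES:
        run(build_command(tag, dockerfile))
```

Run from inside the `GenAIComps` directory, one possible invocation is `build_all(lambda cmd: subprocess.run(cmd, check=True))` after `import subprocess`, which stops at the first failing build.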

### 6. Build MegaService Docker Image

To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image with the command below:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA/docker
docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
cd ../../..
```

### 7. Build UI Docker Image

Build the frontend Docker image with the command below:

```bash
cd GenAIExamples/ChatQnA/docker/ui/
docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
cd ../../../..
```

### 8. Build Conversational React UI Docker Image (Optional)

Build the frontend Docker image that enables a conversational experience with the ChatQnA megaservice with the command below:

**Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

```bash
cd GenAIExamples/ChatQnA/docker/ui/
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy --build-arg BACKEND_SERVICE_ENDPOINT=$BACKEND_SERVICE_ENDPOINT --build-arg DATAPREP_SERVICE_ENDPOINT=$DATAPREP_SERVICE_ENDPOINT --build-arg DATAPREP_GET_FILE_ENDPOINT=$DATAPREP_GET_FILE_ENDPOINT -f ./docker/Dockerfile.react .
cd ../../../..
```

Then run the command `docker images`. You should see the following 7 Docker images:

1. `opea/dataprep-pinecone:latest`
2. `opea/embedding-tei:latest`
3. `opea/retriever-pinecone:latest`
4. `opea/reranking-tei:latest`
5. `opea/llm-tgi:latest`
6. `opea/chatqna:latest`
7. `opea/chatqna-ui:latest`
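
The check against the list above can be automated. The sketch below is illustrative only: `missing_images` and `list_local_images` are hypothetical helper names, and the expected tags are copied from the list. It relies on `docker images --format "{{.Repository}}:{{.Tag}}"` to print one `repository:tag` per line.

```python
import subprocess

# Image tags copied from the list above.
EXPECTED_IMAGES = [
    "opea/dataprep-pinecone:latest",
    "opea/embedding-tei:latest",
    "opea/retriever-pinecone:latest",
    "opea/reranking-tei:latest",
    "opea/llm-tgi:latest",
    "opea/chatqna:latest",
    "opea/chatqna-ui:latest",
]


def missing_images(listing: str) -> list:
    """Return the expected tags that do not appear in a repository:tag listing."""
    present = {line.strip() for line in listing.splitlines() if line.strip()}
    return [image for image in EXPECTED_IMAGES if image not in present]


def list_local_images() -> str:
    """Return the local images as one repository:tag per line."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `missing_images(list_local_images())` returns an empty list when all seven images have been built.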

## 🚀 Start Microservices

### Setup Environment Variables

Since `compose_pinecone.yaml` consumes several environment variables, you need to set them up in advance as shown below.

**Export the value of the public IP address of your Xeon server to the `host_ip` environment variable**

> Replace External_Public_IP below with the actual IPv4 value

```
export host_ip="External_Public_IP"
```

**Export the value of your Huggingface API token to the `your_hf_api_token` environment variable**

> Replace Your_Huggingface_API_Token below with your actual Huggingface API token value

```
export your_hf_api_token="Your_Huggingface_API_Token"
```

**Append the value of the public IP address to the no_proxy list**

```
export your_no_proxy=${your_no_proxy},"External_Public_IP"
```

**Get the PINECONE_API_KEY and the INDEX_NAME**

```
export pinecone_api_key=${api_key}
export pinecone_index_name=${pinecone_index}
```

```bash
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export TGI_LLM_ENDPOINT="http://${host_ip}:9009"
export PINECONE_API_KEY=${pinecone_api_key}
export PINECONE_INDEX_NAME=${pinecone_index_name}
export INDEX_NAME=${pinecone_index_name}
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
export EMBEDDING_SERVICE_HOST_IP=${host_ip}
export RETRIEVER_SERVICE_HOST_IP=${host_ip}
export RERANK_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
export BACKEND_SERVICE_ENDPOINT="http://${host_ip}:8888/v1/chatqna"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:6007/v1/dataprep"
export DATAPREP_GET_FILE_ENDPOINT="http://${host_ip}:6008/v1/dataprep/get_file"
export DATAPREP_DELETE_FILE_ENDPOINT="http://${host_ip}:6009/v1/dataprep/delete_file"
```

Note: Replace `host_ip` with your external IP address; do not use localhost.
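
Before starting the containers, it can help to confirm that the exported variables are set and no placeholder was left in place. The following is a minimal sketch, assuming it is run from the same shell session that exported the variables. `check_env` is a hypothetical helper, and `REQUIRED_VARS` is a subset of the names exported above chosen for illustration, not an authoritative list of what `compose_pinecone.yaml` reads.

```python
import os

# A subset of the variables exported above; extend as needed.
REQUIRED_VARS = [
    "host_ip",
    "EMBEDDING_MODEL_ID",
    "RERANK_MODEL_ID",
    "LLM_MODEL_ID",
    "TEI_EMBEDDING_ENDPOINT",
    "TEI_RERANKING_ENDPOINT",
    "TGI_LLM_ENDPOINT",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
    "INDEX_NAME",
    "HUGGINGFACEHUB_API_TOKEN",
    "MEGA_SERVICE_HOST_IP",
]
# Placeholder strings used in the export examples above.
PLACEHOLDERS = {"External_Public_IP", "Your_Huggingface_API_Token"}
LOOPBACK = {"localhost", "127.0.0.1"}


def check_env(env=None):
    """Return a list of human-readable problems; empty means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value in PLACEHOLDERS:
            problems.append(f"{name} still holds the placeholder {value}")
    if env.get("host_ip", "") in LOOPBACK:
        problems.append("host_ip points at localhost; use the external IP address")
    return problems
```

Printing the result of `check_env()` lists anything that still needs attention before running `docker compose`.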

### Start all the services Docker Containers

> Before running the docker compose command, you need to be in the folder that has the docker compose yaml file

```bash
cd GenAIExamples/ChatQnA/docker/xeon/
docker compose -f compose_pinecone.yaml up -d
```

### Validate Microservices

1. TEI Embedding Service

```bash
curl ${host_ip}:6006/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
```

2. Embedding Microservice

```bash
curl http://${host_ip}:6000/v1/embeddings \
-X POST \
-d '{"text":"hello"}' \
-H 'Content-Type: application/json'
```

3. Retriever Microservice

To validate the retriever microservice, generate a mock embedding vector of length 768 with a Python script:

```Python
import random
embedding = [random.uniform(-1, 1) for _ in range(768)]
print(embedding)
```

Then substitute your mock embedding vector for the `${your_embedding}` in the following cURL command:

```bash
curl http://${host_ip}:7000/v1/retrieval \
-X POST \
-d '{"text":"What is the revenue of Nike in 2023?","embedding":"'"${your_embedding}"'"}' \
-H 'Content-Type: application/json'
```
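
As an alternative to pasting the vector into the shell by hand, the request body can be built in Python. This is a sketch under an assumption: it serializes the embedding as a JSON array of floats, whereas the cURL command above places `${your_embedding}` inside a quoted string. Which form the retriever accepts is not confirmed here, so treat the array form as something to verify. `mock_retrieval_payload` is a hypothetical helper name.

```python
import json
import random


def mock_retrieval_payload(text, dim=768, seed=None):
    """Return a JSON request body with a random embedding of length dim."""
    rng = random.Random(seed)  # a fixed seed makes the payload reproducible
    embedding = [rng.uniform(-1, 1) for _ in range(dim)]
    return json.dumps({"text": text, "embedding": embedding})


# Example: write the body to a file for later use with curl.
body = mock_retrieval_payload("What is the revenue of Nike in 2023?", seed=0)
with open("retrieval_payload.json", "w", encoding="utf-8") as handle:
    handle.write(body)
```

The file can then be sent with `curl -d @retrieval_payload.json` in place of the inline `-d` argument.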

4. TEI Reranking Service

```bash
curl http://${host_ip}:8808/rerank \
-X POST \
-d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
-H 'Content-Type: application/json'
```

5. Reranking Microservice

```bash
curl http://${host_ip}:8000/v1/reranking \
-X POST \
-d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \
-H 'Content-Type: application/json'
```

6. TGI Service

```bash
curl http://${host_ip}:9009/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
```

7. LLM Microservice

```bash
curl http://${host_ip}:9000/v1/chat/completions \
-X POST \
-d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
-H 'Content-Type: application/json'
```

8. MegaService

```bash
curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{
"messages": "What is the revenue of Nike in 2023?"
}'
```

9. Dataprep Microservice (Optional)

If you want to update the default knowledge base, you can use the following commands:

Update Knowledge Base via Local File Upload:

```bash
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
-H "Content-Type: multipart/form-data" \
-F "files=@./nke-10k-2023.pdf"
```

This command updates a knowledge base by uploading a local file for processing. Update the file path according to your environment.

Add Knowledge Base via HTTP Links:

```bash
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
-H "Content-Type: multipart/form-data" \
-F 'link_list=["https://opea.dev"]'
```

This command updates a knowledge base by submitting a list of HTTP links for processing.

You can also retrieve the list of files you have uploaded:

```bash
curl -X POST "http://${host_ip}:6008/v1/dataprep/get_file" \
-H "Content-Type: application/json"
```
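
Several of the JSON checks above can be collected into one table and sent from a script. The sketch below is illustrative: `validation_requests` and `post_json` are hypothetical helpers, and the URLs and payloads are copied from steps 1, 2, 4, 5, 6, and 8. The retriever, streaming LLM, and dataprep checks are left out because they need a generated vector, streamed output handling, or a multipart upload.

```python
import json
import urllib.request


def validation_requests(host_ip):
    """Return (name, url, payload) tuples mirroring the cURL checks above."""
    base = f"http://{host_ip}"
    question = "What is Deep Learning?"
    return [
        ("tei-embedding", f"{base}:6006/embed", {"inputs": question}),
        ("embedding", f"{base}:6000/v1/embeddings", {"text": "hello"}),
        ("tei-reranking", f"{base}:8808/rerank",
         {"query": question, "texts": ["Deep Learning is not...", "Deep learning is..."]}),
        ("reranking", f"{base}:8000/v1/reranking",
         {"initial_query": question,
          "retrieved_docs": [{"text": "Deep Learning is not..."}, {"text": "Deep learning is..."}]}),
        ("tgi", f"{base}:9009/generate",
         {"inputs": question, "parameters": {"max_new_tokens": 17, "do_sample": True}}),
        ("megaservice", f"{base}:8888/v1/chatqna",
         {"messages": "What is the revenue of Nike in 2023?"}),
    ]


def post_json(url, payload, timeout=60.0):
    """POST payload as JSON and return (HTTP status, response text)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

Looping over `validation_requests(host_ip)` and calling `post_json(url, payload)` for each entry gives a quick pass over the services; a failing service raises `urllib.error.URLError` or `urllib.error.HTTPError`.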

## Enable LangSmith for Monitoring Application (Optional)

LangSmith offers tools to debug, evaluate, and monitor language models and intelligent agents. It can be used to assess benchmark data for each microservice. Before launching your services with `docker compose -f compose_pinecone.yaml up -d`, you need to enable LangSmith tracing by setting the `LANGCHAIN_TRACING_V2` environment variable to true and configuring your LangChain API key.

Here's how you can do it:

1. Install the latest version of LangSmith:

```bash
pip install -U langsmith
```

2. Set the necessary environment variables:

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=ls_...
```

## 🚀 Launch the UI

To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose_pinecone.yaml` file as shown below:

```yaml
  chaqna-xeon-ui-server:
    image: opea/chatqna-ui:latest
    ...
    ports:
      - "80:5173"
```

## 🚀 Launch the Conversational UI (react)

To access the Conversational UI frontend, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose_pinecone.yaml` file as shown below:

```yaml
  chaqna-xeon-conversation-ui-server:
    image: opea/chatqna-conversation-ui:latest
    ...
    ports:
      - "80:80"
```

![project-screenshot](../../../../assets/img/chat_ui_init.png)

Here is an example of running ChatQnA:

![project-screenshot](../../../../assets/img/chat_ui_response.png)

Here is an example of running ChatQnA with Conversational UI (React):

![project-screenshot](../../../../assets/img/conversation_ui_response.png)