diff --git a/examples/ChatQnA/deploy/gaudi.md b/examples/ChatQnA/deploy/gaudi.md
index aa59a165..e0679cd9 100644
--- a/examples/ChatQnA/deploy/gaudi.md
+++ b/examples/ChatQnA/deploy/gaudi.md
@@ -47,11 +47,13 @@ To summarize, Below is the flow of contents we will be covering in this tutorial
 First step is to clone the GenAIExamples and GenAIComps. GenAIComps are
 fundamental necessary components used to build examples you find in
-GenAIExamples and deploy them as microservices.
+GenAIExamples and deploy them as microservices. Also set the `TAG`
+environment variable to the release version.

 ```
 git clone https://github.com/opea-project/GenAIComps.git
 git clone https://github.com/opea-project/GenAIExamples.git
+export TAG=1.1
 ```

 The examples utilize model weights from HuggingFace and langchain.

@@ -87,50 +89,51 @@ ChatQnA megaservice, and UI (conversational React UI is optional).
 In total, there are 8 required and an optional docker images.

 ### Build/Pull Microservice images

-::::{tab-set}
-:::{tab-item} Pull
+
+::::::{tab-set}
+
+:::::{tab-item} Pull
 :sync: Pull

-To pull pre-built docker images on Docker Hub, proceed to the next step. To customize
-your application, you can choose to build individual docker images for the microservices
-before proceeding.
-:::
-:::{tab-item} Build
+If you choose to pull the docker images rather than build them locally,
+proceed to the next step, where all the necessary images are pulled
+from Docker Hub.
+
+:::::
+:::::{tab-item} Build
 :sync: Build

 From within the `GenAIComps` folder, checkout the release tag.
 ```
 cd GenAIComps
-git checkout tags/v1.0
+git checkout tags/v${TAG}
 ```
-:::
-::::

 #### Build Dataprep Image

 ```bash
-docker build --no-cache -t opea/dataprep-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/langchain/Dockerfile .
+docker build --no-cache -t opea/dataprep-redis:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/langchain/Dockerfile . ``` #### Build Embedding Image ```bash -docker build --no-cache -t opea/embedding-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/tei/langchain/Dockerfile . +docker build --no-cache -t opea/embedding-tei:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/tei/langchain/Dockerfile . ``` #### Build Retriever Image ```bash -docker build --no-cache -t opea/retriever-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/redis/langchain/Dockerfile . +docker build --no-cache -t opea/retriever-redis:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/redis/langchain/Dockerfile . ``` #### Build Rerank Image ```bash -docker build --no-cache -t opea/reranking-tei:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/tei/Dockerfile . +docker build --no-cache -t opea/reranking-tei:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/tei/Dockerfile . ``` -#### Build docker +#### Build LLM Image ::::{tab-set} @@ -139,12 +142,12 @@ docker build --no-cache -t opea/reranking-tei:latest --build-arg https_proxy=$ht Build vLLM docker image with hpu support ``` -docker build --no-cache -t opea/llm-vllm-hpu:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/vllm/langchain/dependency/Dockerfile.intel_hpu . +docker build --no-cache -t opea/llm-vllm-hpu:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/vllm/langchain/dependency/Dockerfile.intel_hpu . 
 ```

 Build vLLM Microservice image
 ```
-docker build --no-cache -t opea/llm-vllm:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/vllm/langchain/Dockerfile .
+docker build --no-cache -t opea/llm-vllm:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/vllm/langchain/Dockerfile .
 cd ..
 ```
 :::
@@ -152,7 +155,7 @@ cd ..
 :sync: TGI

 ```bash
-docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
+docker build --no-cache -t opea/llm-tgi:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
 ```
 :::
 ::::
@@ -164,7 +167,7 @@ Since a TEI Gaudi Docker image hasn't been published, we'll need to build it fro
 ```bash
 git clone https://github.com/huggingface/tei-gaudi
 cd tei-gaudi/
-docker build --no-cache -f Dockerfile-hpu -t opea/tei-gaudi:latest .
+docker build --no-cache -f Dockerfile-hpu -t opea/tei-gaudi:${TAG} .
 cd ..
 ```

@@ -183,12 +186,12 @@ Build the megaservice image for this use case
 ```
 cd ..
 cd GenAIExamples
-git checkout tags/v1.0
+git checkout tags/v${TAG}
 cd ChatQnA
 ```

 ```bash
-docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+docker build --no-cache -t opea/chatqna:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
 cd ../..
 ```

@@ -198,7 +201,7 @@ If you want to enable guardrails microservice in the pipeline, please use the be
 ```bash
 cd GenAIExamples/ChatQnA/
-docker build --no-cache -t opea/chatqna-guardrails:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.guardrails .
+docker build --no-cache -t opea/chatqna-guardrails:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.guardrails .
 cd ../..
 ```

@@ -210,7 +213,7 @@ As mentioned, you can build 2 modes of UI

 ```bash
 cd GenAIExamples/ChatQnA/ui/
-docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
+docker build --no-cache -t opea/chatqna-ui:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
 cd ../../..
 ```

@@ -219,43 +222,47 @@ If you want a conversational experience with chatqna megaservice.

 ```bash
 cd GenAIExamples/ChatQnA/ui/
-docker build --no-cache -t opea/chatqna-conversation-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
+docker build --no-cache -t opea/chatqna-conversation-ui:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
 cd ../../..
 ```

 ### Sanity Check

-Check if you have the below set of docker images, before moving on to the next step:
+Check that you have the below set of docker images before moving on to the next step.
+The image tags correspond to the value you set for the `TAG` environment variable.
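> Editor's note: the image check described above can be scripted. This is a minimal sketch, not part of the tutorial; the `check_images` helper name and the sample inventory are hypothetical, and in real use you would build the inventory from `docker images --format '{{.Repository}}:{{.Tag}}'`.

```bash
# Sketch: verify that an image inventory contains every expected
# "repository:tag" entry. Sample data below is illustrative only;
# in practice, build the inventory with:
#   docker images --format '{{.Repository}}:{{.Tag}}'
TAG="${TAG:-1.1}"

check_images() {
  # $1 is a newline-separated inventory; the remaining args are expected images.
  local inventory="$1"; shift
  local missing=0 img
  for img in "$@"; do
    if ! grep -qxF "$img" <<<"$inventory"; then
      echo "MISSING: $img"
      missing=1
    fi
  done
  return "$missing"
}

# Illustrative inventory with two expected images present.
sample_inventory="opea/chatqna:${TAG}
opea/chatqna-ui:${TAG}"

check_images "$sample_inventory" "opea/chatqna:${TAG}" "opea/chatqna-ui:${TAG}" \
  && echo "all expected images present"
```

Any image the helper reports as `MISSING` still needs to be built or pulled before continuing.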
 ::::{tab-set}

 :::{tab-item} vllm
 :sync: vllm

-* opea/dataprep-redis:latest
-* opea/embedding-tei:latest
-* opea/retriever-redis:latest
-* opea/reranking-tei:latest
-* opea/tei-gaudi:latest
-* opea/chatqna:latest or opea/chatqna-guardrails:latest
-* opea/chatqna:latest
-* opea/chatqna-ui:latest
-* opea/vllm:latest
-* opea/llm-vllm:latest
+* opea/dataprep-redis:${TAG}
+* opea/embedding-tei:${TAG}
+* opea/retriever-redis:${TAG}
+* opea/reranking-tei:${TAG}
+* opea/tei-gaudi:${TAG}
+* opea/chatqna:${TAG} or opea/chatqna-guardrails:${TAG}
+* opea/chatqna-ui:${TAG}
+* opea/llm-vllm-hpu:${TAG}
+* opea/llm-vllm:${TAG}
 :::
 :::{tab-item} TGI
 :sync: TGI

-* opea/dataprep-redis:latest
-* opea/embedding-tei:latest
-* opea/retriever-redis:latest
-* opea/reranking-tei:latest
-* opea/tei-gaudi:latest
-* opea/chatqna:latest or opea/chatqna-guardrails:latest
-* opea/chatqna-ui:latest
-* opea/llm-tgi:latest
+* opea/dataprep-redis:${TAG}
+* opea/embedding-tei:${TAG}
+* opea/retriever-redis:${TAG}
+* opea/reranking-tei:${TAG}
+* opea/tei-gaudi:${TAG}
+* opea/chatqna:${TAG} or opea/chatqna-guardrails:${TAG}
+* opea/chatqna-ui:${TAG}
+* opea/llm-tgi:${TAG}
 :::
 ::::
+:::::
+::::::
+
 ## Use Case Setup

 As mentioned the use case will use the following combination of the GenAIComps

@@ -392,7 +399,8 @@ Check if all the containers launched via docker compose has started

 For example, the ChatQnA example starts 11 docker (services), check these docker
 containers are all running, i.e, all the containers `STATUS` are `Up`

-To do a quick sanity check, try `docker ps -a` to see if all the containers are running
+To do a quick sanity check, try `docker ps -a` to see if all the containers are running.
+Note that `TAG` will be the value you set earlier.
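> Editor's note: the `docker ps` check can be scripted as well. A minimal sketch, with an illustrative two-line sample in place of real output; in practice, pipe in `docker ps -a --format '{{.Names}}|{{.Status}}'`.

```bash
# Sketch: list any container whose STATUS does not begin with "Up".
# The sample data is illustrative; in practice use:
#   docker ps -a --format '{{.Names}}|{{.Status}}'
sample='chatqna-gaudi-ui-server|Up About a minute
vllm-gaudi-server|Exited (1) 2 minutes ago'

not_up=$(awk -F'|' '$2 !~ /^Up/ {print $1}' <<<"$sample")

if [ -n "$not_up" ]; then
  echo "Not running: $not_up"
else
  echo "all containers are Up"
fi
```

A container that shows `Exited` here is the first place to look with `docker logs <name>` before moving on.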
::::{tab-set} @@ -400,15 +408,15 @@ To do a quick sanity check, try `docker ps -a` to see if all the containers are :sync: vllm ```bash CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -42c8d5ec67e9 opea/chatqna-ui:latest "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp chatqna-gaudi-ui-server -7f7037a75f8b opea/chatqna:latest "python chatqna.py" About a minute ago Up About a minute 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp chatqna-gaudi-backend-server -4049c181da93 opea/embedding-tei:latest "python embedding_te…" About a minute ago Up About a minute 0.0.0.0:6000->6000/tcp, :::6000->6000/tcp embedding-tei-server -171816f0a789 opea/dataprep-redis:latest "python prepare_doc_…" About a minute ago Up About a minute 0.0.0.0:6007->6007/tcp, :::6007->6007/tcp dataprep-redis-server -10ee6dec7d37 opea/llm-vllm:latest "bash entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-vllm-gaudi-server -ce4e7802a371 opea/retriever-redis:latest "python retriever_re…" About a minute ago Up About a minute 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server -be6cd2d0ea38 opea/reranking-tei:latest "python reranking_te…" About a minute ago Up About a minute 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp reranking-tei-gaudi-server -cc45ff032e8c opea/tei-gaudi:latest "text-embeddings-rou…" About a minute ago Up About a minute 0.0.0.0:8090->80/tcp, :::8090->80/tcp tei-embedding-gaudi-server -4969ec3aea02 opea/llm-vllm-hpu:latest "/bin/bash -c 'expor…" About a minute ago Up About a minute 0.0.0.0:8007->80/tcp, :::8007->80/tcp vllm-gaudi-server +42c8d5ec67e9 opea/chatqna-ui:${TAG} "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp chatqna-gaudi-ui-server +7f7037a75f8b opea/chatqna:${TAG} "python chatqna.py" About a minute ago Up About a minute 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp chatqna-gaudi-backend-server +4049c181da93 
opea/embedding-tei:${TAG} "python embedding_te…" About a minute ago Up About a minute 0.0.0.0:6000->6000/tcp, :::6000->6000/tcp embedding-tei-server +171816f0a789 opea/dataprep-redis:${TAG} "python prepare_doc_…" About a minute ago Up About a minute 0.0.0.0:6007->6007/tcp, :::6007->6007/tcp dataprep-redis-server +10ee6dec7d37 opea/llm-vllm:${TAG} "bash entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-vllm-gaudi-server +ce4e7802a371 opea/retriever-redis:${TAG} "python retriever_re…" About a minute ago Up About a minute 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server +be6cd2d0ea38 opea/reranking-tei:${TAG} "python reranking_te…" About a minute ago Up About a minute 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp reranking-tei-gaudi-server +cc45ff032e8c opea/tei-gaudi:${TAG} "text-embeddings-rou…" About a minute ago Up About a minute 0.0.0.0:8090->80/tcp, :::8090->80/tcp tei-embedding-gaudi-server +4969ec3aea02 opea/llm-vllm-hpu:${TAG} "/bin/bash -c 'expor…" About a minute ago Up About a minute 0.0.0.0:8007->80/tcp, :::8007->80/tcp vllm-gaudi-server 0657cb66df78 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db 684d3e9d204a ghcr.io/huggingface/text-embeddings-inference:cpu-1.2 "text-embeddings-rou…" About a minute ago Up About a minute 0.0.0.0:8808->80/tcp, :::8808->80/tcp tei-reranking-gaudi-server ``` @@ -418,15 +426,15 @@ cc45ff032e8c opea/tei-gaudi:latest "text-emb :sync: TGI ```bash CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -0355d705484a opea/chatqna-ui:latest "docker-entrypoint.s…" 2 minutes ago Up 2 minutes 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp chatqna-gaudi-ui-server -29a7a43abcef opea/chatqna:latest "python chatqna.py" 2 minutes ago Up 2 minutes 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp chatqna-gaudi-backend-server -1eb6f5ad6f85 opea/llm-tgi:latest "bash 
entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-tgi-gaudi-server -ad27729caf68 opea/reranking-tei:latest "python reranking_te…" 2 minutes ago Up 2 minutes 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp reranking-tei-gaudi-server -84f02cf2a904 opea/dataprep-redis:latest "python prepare_doc_…" 2 minutes ago Up 2 minutes 0.0.0.0:6007->6007/tcp, :::6007->6007/tcp dataprep-redis-server -367459f6e65b opea/embedding-tei:latest "python embedding_te…" 2 minutes ago Up 2 minutes 0.0.0.0:6000->6000/tcp, :::6000->6000/tcp embedding-tei-server -8c78cde9f588 opea/retriever-redis:latest "python retriever_re…" 2 minutes ago Up 2 minutes 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server +0355d705484a opea/chatqna-ui:${TAG} "docker-entrypoint.s…" 2 minutes ago Up 2 minutes 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp chatqna-gaudi-ui-server +29a7a43abcef opea/chatqna:${TAG} "python chatqna.py" 2 minutes ago Up 2 minutes 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp chatqna-gaudi-backend-server +1eb6f5ad6f85 opea/llm-tgi:${TAG} "bash entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-tgi-gaudi-server +ad27729caf68 opea/reranking-tei:${TAG} "python reranking_te…" 2 minutes ago Up 2 minutes 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp reranking-tei-gaudi-server +84f02cf2a904 opea/dataprep-redis:${TAG} "python prepare_doc_…" 2 minutes ago Up 2 minutes 0.0.0.0:6007->6007/tcp, :::6007->6007/tcp dataprep-redis-server +367459f6e65b opea/embedding-tei:${TAG} "python embedding_te…" 2 minutes ago Up 2 minutes 0.0.0.0:6000->6000/tcp, :::6000->6000/tcp embedding-tei-server +8c78cde9f588 opea/retriever-redis:${TAG} "python retriever_re…" 2 minutes ago Up 2 minutes 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server fa80772de92c ghcr.io/huggingface/tgi-gaudi:2.0.1 "text-generation-lau…" 2 minutes ago Up 2 minutes 0.0.0.0:8005->80/tcp, :::8005->80/tcp tgi-gaudi-server -581687a2cc1a opea/tei-gaudi:latest 
"text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8090->80/tcp, :::8090->80/tcp tei-embedding-gaudi-server +581687a2cc1a opea/tei-gaudi:${TAG} "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8090->80/tcp, :::8090->80/tcp tei-embedding-gaudi-server c59178629901 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 2 minutes ago Up 2 minutes 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db 5c3a78144498 ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "text-embeddings-rou…" 2 minutes ago Up 2 minutes 0.0.0.0:8808->80/tcp, :::8808->80/tcp tei-reranking-gaudi-server ``` @@ -796,7 +804,7 @@ curl http://${host_ip}:9090/v1/guardrails\ To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the compose.yaml file as shown below: ```bash chaqna-gaudi-ui-server: - image: opea/chatqna-ui:latest + image: opea/chatqna-ui:${TAG} ... ports: - "80:5173" @@ -806,7 +814,7 @@ To access the frontend, open the following URL in your browser: http://{host_ip} To access the Conversational UI (react based) frontend, modify the UI service in the compose.yaml file. Replace chaqna-gaudi-ui-server service with the chatqna-gaudi-conversation-ui-server service as per the config below: ```bash chaqna-gaudi-conversation-ui-server: - image: opea/chatqna-conversation-ui:latest + image: opea/chatqna-conversation-ui:${TAG} container_name: chatqna-gaudi-conversation-ui-server environment: - APP_BACKEND_SERVICE_ENDPOINT=${BACKEND_SERVICE_ENDPOINT} @@ -821,7 +829,7 @@ chaqna-gaudi-conversation-ui-server: Once the services are up, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. 
If you prefer to use a different host port to access the frontend, you can modify the port mapping in the compose.yaml file as shown below:
```
chaqna-gaudi-conversation-ui-server:
- image: opea/chatqna-conversation-ui:latest
+ image: opea/chatqna-conversation-ui:${TAG}
 ...
 ports:
 - "80:80"

@@ -954,4 +962,4 @@ docker compose -f compose_vllm.yaml down
 docker compose -f compose.yaml down
 ```
 :::
-::::
+::::
\ No newline at end of file
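
> Editor's note: throughout this diff, compose files reference images as `opea/<name>:${TAG}`, and docker compose substitutes `${TAG}` from the shell environment at startup. A minimal sketch of that substitution; the compose service fragment in the comments is hypothetical.

```bash
# Sketch: mimic docker compose's ${TAG} interpolation from the environment.
# Hypothetical compose fragment:
#   services:
#     chatqna-ui:
#       image: opea/chatqna-ui:${TAG}
export TAG=1.1
image_template='opea/chatqna-ui:${TAG}'
# Expand the template the way compose would, using the current environment.
resolved=$(eval "echo \"${image_template}\"")
echo "$resolved"   # opea/chatqna-ui:1.1
```

If `TAG` is unset when `docker compose up` runs, the image reference resolves with an empty tag, so export it in the same shell session used to launch the stack.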