add k8s manifest and add to getting started
Signed-off-by: devpramod <[email protected]>
devpramod committed Nov 19, 2024
1 parent 12e2e4f commit de87a0a
Showing 4 changed files with 504 additions and 18 deletions.
4 changes: 1 addition & 3 deletions examples/ChatQnA/deploy/index.rst
@@ -24,9 +24,7 @@ Kubernetes

K8s Getting Started <k8s_getting_started>
TGI on Xeon with Helm Charts <k8s_helm>

* Xeon & Gaudi with GMC
* Xeon & Gaudi without GMC
TGI on Xeon with Kubernetes Manifest <k8s_manifest>

Cloud Native
************
22 changes: 18 additions & 4 deletions examples/ChatQnA/deploy/k8s_getting_started.md
@@ -58,11 +58,13 @@ kubectl config set-context --current --namespace=chatqa

**What is Helm?** Helm is a package manager for Kubernetes, similar to how apt is for Ubuntu. It simplifies deploying and managing Kubernetes applications through Helm charts, which are packages of pre-configured Kubernetes resources.

**Key Components of a Helm Chart:**
#### Key Components of a Helm Chart

- **Chart.yaml**: This file contains metadata about the chart such as name, version, and description.
- **values.yaml**: Stores configuration values that can be customized depending on the deployment environment. These values override defaults set in the chart templates.
- **deployment.yaml**: Part of the templates directory, this file describes how the Kubernetes resources should be deployed, such as Pods and Services.
| Component | Description |
| --- | --- |
| `Chart.yaml` | This file contains metadata about the chart such as name, version, and description. |
| `values.yaml` | Stores configuration values that can be customized depending on the deployment environment. These values override defaults set in the chart templates. |
| `deployment.yaml` | Part of the templates directory, this file describes how the Kubernetes resources should be deployed, such as Pods and Services. |
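
As a rough illustration of the first component in the table above, a minimal `Chart.yaml` might look like the sketch below. The chart name, description, and version numbers are placeholders chosen for this example; they are not taken from the ChatQnA chart.

```yaml
# Chart.yaml (hypothetical example; name and versions are placeholders)
apiVersion: v2              # chart API version used by Helm 3
name: example-chart         # placeholder chart name
description: A minimal example chart
type: application
version: 0.1.0              # version of the chart itself
appVersion: "1.0.0"         # version of the application the chart deploys
```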

**Update Dependencies:**

@@ -74,3 +76,15 @@ kubectl config set-context --current --namespace=chatqa
- `helm install [RELEASE_NAME] [CHART_NAME]`: This command deploys a Helm chart into your Kubernetes cluster, creating a new release. It is used to set up all the Kubernetes resources specified in the chart and track the version of the deployment.

For more detailed instructions and explanations, you can refer to the [official Helm documentation](https://helm.sh/docs/).

### Using Kubernetes Manifest to Deploy
Manifest files in YAML format define the Kubernetes resources you want to manage. The main components in a manifest file include:

- **ConfigMap**: Stores non-confidential configuration data that can be used by pods, allowing you to keep containerized applications portable without embedding configuration data directly within the application's images. For example, a ConfigMap might store the database URL that your application needs to connect to a database; sensitive values such as credentials belong in a Secret instead.

- **Service**: Defines a logical set of Pods and a policy by which to access them. This resource abstracts the way you expose an application running on a set of Pods as a network service.

- **Deployment**: Manages the state of replicated application instances. It automatically replaces instances that fail or are deleted, maintaining the desired state of the application.
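
The sketch below shows how these three resource kinds could fit together in a single manifest file. It is a minimal illustration only: every name, label, port, image, and value is a placeholder invented for this example and does not come from the ChatQnA manifest.

```yaml
# Hypothetical example; all names, labels, ports, and the image are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config
data:
  LOG_LEVEL: "info"           # non-confidential setting exposed to the pods
---
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  selector:
    app: example              # routes traffic to pods carrying this label
  ports:
    - port: 80                # port exposed by the Service
      targetPort: 8080        # port the container listens on
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 2                 # desired number of pod instances
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example          # must match the selectors above
    spec:
      containers:
        - name: example
          image: example.com/example-app:1.0
          envFrom:
            - configMapRef:
                name: example-config   # injects ConfigMap keys as env vars
          ports:
            - containerPort: 8080
```

A file like this is typically applied with `kubectl apply -f <file>`; the `---` separators let one file carry several resources.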


For more detailed examples, you can view the [ChatQnA manifest file](https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/intel/cpu/xeon/manifest/chatqna.yaml) which includes definitions for services, deployments, and other resources essential for running the ChatQnA application. This file is a reference for understanding how Kubernetes resources for ChatQnA are defined and orchestrated.
16 changes: 5 additions & 11 deletions examples/ChatQnA/deploy/k8s_helm.md
@@ -1,13 +1,6 @@
# Multi-node on-prem deployment with TGI on Xeon Scalable processors on a K8s cluster using Helm Charts
# Multi-node on-prem deployment with TGI on Xeon Scalable processors on a K8s cluster using Helm

This deployment section covers multi-node on-prem deployment of the ChatQnA
example with OPEA comps to deploy using the TGI service. There are several
slice-n-dice ways to enable RAG with vectordb and LLM models, but here we will
be covering one option of doing it for convenience : we will be showcasing how
to build an e2e chatQnA with Redis VectorDB and neural-chat-7b-v3-3 model,
deployed on a Kubernetes cluster using Helm. For more information on how to setup a Xeon based Kubernetes cluster along with the development pre-requisites,
please follow the instructions here (*** ### Kubernetes Cluster and Development Environment***).
For a quick introduction on Helm Charts, visit the helm section in [Getting Started with Kubernetes for ChatQnA](./k8s_getting_started.md)
This section covers multi-node on-prem deployment of the ChatQnA example with OPEA comps, using the TGI service. There are several ways to enable RAG with a vector database and LLM models; for convenience, this guide covers one option: building an end-to-end ChatQnA pipeline with Redis VectorDB and the neural-chat-7b-v3-3 model, deployed on a Kubernetes cluster using Helm. For more information on how to set up a Xeon-based Kubernetes cluster along with the development prerequisites, follow the instructions in [Kubernetes Cluster and Development Environment](./k8s_getting_started.md#kubernetes-cluster-and-development-environment). For a quick introduction to Helm Charts, see the Helm section in [Getting Started with Kubernetes for ChatQnA](./k8s_getting_started.md).

## Overview

@@ -61,6 +54,7 @@ vi chatqna/values.yaml
```
Update the following section and save file:
```yaml
# chatqna/values.yaml
global:
  http_proxy: "http://your-proxy-address:port"
  https_proxy: "http://your-proxy-address:port"
@@ -166,11 +160,11 @@ chatqna-tgi-675c4d79f6-cf4pq 1/1 Running 0
When issues are encountered with a pod in the Kubernetes deployment, there are two primary commands to diagnose and potentially resolve problems:
1. **Checking Logs**: To view the logs of a specific pod, which can provide insight into what the application is doing and any errors it might be encountering, use:
```bash
kubectl logs [pod-name]
kubectl logs <pod-name>
```
2. **Describing Pods**: For a detailed view of the pod's current state, its configuration, and its operational events, run:
```bash
kubectl describe pod [pod-name]
kubectl describe pod <pod-name>
```
For example, if the status of the TGI service does not show 'Running', describe the pod using the name from the above table:
```bash