[TDX] Added basic documentation to enable TDX in ChatQnA #1212

Draft
wants to merge 6 commits into
base: main
4 changes: 4 additions & 0 deletions ChatQnA/README.md
@@ -247,6 +247,10 @@ docker compose up -d

Refer to the [NVIDIA GPU Guide](./docker_compose/nvidia/gpu/README.md) for more instructions on building docker images from source.

### Deploy ChatQnA into Kubernetes on Xeon with Intel TDX protection

Refer to the [Kubernetes Guide](./kubernetes/intel/README_tdx.md) for instructions on deploying ChatQnA into Kubernetes on Xeon with services protected using Intel TDX.

### Deploy ChatQnA into Kubernetes on Xeon & Gaudi with GMC

Refer to the [Kubernetes Guide](./kubernetes/intel/README_gmc.md) for instructions on deploying ChatQnA into Kubernetes on Xeon & Gaudi with GMC.
136 changes: 136 additions & 0 deletions ChatQnA/kubernetes/intel/README_tdx.md
@@ -0,0 +1,136 @@
# Deploy example application in Kubernetes Cluster on Xeon with Intel TDX

This document outlines the deployment process for an example application that uses the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline components on an Intel Xeon server, with the microservices protected by [Intel TDX](https://www.intel.com/content/www/us/en/developer/tools/trust-domain-extensions/overview.html).

The deployment process is intended for users who want to deploy an example application:

- with pods protected by Intel TDX,
- on a single node in a cluster (acting as both master and worker) that is a Xeon 4th Gen platform or later,
- running Ubuntu 24.04,
- using images pushed to a public registry, such as quay.io or Docker Hub.


## Getting Started

Follow the steps below on the Xeon server node to deploy the example application:

1. [Install Ubuntu 24.04 and enable Intel TDX](https://github.com/canonical/tdx/blob/noble-24.04/README.md#setup-host-os)
2. [Install Kubernetes cluster](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/)
3. [Install Confidential Containers Operator](https://cc-enabling.trustedservices.intel.com/intel-confidential-containers-guide/02/infrastructure_setup/#install-confidential-containers-operator)
4. Increase the kubelet timeout:

```bash
sudo sed -i 's/runtimeRequestTimeout: .*/runtimeRequestTimeout: 30m/' "/var/lib/kubelet/config.yaml"
sudo systemctl daemon-reload && sudo systemctl restart kubelet
```

5. Change directory:

```bash
cd GenAIExamples/ChatQnA/kubernetes/intel/cpu/xeon/manifest
```

6. Deploy ChatQnA:

```bash
kubectl apply -f chatqna_tdx.yaml
```

7. Verify all pods are running:

```bash
kubectl get pods
```
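
Because TDX-protected pods can take several minutes to start, the last verification step may need to be repeated. The sketch below is one possible way to narrow the output down to pods that are not ready yet. The helper name `pods_not_running` is hypothetical and not part of this PR; it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE).

```bash
# Hypothetical helper: print the names of pods whose STATUS column is
# neither "Running" nor "Completed".
# Input: `kubectl get pods --no-headers` output on stdin
# (assumed columns: NAME READY STATUS RESTARTS AGE).
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage on a live cluster (shown as comments, not executed here):
#   kubectl get pods --no-headers | pods_not_running
# To see which runtime class each pod uses:
#   kubectl get pods -o custom-columns=NAME:.metadata.name,RUNTIME:.spec.runtimeClassName
```

An empty result suggests every pod has reached `Running` or `Completed`; pods expected to be TDX-protected should list `kata-qemu-tdx` in the `RUNTIME` column.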


## Advanced configuration

To protect a single component with Intel TDX, you must modify its manifest file.
The details are described in the [Demo Workload Deployment](https://cc-enabling.trustedservices.intel.com/intel-confidential-containers-guide/03/demo_workload_deployment/#pod-isolated-by-kata-containers-and-protected-by-intel-tdx) guide.

The required changes are marked in the example Deployment definition below:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-uservice
  # (...)
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: llm-uservice
      app.kubernetes.io/instance: llm-uservice
  # (...)
  template:
    metadata:
      # (...)
      annotations:
        io.katacontainers.config.runtime.create_container_timeout: "600" # <<--- increase the timeout for container creation
    spec:
      runtimeClassName: kata-qemu-tdx # <<--- this is required to start the pod in Trust Domain (TD, virtual machine protected with Intel TDX)
      containers:
        - name: llm-uservice
          # (...)
          resources: # <<--- specify resources enough to run the service efficiently (memory must be at least 2x the image size)
            limits:
              cpu: "4"
              memory: 4Gi
            requests:
              cpu: "4"
              memory: 4Gi
```
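
The "memory must be at least 2x the image size" rule from the comment above can be turned into a quick calculation. This is a minimal sketch, not part of the PR: the function name `min_memory_mib` is hypothetical, and it assumes that "image size" means the size in bytes reported for the local image (for example by `docker image inspect`), which the source does not specify.

```bash
# Hypothetical helper: given an image size in bytes, print the minimum
# memory value in MiB according to the 2x rule, rounded up.
min_memory_mib() {
  local image_bytes=$1
  # 1 MiB = 1048576 bytes; adding 1048575 before dividing rounds up.
  echo $(( (image_bytes * 2 + 1048575) / 1048576 ))
}

# Usage (shown as comments, not executed here; IMAGE is a placeholder):
#   size=$(docker image inspect --format '{{.Size}}' IMAGE)
#   echo "memory: $(min_memory_mib "$size")Mi"
```

The result is a lower bound only; the service itself may need more memory to run efficiently.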


### Customization of deployment configuration

If you want more control over what is protected with Intel TDX, or want to use a different deployment file, you can modify the deployment configuration manually by following the steps below:

1. Change directory:

```bash
cd GenAIExamples/ChatQnA/kubernetes/intel/cpu/xeon/manifest
```

2. Define the services you want to protect with Intel TDX:

```bash
SERVICES=("llm-uservice")
```

3. Define the pipeline you want to deploy:

```bash
FILE=chatqna.yaml
```

4. Run the script to add `runtimeClassName` and the required annotation only to the chosen `SERVICES` in the `FILE` you defined above:

```bash
for SERVICE in "${SERVICES[@]}"; do
  # Raise the Kata container creation timeout for the selected Deployment.
  yq eval '
    (select(.kind == "Deployment" and .metadata.name == "'"$SERVICE"'") | .spec.template.metadata.annotations."io.katacontainers.config.runtime.create_container_timeout") = "800"
  ' "$FILE" -i
  # Run the selected Deployment's pods with the TDX-enabled Kata runtime class.
  yq eval '
    (select(.kind == "Deployment" and .metadata.name == "'"$SERVICE"'") | .spec.template.spec.runtimeClassName) = "kata-qemu-tdx"
  ' "$FILE" -i
done
```

5. For each service in `SERVICES`, edit the deployment `FILE` to define the resources the pod needs to run the service efficiently.
The `memory` value must be at least 2x the image size.
By default, a pod is assigned `1 CPU` and `2048 MiB` of memory, and half of that memory is used for the filesystem.

6. Apply the changes to the deployment configuration:

```bash
kubectl apply -f "$FILE"
```

> [!IMPORTANT]
> The total amount of resources assigned to all TDX-protected pods must be less than the total amount of resources available on the node, leaving room for the requests of the non-TDX pods.
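
The constraint in the note above amounts to simple arithmetic, which can be checked before applying the manifest. The following is a rough sketch under stated assumptions: the function name `fits_on_node` and the idea of an explicit reserve for non-TDX pods are illustrative and do not come from this PR, and all values are assumed to be memory amounts in MiB.

```bash
# Hypothetical helper: succeed (exit status 0) if the memory requested by
# all TDX-protected pods, plus a reserve for non-TDX pods, is strictly
# less than the memory available on the node. All values are in MiB.
# Usage: fits_on_node NODE_MIB RESERVE_MIB POD_MIB [POD_MIB...]
fits_on_node() {
  local node_mib=$1
  local reserve_mib=$2
  shift 2
  local total=0
  local m
  for m in "$@"; do
    total=$(( total + m ))
  done
  [ $(( total + reserve_mib )) -lt "$node_mib" ]
}

# Example: a 64 GiB node, 8 GiB reserved, two TDX pods of 4 GiB each.
if fits_on_node 65536 8192 4096 4096; then
  echo "requests fit on the node"
else
  echo "requests exceed the node capacity"
fi
```

The node's allocatable memory can be read with `kubectl get nodes -o jsonpath='{.items[0].status.allocatable.memory}'`; the value is typically reported in `Ki`, so it needs converting before use here. The same check applies to CPU requests.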


## Troubleshooting

If you run into problems with pod creation, refer to the [Troubleshooting guide](https://cc-enabling.trustedservices.intel.com/intel-confidential-containers-guide/04/troubleshooting/).