diff --git a/CodeGen/openshift-rhoai/manifests/README.md b/CodeGen/openshift-rhoai/manifests/README.md
new file mode 100644
index 000000000..bed6517f4
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/README.md
@@ -0,0 +1,251 @@
+
+# Deploy CodeGen on OpenShift Cluster with RHOAI
+
+## Prerequisites
+
+1. **Red Hat OpenShift Cluster** with a dynamic _StorageClass_ to provision _PersistentVolumes_ (e.g. **OpenShift Data Foundation**) and the following Operators installed: **Red Hat - Authorino (Technical Preview)**, **Red Hat OpenShift Service Mesh**, **Red Hat OpenShift Serverless** and **Red Hat OpenShift AI**.
+2. Exposed image registry to push Docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
+3. Access to an S3-compatible object storage bucket (e.g. **OpenShift Data Foundation**, **AWS S3**) together with its access key, secret access key and S3 endpoint (https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/managing_hybrid_and_multicloud_resources/accessing-the-multicloud-object-gateway-with-your-applications_rhodf#accessing-the-multicloud-object-gateway-with-your-applications_rhodf).
+4. Account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your local copy of the repository using the following commands.
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
+export HFTOKEN="YourOwnToken"
+sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-magicoder.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
+export HFTOKEN="YourOwnToken"
+sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml servingruntime-codellama.yaml
+```
+
+## Deploy the model in Red Hat OpenShift AI
+
+1. Log in to the OpenShift CLI and run the following commands to create a new serving runtime.
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
+oc apply -f servingruntime-magicoder.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
+oc apply -f servingruntime-codellama.yaml
+```
+
+Verify that the template has been created with the `oc get template -n redhat-ods-applications` command.
+
+2. Find the route for the **Red Hat OpenShift AI** dashboard with the command below and open it in the browser:
+
+```
+oc get routes -A | grep rhods-dashboard
+```
+
+3. Go to **Data Science Projects** and click **Create data science project**. Fill in the **Name** and click **Create**.
+4. Go to the **Workbenches** tab and click **Create workbench**. Fill in the **Name**, under **Notebook image** choose _Standard Data Science_, under **Cluster storage** choose _Create new persistent storage_ and change the **Persistent storage size** to 40 GB. Click **Create workbench**.
+5. Open the newly created Jupyter notebook and run the following commands to download the model and upload it to S3:
+
+```
+%env S3_ENDPOINT=
+%env S3_ACCESS_KEY=
+%env S3_SECRET_KEY=
+%env HF_TOKEN=
+```
+
+```
+!pip install huggingface-hub
+```
+
+```
+import os
+import boto3
+import botocore
+import glob
+from huggingface_hub import snapshot_download
+bucket_name = 'first.bucket'
+s3_endpoint = os.environ.get('S3_ENDPOINT')
+s3_accesskey = os.environ.get('S3_ACCESS_KEY')
+s3_secretkey = os.environ.get('S3_SECRET_KEY')
+path = 'models'
+hf_token = os.environ.get('HF_TOKEN')
+session = boto3.session.Session()
+s3_resource = session.resource('s3',
+ endpoint_url=s3_endpoint,
+ verify=False,
+ aws_access_key_id=s3_accesskey,
+ aws_secret_access_key=s3_secretkey)
+bucket = s3_resource.Bucket(bucket_name)
+```
+
+For Xeon, download _ise-uiuc/Magicoder-S-DS-6.7B_:
+
+```
+snapshot_download("ise-uiuc/Magicoder-S-DS-6.7B", cache_dir=path, token=hf_token)
+```
+
+For Gaudi, download _meta-llama/CodeLlama-7b-hf_:
+
+```
+snapshot_download("meta-llama/CodeLlama-7b-hf", cache_dir=path, token=hf_token)
+```
+
+Upload the downloaded model to S3:
+
+```
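+# note: str.replace removes every occurrence of 'models', so the Hugging Face
+# cache layout 'models/models--<org>--<name>/snapshots/...' is uploaded under
+# 'models/--<org>--<name>/snapshots/...', the same path the serving runtime
+# manifests reference with --model-id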
+files = (file for file in glob.glob(f'{path}/**/*', recursive=True) if os.path.isfile(file) and "snapshots" in file)
+for filename in files:
+ s3_name = filename.replace(path, '')
+ print(f'Uploading: {filename} to {path}{s3_name}')
+ bucket.upload_file(filename, f'{path}{s3_name}')
+```
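+
+Optionally, verify the upload by listing the bucket contents (a quick check reusing the `bucket` object from above):
+
+```
+for obj in bucket.objects.filter(Prefix=path):
+    print(obj.key)
+```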
+
+6. Go to your project in the **Red Hat OpenShift AI** dashboard, open the **Models** tab and click **Deploy model** under _Single-model serving platform_. Fill in the **Name**, choose the newly created **Serving runtime**: _Text Generation Inference Magicoder-S-DS-6.7B on CPU_ (for Xeon) or _Text Generation Inference CodeLlama-7b-hf on Gaudi_ (for Gaudi), set **Model framework** to _llm_ and change **Model server size** to _Custom_ with 16 CPUs and 64 Gi memory. For deployment on Gaudi select the proper **Accelerator**. In the **Model route** section check the box to create an external route and uncheck token authentication. Under **Model location** choose _New data connection_ and fill in all required fields for S3 access, with **Bucket** _first.bucket_ and **Path** _models_. Click **Deploy**. It takes about 10 minutes for the model to reach the _Loaded_ status.\
+   If it does not reach the _Loaded_ status and the revision status changes to "ProgressDeadlineExceeded" (`oc get revision`), scale the model deployment down to 0 and back up to 1 replicas and wait about 10 minutes for the deployment; see the sketch below.
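+
+A minimal sketch of that restart (the deployment name is a placeholder; check the actual name with `oc get deployment`):
+
+```
+oc scale deployment.apps/<model-deployment-name> --replicas=0
+oc scale deployment.apps/<model-deployment-name> --replicas=1
+```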
+
+## Deploy CodeGen
+
+1. Log in to the OpenShift CLI, go to your project and find the URL for the TGI_LLM_ENDPOINT:
+
+```
+oc get service.serving.knative.dev
+```
+
+Update TGI_LLM_ENDPOINT in your local copy of the repository.
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
+export TGI_LLM_ENDPOINT="YourURL"
+sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
+export TGI_LLM_ENDPOINT="YourURL"
+sed -i "s#insert-your-tgi-url-here#${TGI_LLM_ENDPOINT}#g" codegen.yaml
+```
+
+2. Build the Docker images locally:
+
+- LLM Docker Image:
+
+```
+git clone https://github.com/opea-project/GenAIComps.git
+cd GenAIComps
+docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
+```
+
+- MegaService Docker Image:
+
+```
+git clone https://github.com/opea-project/GenAIExamples
+cd GenAIExamples/CodeGen
+docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+```
+
+- UI Docker Image:
+
+```
+cd GenAIExamples/CodeGen/ui
+docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
+```
+
+To verify, run the command: `docker images`.
+
+3. Log in to Docker, tag the images and push them to the image registry with the following commands:
+
+```
+docker login -u <user> -p $(oc whoami -t) <registry-url>
+docker tag <image>:<tag> <registry-url>/<project>/<image>:<tag>
+docker push <registry-url>/<project>/<image>:<tag>
+```
+
+To verify, run the command: `oc get istag`.
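+
+For example (a hypothetical sketch; the registry route host and the `codegen` project name are assumptions, substitute your own):
+
+```
+REGISTRY=default-route-openshift-image-registry.apps.example.com
+docker login -u $(oc whoami) -p $(oc whoami -t) $REGISTRY
+docker tag opea/codegen:latest $REGISTRY/codegen/codegen:latest
+docker push $REGISTRY/codegen/codegen:latest
+```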
+
+4. Use the _IMAGE REFERENCE_ from the previous step to update the image names in the manifest files.
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
+export IMAGE_LLM_TGI="YourImage"
+export IMAGE_CODEGEN="YourImage"
+export IMAGE_CODEGEN_UI="YourImage"
+sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
+export IMAGE_LLM_TGI="YourImage"
+export IMAGE_CODEGEN="YourImage"
+export IMAGE_CODEGEN_UI="YourImage"
+sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
+```
+
+5. Create the _rhoai-ca-bundle_ secret:
+
+```
+oc create secret generic rhoai-ca-bundle --from-literal=tls.crt="$(oc extract secret/knative-serving-cert -n istio-system --to=- --keys=tls.crt)"
+```
+
+6. Deploy CodeGen with the command:
+
+```
+oc apply -f codegen.yaml
+```
+
+7. Check the _codegen_ route with the command `oc get routes` and update the route in the _ui-server.yaml_ file:
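+
+The route host can also be captured directly (a convenience one-liner, assuming the route created above is named _codegen_):
+
+```
+export CODEGEN_ROUTE=$(oc get route codegen -o jsonpath='{.spec.host}')
+```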
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/xeon
+export CODEGEN_ROUTE="YourCodegenRoute"
+sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift-rhoai/manifests/gaudi
+export CODEGEN_ROUTE="YourCodegenRoute"
+sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
+```
+
+8. Deploy the UI with the command:
+
+```
+oc apply -f ui-server.yaml
+```
+
+## Verify Services
+
+Make sure all the pods are running, and restart the codegen-xxxx pod if necessary (a restart sketch follows the commands below).
+
+```
+oc get pods
+curl http://${CODEGEN_ROUTE}/v1/codegen -H "Content-Type: application/json" -d '{
+ "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
+ }'
+```
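+
+To restart the _codegen_ pod mentioned above, one option is to delete it and let the Deployment recreate it (a sketch using the labels defined in codegen.yaml):
+
+```
+oc delete pod -l app.kubernetes.io/name=codegen
+```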
+
+## Launch the UI
+
+To access the frontend, find the route for _ui-server_ with the command `oc get routes` and open it in the browser.
diff --git a/CodeGen/openshift-rhoai/manifests/gaudi/codegen.yaml b/CodeGen/openshift-rhoai/manifests/gaudi/codegen.yaml
new file mode 100644
index 000000000..5b8c5a413
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/gaudi/codegen.yaml
@@ -0,0 +1,167 @@
+---
+# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 9000
+ targetPort: 9000
+ protocol: TCP
+ name: llm-uservice
+ selector:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 7778
+ targetPort: 7778
+ protocol: TCP
+ name: codegen
+ selector:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: {}
+ containers:
+ - name: codegen
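+          # append the RHOAI serving CA (mounted from the rhoai-ca-bundle secret) to the CA bundle used via SSL_CERT_FILE, so the microservice can call the TLS-protected TGI endpoint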
+ command:
+ - /bin/bash
+ - -c
+ - |
+ cp /usr/lib/ssl/cert.pem /tmp/bundle.crt && \
+ cat /rhoai-ca/tls.crt | tee -a '/tmp/bundle.crt' && \
+ bash ./entrypoint.sh
+ env:
+ - name: TGI_LLM_ENDPOINT
+ value: "insert-your-tgi-url-here"
+ - name: HUGGINGFACEHUB_API_TOKEN
+ value: "insert-your-huggingface-token-here"
+ - name: PYTHONPATH
+ value: /home/user/.local/lib/python3.11/site-packages:/home/user
+ - name: HOME
+ value: /tmp/home
+ - name: SSL_CERT_FILE
+ value: /tmp/bundle.crt
+ securityContext: {}
+ image: "insert-your-image-llm-tgi"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: llm-uservice
+ containerPort: 9000
+ protocol: TCP
+ volumeMounts:
+ - mountPath: /tmp/home
+ name: local-dir
+ - mountPath: /rhoai-ca
+ name: odh-ca-bundle
+ resources: {}
+ volumes:
+ - emptyDir:
+ sizeLimit: 5Gi
+ name: local-dir
+ - name: odh-ca-bundle
+ secret:
+ defaultMode: 420
+ secretName: rhoai-ca-bundle
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: null
+ containers:
+ - name: codegen
+ env:
+ - name: LLM_SERVICE_HOST_IP
+ value: codegen-llm-uservice
+ securityContext: null
+ image: "insert-your-image-codegen"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: codegen
+ containerPort: 7778
+ protocol: TCP
+ resources: null
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/managed-by: Helm
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/version: 1.0.0
+ helm.sh/chart: codegen-0.1.0
+ name: codegen
+spec:
+ port:
+ targetPort: codegen
+ to:
+ kind: Service
+ name: codegen
+ weight: 100
+ wildcardPolicy: None
diff --git a/CodeGen/openshift-rhoai/manifests/gaudi/servingruntime-codellama.yaml b/CodeGen/openshift-rhoai/manifests/gaudi/servingruntime-codellama.yaml
new file mode 100644
index 000000000..95573101f
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/gaudi/servingruntime-codellama.yaml
@@ -0,0 +1,70 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+---
+apiVersion: template.openshift.io/v1
+kind: Template
+metadata:
+ annotations:
+ opendatahub.io/apiProtocol: REST
+ opendatahub.io/modelServingSupport: '["single"]'
+ labels:
+ opendatahub.io/dashboard: "true"
+ name: tgi-codellama-7b-hf-gaudi
+ namespace: redhat-ods-applications
+objects:
+- apiVersion: serving.kserve.io/v1alpha1
+ kind: ServingRuntime
+  metadata:
+    annotations:
+      openshift.io/display-name: Text Generation Inference CodeLlama-7b-hf on Gaudi
+      opendatahub.io/recommended-accelerators: '["habana.ai/gaudi"]'
+    labels:
+      opendatahub.io/dashboard: "true"
+    name: tgi-codellama-7b-hf-gaudi
+ spec:
+ containers:
+ - args:
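+      # the snapshot path below mirrors the S3 layout produced by the notebook upload step (served from HF_HUB_CACHE)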
+ - --model-id
+ - /mnt/models/--meta-llama--CodeLlama-7b-hf/snapshots/b462c3c99b077d341db691ec780a33156f3c1472
+ - --port=8080
+ - --json-output
+ - --max-input-length
+ - "1024"
+ - --max-total-tokens
+ - "2048"
+ env:
+ - name: NUMBA_CACHE_DIR
+ value: /tmp/hf_home
+ - name: HF_HOME
+ value: /tmp/hf_home
+ - name: HF_HUB_CACHE
+ value: /mnt/models
+ - name: HUGGING_FACE_HUB_TOKEN
+ value: "insert-your-huggingface-token-here"
+ image: ghcr.io/huggingface/tgi-gaudi:2.0.4
+ name: kserve-container
+ ports:
+ - containerPort: 8080
+ protocol: TCP
+ resources:
+ limits:
+ habana.ai/gaudi: 1
+ requests:
+ habana.ai/gaudi: 1
+ volumeMounts:
+ - mountPath: /data
+ name: model-volume
+ - mountPath: /var/log/habana_logs
+ name: logs-volume
+ multiModel: false
+ supportedModelFormats:
+ - autoSelect: true
+ name: llm
+ volumes:
+ - emptyDir:
+ sizeLimit: 300Gi
+ name: model-volume
+ - emptyDir:
+ sizeLimit: 500Mi
+ name: logs-volume
diff --git a/CodeGen/openshift-rhoai/manifests/gaudi/ui-server.yaml b/CodeGen/openshift-rhoai/manifests/gaudi/ui-server.yaml
new file mode 100644
index 000000000..29dc9a25c
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/gaudi/ui-server.yaml
@@ -0,0 +1,79 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ io.kompose.service: ui-server
+ template:
+ metadata:
+ labels:
+ io.kompose.service: ui-server
+ spec:
+ initContainers:
+ - name: copy-ui-to-workdir
+ image: "insert-your-image-codegen-ui"
+ command:
+ - /bin/bash
+ - -c
+ args:
+ - |
+ cp -v -r /home/user/* /tmp/temp-data/
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ containers:
+ - env:
+ - name: HOME
+ value: /tmp/temp-data
+ - name: BASIC_URL
+ value: http://insert-your-codegen-route/v1/codegen
+ image: "insert-your-image-codegen-ui"
+ name: ui-server
+ ports:
+ - containerPort: 5173
+ protocol: TCP
+ workingDir: /tmp/temp-data/svelte
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ restartPolicy: Always
+ volumes:
+ - name: temp-data
+ emptyDir: {}
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ port:
+ targetPort: 5173
+ to:
+ kind: Service
+ name: ui-server
+ weight: 100
+ wildcardPolicy: None
+---
+apiVersion: v1
+kind: Service
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ ports:
+ - name: "5173"
+ port: 5173
+ targetPort: 5173
+ selector:
+ io.kompose.service: ui-server
diff --git a/CodeGen/openshift-rhoai/manifests/xeon/codegen.yaml b/CodeGen/openshift-rhoai/manifests/xeon/codegen.yaml
new file mode 100644
index 000000000..5b8c5a413
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/xeon/codegen.yaml
@@ -0,0 +1,167 @@
+---
+# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 9000
+ targetPort: 9000
+ protocol: TCP
+ name: llm-uservice
+ selector:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 7778
+ targetPort: 7778
+ protocol: TCP
+ name: codegen
+ selector:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: {}
+ containers:
+ - name: codegen
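+          # append the RHOAI serving CA (mounted from the rhoai-ca-bundle secret) to the CA bundle used via SSL_CERT_FILE, so the microservice can call the TLS-protected TGI endpoint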
+ command:
+ - /bin/bash
+ - -c
+ - |
+ cp /usr/lib/ssl/cert.pem /tmp/bundle.crt && \
+ cat /rhoai-ca/tls.crt | tee -a '/tmp/bundle.crt' && \
+ bash ./entrypoint.sh
+ env:
+ - name: TGI_LLM_ENDPOINT
+ value: "insert-your-tgi-url-here"
+ - name: HUGGINGFACEHUB_API_TOKEN
+ value: "insert-your-huggingface-token-here"
+ - name: PYTHONPATH
+ value: /home/user/.local/lib/python3.11/site-packages:/home/user
+ - name: HOME
+ value: /tmp/home
+ - name: SSL_CERT_FILE
+ value: /tmp/bundle.crt
+ securityContext: {}
+ image: "insert-your-image-llm-tgi"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: llm-uservice
+ containerPort: 9000
+ protocol: TCP
+ volumeMounts:
+ - mountPath: /tmp/home
+ name: local-dir
+ - mountPath: /rhoai-ca
+ name: odh-ca-bundle
+ resources: {}
+ volumes:
+ - emptyDir:
+ sizeLimit: 5Gi
+ name: local-dir
+ - name: odh-ca-bundle
+ secret:
+ defaultMode: 420
+ secretName: rhoai-ca-bundle
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: null
+ containers:
+ - name: codegen
+ env:
+ - name: LLM_SERVICE_HOST_IP
+ value: codegen-llm-uservice
+ securityContext: null
+ image: "insert-your-image-codegen"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: codegen
+ containerPort: 7778
+ protocol: TCP
+ resources: null
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/managed-by: Helm
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/version: 1.0.0
+ helm.sh/chart: codegen-0.1.0
+ name: codegen
+spec:
+ port:
+ targetPort: codegen
+ to:
+ kind: Service
+ name: codegen
+ weight: 100
+ wildcardPolicy: None
diff --git a/CodeGen/openshift-rhoai/manifests/xeon/servingruntime-magicoder.yaml b/CodeGen/openshift-rhoai/manifests/xeon/servingruntime-magicoder.yaml
new file mode 100644
index 000000000..3e7a5ebf8
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/xeon/servingruntime-magicoder.yaml
@@ -0,0 +1,87 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+---
+apiVersion: template.openshift.io/v1
+kind: Template
+metadata:
+ annotations:
+ opendatahub.io/apiProtocol: REST
+ opendatahub.io/modelServingSupport: '["single"]'
+ labels:
+ opendatahub.io/dashboard: "true"
+ name: tgi-magicoder-s-ds-6.7b-cpu
+ namespace: redhat-ods-applications
+objects:
+- apiVersion: serving.kserve.io/v1alpha1
+ kind: ServingRuntime
+  metadata:
+    annotations:
+      openshift.io/display-name: Text Generation Inference Magicoder-S-DS-6.7B on CPU
+    labels:
+      opendatahub.io/dashboard: "true"
+    name: tgi-magicoder-s-ds-6.7b-cpu
+ spec:
+ containers:
+ - args:
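+      # the snapshot path below mirrors the S3 layout produced by the notebook upload step (served from HF_HUB_CACHE)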
+ - --model-id
+ - /mnt/models/--ise-uiuc--Magicoder-S-DS-6.7B/snapshots/b3ed7cb1578a3643ceaf2ebf996a3d8e85f75d8f
+ - --port=8080
+ - --json-output
+ env:
+ - name: NUMBA_CACHE_DIR
+ value: /tmp/hf_home
+ - name: HF_HOME
+ value: /tmp/hf_home
+ - name: HF_HUB_CACHE
+ value: /mnt/models
+ - name: HUGGING_FACE_HUB_TOKEN
+ value: "insert-your-huggingface-token-here"
+ - name: BATCH_BUCKET_SIZE
+ value: "22"
+ - name: PREFILL_BATCH_BUCKET_SIZE
+ value: "1"
+ - name: MAX_BATCH_PREFILL_TOKENS
+ value: "5102"
+ - name: MAX_BATCH_TOTAL_TOKENS
+ value: "32256"
+ - name: MAX_INPUT_LENGTH
+ value: "1024"
+ - name: PAD_SEQUENCE_TO_MULTIPLE_OF
+ value: "1024"
+ - name: MAX_WAITING_TOKENS
+ value: "5"
+ - name: OMPI_MCA_btl_vader_single_copy_mechanism
+ value: none
+ image: ghcr.io/huggingface/text-generation-inference:2.1.0
+ livenessProbe:
+ exec:
+ command:
+ - curl
+ - localhost:8080/health
+ initialDelaySeconds: 500
+ name: kserve-container
+ ports:
+ - containerPort: 8080
+ protocol: TCP
+ readinessProbe:
+ exec:
+ command:
+ - curl
+ - localhost:8080/health
+ initialDelaySeconds: 500
+      volumeMounts:
+      - mountPath: /data
+        name: model-volume
+      - mountPath: /dev/shm
+        name: shm
+ multiModel: false
+ supportedModelFormats:
+ - autoSelect: true
+ name: llm
+ volumes:
+ - emptyDir:
+ sizeLimit: 300Gi
+ name: model-volume
+ - emptyDir:
+ medium: Memory
+ sizeLimit: 40Gi
+ name: shm
diff --git a/CodeGen/openshift-rhoai/manifests/xeon/ui-server.yaml b/CodeGen/openshift-rhoai/manifests/xeon/ui-server.yaml
new file mode 100644
index 000000000..29dc9a25c
--- /dev/null
+++ b/CodeGen/openshift-rhoai/manifests/xeon/ui-server.yaml
@@ -0,0 +1,79 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ io.kompose.service: ui-server
+ template:
+ metadata:
+ labels:
+ io.kompose.service: ui-server
+ spec:
+ initContainers:
+ - name: copy-ui-to-workdir
+ image: "insert-your-image-codegen-ui"
+ command:
+ - /bin/bash
+ - -c
+ args:
+ - |
+ cp -v -r /home/user/* /tmp/temp-data/
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ containers:
+ - env:
+ - name: HOME
+ value: /tmp/temp-data
+ - name: BASIC_URL
+ value: http://insert-your-codegen-route/v1/codegen
+ image: "insert-your-image-codegen-ui"
+ name: ui-server
+ ports:
+ - containerPort: 5173
+ protocol: TCP
+ workingDir: /tmp/temp-data/svelte
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ restartPolicy: Always
+ volumes:
+ - name: temp-data
+ emptyDir: {}
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ port:
+ targetPort: 5173
+ to:
+ kind: Service
+ name: ui-server
+ weight: 100
+ wildcardPolicy: None
+---
+apiVersion: v1
+kind: Service
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ ports:
+ - name: "5173"
+ port: 5173
+ targetPort: 5173
+ selector:
+ io.kompose.service: ui-server
diff --git a/CodeGen/openshift/manifests/README.md b/CodeGen/openshift/manifests/README.md
new file mode 100644
index 000000000..2d49ce36c
--- /dev/null
+++ b/CodeGen/openshift/manifests/README.md
@@ -0,0 +1,133 @@
+# Deploy CodeGen on OpenShift Cluster
+
+## Prerequisites
+
+1. **Red Hat OpenShift Cluster** with a dynamic _StorageClass_ to provision _PersistentVolumes_ (e.g. **OpenShift Data Foundation**).
+2. Exposed image registry to push Docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
+3. Account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi) and a token with _Read_ permissions. Update the access token in your local copy of the repository using the following commands.
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift/manifests/xeon
+export HFTOKEN="YourOwnToken"
+sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift/manifests/gaudi
+export HFTOKEN="YourOwnToken"
+sed -i "s/insert-your-huggingface-token-here/${HFTOKEN}/g" codegen.yaml
+```
+
+## Deploy CodeGen
+
+1. Build the Docker images locally:
+
+- LLM Docker Image:
+
+```
+git clone https://github.com/opea-project/GenAIComps.git
+cd GenAIComps
+docker build -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
+```
+
+- MegaService Docker Image:
+
+```
+git clone https://github.com/opea-project/GenAIExamples
+cd GenAIExamples/CodeGen
+docker build -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+```
+
+- UI Docker Image:
+
+```
+cd GenAIExamples/CodeGen/ui
+docker build -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
+```
+
+To verify, run the command: `docker images`.
+
+2. Log in to Docker, tag the images and push them to the image registry with the following commands:
+
+```
+docker login -u <user> -p $(oc whoami -t) <registry-url>
+docker tag <image>:<tag> <registry-url>/<project>/<image>:<tag>
+docker push <registry-url>/<project>/<image>:<tag>
+```
+
+To verify, run the command: `oc get istag`.
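+
+For example (a hypothetical sketch; the registry route host and the `codegen` project name are assumptions, substitute your own):
+
+```
+REGISTRY=default-route-openshift-image-registry.apps.example.com
+docker login -u $(oc whoami) -p $(oc whoami -t) $REGISTRY
+docker tag opea/codegen:latest $REGISTRY/codegen/codegen:latest
+docker push $REGISTRY/codegen/codegen:latest
+```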
+
+3. Update the image names in the manifest files.
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift/manifests/xeon
+export IMAGE_LLM_TGI="YourImage"
+export IMAGE_CODEGEN="YourImage"
+export IMAGE_CODEGEN_UI="YourImage"
+sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift/manifests/gaudi
+export IMAGE_LLM_TGI="YourImage"
+export IMAGE_CODEGEN="YourImage"
+export IMAGE_CODEGEN_UI="YourImage"
+sed -i "s#insert-your-image-llm-tgi#${IMAGE_LLM_TGI}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen#${IMAGE_CODEGEN}#g" codegen.yaml
+sed -i "s#insert-your-image-codegen-ui#${IMAGE_CODEGEN_UI}#g" ui-server.yaml
+```
+
+4. Deploy CodeGen with the command:
+
+```
+oc apply -f codegen.yaml
+```
+
+5. Check the _codegen_ route with the command `oc get routes` and update the route in the _ui-server.yaml_ file.
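+
+The route host can also be captured directly (a convenience one-liner, assuming the route created above is named _codegen_):
+
+```
+export CODEGEN_ROUTE=$(oc get route codegen -o jsonpath='{.spec.host}')
+```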
+
+On Xeon:
+
+```
+cd GenAIExamples/CodeGen/openshift/manifests/xeon
+export CODEGEN_ROUTE="YourCodegenRoute"
+sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
+```
+
+On Gaudi:
+
+```
+cd GenAIExamples/CodeGen/openshift/manifests/gaudi
+export CODEGEN_ROUTE="YourCodegenRoute"
+sed -i "s/insert-your-codegen-route/${CODEGEN_ROUTE}/g" ui-server.yaml
+```
+
+6. Deploy the UI with the command:
+
+```
+oc apply -f ui-server.yaml
+```
+
+## Verify Services
+
+Make sure all the pods are running and READY 1/1 (this takes about 5 minutes).
+
+```
+oc get pods
+curl http://${CODEGEN_ROUTE}/v1/codegen -H "Content-Type: application/json" -d '{
+ "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."
+ }'
+```
+
+## Launch the UI
+
+To access the frontend, find the route for _ui-server_ with the command `oc get routes` and open it in the browser.
diff --git a/CodeGen/openshift/manifests/gaudi/codegen.yaml b/CodeGen/openshift/manifests/gaudi/codegen.yaml
new file mode 100644
index 000000000..164fe2e2a
--- /dev/null
+++ b/CodeGen/openshift/manifests/gaudi/codegen.yaml
@@ -0,0 +1,247 @@
+---
+# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen-tgi
+ labels:
+ helm.sh/chart: tgi-0.1.0
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.4"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 8080
+ targetPort: 8080
+ protocol: TCP
+ name: tgi
+ selector:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 9000
+ targetPort: 9000
+ protocol: TCP
+ name: llm-uservice
+ selector:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 7778
+ targetPort: 7778
+ protocol: TCP
+ name: codegen
+ selector:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen-tgi
+ labels:
+ helm.sh/chart: tgi-0.1.0
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.4"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: {}
+ containers:
+ - name: tgi
+ env:
+ - name: MODEL_ID
+ value: meta-llama/CodeLlama-7b-hf
+ - name: PORT
+ value: "8080"
+ - name: NUMBA_CACHE_DIR
+ value: /data
+ - name: HF_TOKEN
+ value: "insert-your-huggingface-token-here"
+ - name: PAD_SEQUENCE_TO_MULTIPLE_OF
+ value: "1024"
+ securityContext: {}
+ image: "ghcr.io/huggingface/tgi-gaudi:2.0.4"
+ imagePullPolicy: IfNotPresent
+ args:
+ - --max-input-length
+ - "1024"
+ - --max-total-tokens
+ - "2048"
+ volumeMounts:
+ - mountPath: /data
+ name: model-volume
+ - mountPath: /var/log/habana_logs
+ name: logs-volume
+ ports:
+ - name: http
+ containerPort: 8080
+ protocol: TCP
+ resources:
+ limits:
+ habana.ai/gaudi: 1
+ requests:
+ habana.ai/gaudi: 1
+ volumes:
+ - emptyDir:
+ sizeLimit: 50Gi
+ name: model-volume
+ - emptyDir:
+ name: logs-volume
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: {}
+ containers:
+ - name: codegen
+ env:
+ - name: TGI_LLM_ENDPOINT
+ value: "http://codegen-tgi:8080"
+ - name: HUGGINGFACEHUB_API_TOKEN
+ value: "insert-your-huggingface-token-here"
+ - name: PYTHONPATH
+ value: /home/user/.local/lib/python3.11/site-packages:/home/user
+ - name: HOME
+ value: /tmp/home
+ securityContext: {}
+ image: "insert-your-image-llm-tgi"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: llm-uservice
+ containerPort: 9000
+ protocol: TCP
+ startupProbe:
+ exec:
+ command:
+ - python
+ - -c
+ - 'import requests; req = requests.get("http://codegen-tgi:8080/info"); print(req)'
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ failureThreshold: 120
+ volumeMounts:
+ - mountPath: /tmp/home
+ name: local-dir
+ resources: {}
+ volumes:
+ - emptyDir:
+ sizeLimit: 5Gi
+ name: local-dir
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: null
+ containers:
+ - name: codegen
+ env:
+ - name: LLM_SERVICE_HOST_IP
+ value: codegen-llm-uservice
+ securityContext: null
+ image: "insert-your-image-codegen"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: codegen
+ containerPort: 7778
+ protocol: TCP
+ resources: null
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/managed-by: Helm
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/version: 1.0.0
+ helm.sh/chart: codegen-0.1.0
+ name: codegen
+spec:
+ port:
+ targetPort: codegen
+ to:
+ kind: Service
+ name: codegen
+ weight: 100
+ wildcardPolicy: None
diff --git a/CodeGen/openshift/manifests/gaudi/ui-server.yaml b/CodeGen/openshift/manifests/gaudi/ui-server.yaml
new file mode 100644
index 000000000..29dc9a25c
--- /dev/null
+++ b/CodeGen/openshift/manifests/gaudi/ui-server.yaml
@@ -0,0 +1,79 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ io.kompose.service: ui-server
+ template:
+ metadata:
+ labels:
+ io.kompose.service: ui-server
+ spec:
+ initContainers:
+ - name: copy-ui-to-workdir
+ image: "insert-your-image-codegen-ui"
+ command:
+ - /bin/bash
+ - -c
+ args:
+ - |
+ cp -v -r /home/user/* /tmp/temp-data/
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ containers:
+ - env:
+ - name: HOME
+ value: /tmp/temp-data
+ - name: BASIC_URL
+ value: http://insert-your-codegen-route/v1/codegen
+ image: "insert-your-image-codegen-ui"
+ name: ui-server
+ ports:
+ - containerPort: 5173
+ protocol: TCP
+ workingDir: /tmp/temp-data/svelte
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ restartPolicy: Always
+ volumes:
+ - name: temp-data
+ emptyDir: {}
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ port:
+ targetPort: 5173
+ to:
+ kind: Service
+ name: ui-server
+ weight: 100
+ wildcardPolicy: None
+---
+apiVersion: v1
+kind: Service
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ ports:
+ - name: "5173"
+ port: 5173
+ targetPort: 5173
+ selector:
+ io.kompose.service: ui-server
diff --git a/CodeGen/openshift/manifests/xeon/codegen.yaml b/CodeGen/openshift/manifests/xeon/codegen.yaml
new file mode 100644
index 000000000..fe6be8c18
--- /dev/null
+++ b/CodeGen/openshift/manifests/xeon/codegen.yaml
@@ -0,0 +1,233 @@
+---
+# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen-tgi
+ labels:
+ helm.sh/chart: tgi-0.1.0
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.4"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 8080
+ targetPort: 8080
+ protocol: TCP
+ name: tgi
+ selector:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 9000
+ targetPort: 9000
+ protocol: TCP
+ name: llm-uservice
+ selector:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ type: ClusterIP
+ ports:
+ - port: 7778
+ targetPort: 7778
+ protocol: TCP
+ name: codegen
+ selector:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen-tgi
+ labels:
+ helm.sh/chart: tgi-0.1.0
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.4"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: tgi
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: {}
+ containers:
+ - name: tgi
+ env:
+ - name: MODEL_ID
+ value: ise-uiuc/Magicoder-S-DS-6.7B
+ - name: PORT
+ value: "8080"
+ - name: NUMBA_CACHE_DIR
+ value: /data
+ - name: HF_TOKEN
+ value: "insert-your-huggingface-token-here"
+ securityContext: {}
+ image: "ghcr.io/huggingface/text-generation-inference:2.1.0"
+ imagePullPolicy: IfNotPresent
+ volumeMounts:
+ - mountPath: /data
+ name: model-volume
+ ports:
+ - name: http
+ containerPort: 8080
+ protocol: TCP
+ resources: {}
+ volumes:
+ - emptyDir:
+ sizeLimit: 50Gi
+ name: model-volume
+
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen-llm-uservice
+ labels:
+ helm.sh/chart: llm-uservice-0.1.0
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: llm-uservice
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: {}
+ containers:
+ - name: codegen
+ env:
+ - name: TGI_LLM_ENDPOINT
+ value: "http://codegen-tgi:8080"
+ - name: HUGGINGFACEHUB_API_TOKEN
+ value: "insert-your-huggingface-token-here"
+ - name: PYTHONPATH
+ value: /home/user/.local/lib/python3.11/site-packages:/home/user
+ - name: HOME
+ value: /tmp/home
+ securityContext: {}
+ image: "insert-your-image-llm-tgi"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: llm-uservice
+ containerPort: 9000
+ protocol: TCP
+ startupProbe:
+ exec:
+ command:
+ - python
+ - -c
+ - 'import requests; req = requests.get("http://codegen-tgi:8080"); print(req)'
+ initialDelaySeconds: 5
+ periodSeconds: 5
+ failureThreshold: 120
+ volumeMounts:
+ - mountPath: /tmp/home
+ name: local-dir
+ resources: {}
+ volumes:
+ - emptyDir:
+ sizeLimit: 5Gi
+ name: local-dir
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: codegen
+ labels:
+ helm.sh/chart: codegen-0.1.0
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/version: "1.0.0"
+ app.kubernetes.io/managed-by: Helm
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ template:
+ metadata:
+ labels:
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/instance: codegen
+ spec:
+ securityContext: null
+ containers:
+ - name: codegen
+ env:
+ - name: LLM_SERVICE_HOST_IP
+ value: codegen-llm-uservice
+ securityContext: null
+ image: "insert-your-image-codegen"
+ imagePullPolicy: IfNotPresent
+ ports:
+ - name: codegen
+ containerPort: 7778
+ protocol: TCP
+ resources: null
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ app.kubernetes.io/instance: codegen
+ app.kubernetes.io/managed-by: Helm
+ app.kubernetes.io/name: codegen
+ app.kubernetes.io/version: 1.0.0
+ helm.sh/chart: codegen-0.1.0
+ name: codegen
+spec:
+ port:
+ targetPort: codegen
+ to:
+ kind: Service
+ name: codegen
+ weight: 100
+ wildcardPolicy: None
diff --git a/CodeGen/openshift/manifests/xeon/ui-server.yaml b/CodeGen/openshift/manifests/xeon/ui-server.yaml
new file mode 100644
index 000000000..29dc9a25c
--- /dev/null
+++ b/CodeGen/openshift/manifests/xeon/ui-server.yaml
@@ -0,0 +1,79 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ io.kompose.service: ui-server
+ template:
+ metadata:
+ labels:
+ io.kompose.service: ui-server
+ spec:
+ initContainers:
+ - name: copy-ui-to-workdir
+ image: "insert-your-image-codegen-ui"
+ command:
+ - /bin/bash
+ - -c
+ args:
+ - |
+ cp -v -r /home/user/* /tmp/temp-data/
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ containers:
+ - env:
+ - name: HOME
+ value: /tmp/temp-data
+ - name: BASIC_URL
+ value: http://insert-your-codegen-route/v1/codegen
+ image: "insert-your-image-codegen-ui"
+ name: ui-server
+ ports:
+ - containerPort: 5173
+ protocol: TCP
+ workingDir: /tmp/temp-data/svelte
+ volumeMounts:
+ - name: temp-data
+ mountPath: /tmp/temp-data
+ restartPolicy: Always
+ volumes:
+ - name: temp-data
+ emptyDir: {}
+---
+apiVersion: route.openshift.io/v1
+kind: Route
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ port:
+ targetPort: 5173
+ to:
+ kind: Service
+ name: ui-server
+ weight: 100
+ wildcardPolicy: None
+---
+apiVersion: v1
+kind: Service
+metadata:
+ labels:
+ io.kompose.service: ui-server
+ name: ui-server
+spec:
+ ports:
+ - name: "5173"
+ port: 5173
+ targetPort: 5173
+ selector:
+ io.kompose.service: ui-server