CodeGen on OpenShift #509

Open
wants to merge 2 commits into
base: main
15 changes: 15 additions & 0 deletions helm-charts/codegen-openshift-rhoai/Chart.yaml
@@ -0,0 +1,15 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: codegen
description: A Helm chart for deploying codegen on Red Hat OpenShift with Red Hat OpenShift AI
dependencies:
  - name: llm-uservice
    version: 1.0.0
    repository: "file://llm-uservice"
  - name: react-ui
    version: 0.1.0
    repository: "file://react-ui"
type: application
version: 1.0.0
124 changes: 124 additions & 0 deletions helm-charts/codegen-openshift-rhoai/README.md
@@ -0,0 +1,124 @@
# CodeGen

Helm chart for deploying the CodeGen service on Red Hat OpenShift with Red Hat OpenShift AI.

The serving runtime template in this example uses the model _ise-uiuc/Magicoder-S-DS-6.7B_ for Xeon and _meta-llama/CodeLlama-7b-hf_ for Gaudi.

## Prerequisites

1. **Red Hat OpenShift Cluster** with a dynamic _StorageClass_ to provision _PersistentVolumes_ (e.g. **OpenShift Data Foundation**) and the following Operators installed: **Red Hat - Authorino (Technical Preview)**, **Red Hat OpenShift Service Mesh**, **Red Hat OpenShift Serverless** and **Red Hat OpenShift AI**.
2. An image registry to push Docker images to (https://docs.openshift.com/container-platform/4.16/registry/securing-exposing-registry.html).
3. Access to an S3-compatible object storage bucket (e.g. **OpenShift Data Foundation**, **AWS S3**), together with its access key, secret access key and S3 endpoint (https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/managing_hybrid_and_multicloud_resources/accessing-the-multicloud-object-gateway-with-your-applications_rhodf#accessing-the-multicloud-object-gateway-with-your-applications_rhodf).
4. An account on https://huggingface.co/, access to the model _ise-uiuc/Magicoder-S-DS-6.7B_ (for Xeon) or _meta-llama/CodeLlama-7b-hf_ (for Gaudi), and a token with Read permissions.

## Deploy model in Red Hat OpenShift AI

1. Log in to the OpenShift CLI and run the following commands to create a new serving runtime and the _hf-token_ secret.

```
cd GenAIInfra/helm-charts/codegen-openshift-rhoai/
export HFTOKEN="insert-your-huggingface-token-here"

# On Xeon:
helm install servingruntime tgi --set global.huggingfacehubApiToken=${HFTOKEN}

# On Gaudi:
helm install servingruntime tgi --set global.huggingfacehubApiToken=${HFTOKEN} --values tgi/gaudi-values.yaml
```

Verify that the template has been created with the command `oc get template -n redhat-ods-applications`.

2. Find the route for the **Red Hat OpenShift AI** dashboard with the command below and open it in the browser:

```
oc get routes -A | grep rhods-dashboard
```

3. Go to **Data Science Project** and click **Create data science project**. Fill in the **Name** and click **Create**.
4. Go to the **Workbenches** tab and click **Create workbench**. Fill in the **Name**, under **Notebook image** choose _Standard Data Science_, under **Cluster storage** choose _Create new persistent storage_ and change **Persistent storage size** to 40 GB. Click **Create workbench**.
5. Open the newly created Jupyter notebook and run the following commands to download the model and upload it to S3:

```
%env S3_ENDPOINT=<S3_RGW_ROUTE>
%env S3_ACCESS_KEY=<AWS_ACCESS_KEY_ID>
%env S3_SECRET_KEY=<AWS_SECRET_ACCESS_KEY>
%env HF_TOKEN=<PASTE_HUGGINGFACE_TOKEN>
```

```
!pip install huggingface-hub
```

```
import os
import boto3
import botocore
import glob
from huggingface_hub import snapshot_download
bucket_name = 'first.bucket'
s3_endpoint = os.environ.get('S3_ENDPOINT')
s3_accesskey = os.environ.get('S3_ACCESS_KEY')
s3_secretkey = os.environ.get('S3_SECRET_KEY')
path = 'models'
hf_token = os.environ.get('HF_TOKEN')
session = boto3.session.Session()
s3_resource = session.resource('s3',
                               endpoint_url=s3_endpoint,
                               verify=False,
                               aws_access_key_id=s3_accesskey,
                               aws_secret_access_key=s3_secretkey)
bucket = s3_resource.Bucket(bucket_name)
```

For Xeon download _ise-uiuc/Magicoder-S-DS-6.7B_:

```
snapshot_download("ise-uiuc/Magicoder-S-DS-6.7B", cache_dir='./models', token=hf_token)
```

For Gaudi download _meta-llama/CodeLlama-7b-hf_:

```
snapshot_download("meta-llama/CodeLlama-7b-hf", cache_dir='./models', token=hf_token)
```

Upload the downloaded model to S3:

```
# Upload only regular files that live under a "snapshots" directory.
files = (file for file in glob.glob(f'{path}/**/*', recursive=True) if os.path.isfile(file) and "snapshots" in file)
for filename in files:
    s3_name = filename.replace(path, '')
    print(f'Uploading: {filename} to {path}{s3_name}')
    bucket.upload_file(filename, f'{path}{s3_name}')
```
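
The object keys produced by the upload loop are not obvious at first glance, because `filename.replace(path, '')` removes every occurrence of `models`, not only the leading directory. The sketch below reproduces just that string logic without touching S3. The helper name `s3_key_for` is hypothetical, the directory layout (`models--<org>--<name>/snapshots/<commit>/`) is assumed to be what `snapshot_download` writes into `cache_dir`, and `<commit>` is a placeholder rather than a real snapshot hash.

```python
def s3_key_for(filename, path="models"):
    """Mirror the upload loop: drop every occurrence of `path`, then prefix the key with it."""
    s3_name = filename.replace(path, "")
    return f"{path}{s3_name}"


# Hypothetical local file; <commit> stands in for the real snapshot hash.
local_file = "models/models--ise-uiuc--Magicoder-S-DS-6.7B/snapshots/<commit>/config.json"

print(s3_key_for(local_file))
# models/--ise-uiuc--Magicoder-S-DS-6.7B/snapshots/<commit>/config.json
```

Under these assumptions the `models--` prefix of the cache directory is stripped as well, so the objects end up under `models/--ise-uiuc--.../snapshots/...` in the bucket.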

6. Go to your project in the **Red Hat OpenShift AI** dashboard, open the **Models** tab and click **Deploy model** under _Single-model serving platform_. Fill in the **Name**, choose the newly created **Serving runtime**: _Text Generation Inference Magicoder-S-DS-6.7B on CPU_ (for Xeon) or _Text Generation Inference CodeLlama-7b-hf on Gaudi_ (for Gaudi), set **Model framework** to _llm_ and change **Model server size** to _Custom_: 16 CPUs and 64 Gi memory. For deployment on Gaudi, select the proper **Accelerator**. In the **Model route** section, check the box to create an external route and uncheck **Token authentication**. Under **Model location** choose _New data connection_ and fill in all required fields for S3 access, with **Bucket** _first.bucket_ and **Path** _models_. Click **Deploy**. It takes about 10 minutes to reach _Loaded_ status.\
If the model does not reach _Loaded_ status and the revision status changes to "ProgressDeadlineExceeded" (check with `oc get revision`), scale the model deployment to 0 and then back to 1 (`oc scale deployment.apps/<model_deployment_name> --replicas=0`, then the same command with `--replicas=1`) and wait about 10 minutes for the deployment.

## Install the Chart

To install the chart, return to the OpenShift CLI, switch to your project and run the following:

```console
cd GenAIInfra/helm-charts/

export NAMESPACE="insert-your-namespace-here"
export CLUSTERDOMAIN="$(oc get Ingress.config.openshift.io/cluster -o jsonpath='{.spec.domain}' | sed 's/^apps\.//')"
# MODELNAME is the *Name* from step 6 in "Deploy model in Red Hat OpenShift AI"
export MODELNAME="insert-name-of-deployed-model-here"
export PROJECT="insert-project-name-where-model-is-deployed"

sed -i "s/insert-your-namespace-here/${NAMESPACE}/g" codegen-openshift-rhoai/llm-uservice/values.yaml

./update_dependency.sh
helm dependency update codegen-openshift-rhoai

helm install codegen codegen-openshift-rhoai \
  --set image.repository=image-registry.openshift-image-registry.svc:5000/${NAMESPACE}/codegen \
  --set llm-uservice.image.repository=image-registry.openshift-image-registry.svc:5000/${NAMESPACE}/llm-tgi \
  --set react-ui.image.repository=image-registry.openshift-image-registry.svc:5000/${NAMESPACE}/react-ui \
  --set global.clusterDomain=${CLUSTERDOMAIN} \
  --set global.huggingfacehubApiToken=${HFTOKEN} \
  --set llm-uservice.servingRuntime.name=${MODELNAME} \
  --set llm-uservice.servingRuntime.namespace=${PROJECT}
```
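
The values passed above end up in the `TGI_LLM_ENDPOINT` variable of the llm-uservice Deployment, which is rendered as `https://<name>-<namespace>.apps.<clusterDomain>`. The following sketch shows how `CLUSTERDOMAIN`, `MODELNAME` and `PROJECT` combine, which can help when checking the values before installing. The function names and the example domain, model and project names are hypothetical.

```python
def strip_apps_prefix(ingress_domain):
    """Drop a leading 'apps.' label, as the sed expression in the install step does."""
    prefix = "apps."
    if ingress_domain.startswith(prefix):
        return ingress_domain[len(prefix):]
    return ingress_domain


def tgi_endpoint(model_name, project, cluster_domain):
    """Build the URL the same way the Deployment template formats TGI_LLM_ENDPOINT."""
    return f"https://{model_name}-{project}.apps.{cluster_domain}"


# Hypothetical values, for illustration only.
cluster_domain = strip_apps_prefix("apps.cluster.example.com")
print(tgi_endpoint("magicoder", "demo", cluster_domain))
# https://magicoder-demo.apps.cluster.example.com
```

If the printed URL does not match the external route shown for the deployed model in the dashboard, one of the three variables is likely set incorrectly.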

## Verify

To verify the installation, run `oc get pods` and make sure all pods are running. Building the images takes about 5 minutes. Once 4 pods reach _Completed_ status, the remaining pods, which back the services, should move to _Running_.
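
The same check can be scripted against the JSON form of the pod list. This is a minimal sketch, assuming the output of `oc get pods -o json` has been saved and parsed into a dictionary; pods shown as _Completed_ by `oc get pods` report the phase `Succeeded`. The function names and the sample pod list are hypothetical.

```python
from collections import Counter


def phase_counts(pod_list):
    """Count pods per phase in a parsed `oc get pods -o json` document."""
    return Counter(item["status"]["phase"] for item in pod_list["items"])


def install_looks_ready(pod_list, expected_completed=4):
    """True when enough pods have completed and every other pod is Running."""
    counts = phase_counts(pod_list)
    unexpected = sum(n for phase, n in counts.items()
                     if phase not in ("Succeeded", "Running"))
    return (counts["Succeeded"] >= expected_completed
            and counts["Running"] > 0
            and unexpected == 0)


# Hypothetical sample: four finished pods and three running service pods.
sample = {"items": [{"status": {"phase": "Succeeded"}}] * 4
                   + [{"status": {"phase": "Running"}}] * 3}
print(install_looks_ready(sample))
```

With real data, the dictionary would come from something like `json.load(open("pods.json"))` after running `oc get pods -o json > pods.json`.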

## Launch the UI

To access the frontend, find the route for _react-ui_ with the command `oc get routes` and open it in the browser.
8 changes: 8 additions & 0 deletions helm-charts/codegen-openshift-rhoai/llm-uservice/Chart.yaml
@@ -0,0 +1,8 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: llm-uservice
description: A Helm chart for deploying llm-uservice on Red Hat OpenShift with Red Hat OpenShift AI
type: application
version: 1.0.0
@@ -0,0 +1,51 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "llm-uservice.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "llm-uservice.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "llm-uservice.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "llm-uservice.labels" -}}
helm.sh/chart: {{ include "llm-uservice.chart" . }}
{{ include "llm-uservice.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "llm-uservice.selectorLabels" -}}
app.kubernetes.io/name: {{ include "llm-uservice.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
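
The naming rule in the `llm-uservice.fullname` helper above can be easier to follow outside of template syntax. The sketch below restates it in Python for illustration only; it is not part of the chart, and the function names are hypothetical. It assumes `trunc 63` keeps the first 63 characters and `trimSuffix "-"` removes a single trailing hyphen.

```python
def _trunc63(value):
    """Keep at most 63 characters, then drop one trailing '-' if present."""
    value = value[:63]
    return value[:-1] if value.endswith("-") else value


def fullname(release_name, chart_name, name_override="", fullname_override=""):
    """Python restatement of the fullname helper's branching."""
    if fullname_override:
        return _trunc63(fullname_override)
    name = name_override or chart_name
    if name in release_name:
        # The release name already contains the chart name, so use it as is.
        return _trunc63(release_name)
    return _trunc63(f"{release_name}-{name}")


print(fullname("codegen", "llm-uservice"))
# codegen-llm-uservice
```

With the release name `codegen` used in the README, the llm-uservice resources would therefore be named `codegen-llm-uservice`, unless an override is set.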
@@ -0,0 +1,30 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

kind: BuildConfig
apiVersion: build.openshift.io/v1
metadata:
  name: {{ include "llm-uservice.fullname" . }}
  namespace: {{ .Release.Namespace }}
spec:
  output:
    to:
      kind: "ImageStreamTag"
      name: "llm-tgi:latest"
  failedBuildsHistoryLimit: 5
  successfulBuildsHistoryLimit: 5
  nodeSelector: null
  postCommit: {}
  resources: {}
  runPolicy: SerialLatestOnly
  source:
    git:
      ref: {{ .Values.source.gitRef }}
      uri: {{ .Values.source.gitUri }}
    type: {{ .Values.source.type }}
  strategy:
    type: {{ .Values.strategy.type }}
    dockerStrategy:
      dockerfilePath: {{ .Values.strategy.dockerfilePath }}
  triggers:
    - type: ConfigChange
@@ -0,0 +1,17 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: create-rhoai-istio-rootca-cert-secret
data:
  create-rhoai-istio-rootca-cert-secret.sh: |
    #!/bin/bash
    EXISTS=$(oc get secret --ignore-not-found rhoai-ca-bundle)

    if [[ -z "${EXISTS}" ]]; then
      oc create secret generic -n {{ .Release.Namespace }} rhoai-ca-bundle --from-literal=tls.crt="$(oc extract secret/knative-serving-cert -n istio-system --to=- --keys=tls.crt)"
    else
      echo "oc get secret --ignore-not-found rhoai-ca-bundle returned non-empty string, not creating a secret"
    fi
@@ -0,0 +1,66 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "llm-uservice.fullname" . }}
  labels:
    {{- include "llm-uservice.labels" . | nindent 4 }}
  annotations:
    image.openshift.io/triggers: '[{"from":{"kind":"ImageStreamTag","name":"llm-tgi:latest"},"fieldPath":"spec.template.spec.containers[?(@.name==\"codegen\")].image"}]'
spec:
  replicas: 1
  selector:
    matchLabels:
      {{- include "llm-uservice.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "llm-uservice.selectorLabels" . | nindent 8 }}
    spec:
      securityContext: {}
      containers:
        - name: {{ .Release.Name }}
          command:
            - /bin/bash
            - -c
            - |
              cp /usr/lib/ssl/cert.pem /tmp/bundle.crt && \
              cat /rhoai-ca/tls.crt | tee -a '/tmp/bundle.crt' && \
              bash ./entrypoint.sh
          env:
            - name: TGI_LLM_ENDPOINT
              value: "https://{{ .Values.servingRuntime.name }}-{{ .Values.servingRuntime.namespace }}.apps.{{ .Values.global.clusterDomain }}"
            - name: HUGGINGFACEHUB_API_TOKEN
              valueFrom:
                secretKeyRef:
                  key: HUGGING_FACE_HUB_TOKEN
                  name: hf-token
            - name: PYTHONPATH
              value: {{ .Values.PYTHONPATH | quote }}
            - name: HOME
              value: {{ .Values.HOME | quote }}
            - name: SSL_CERT_FILE
              value: /tmp/bundle.crt
          securityContext: {}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: llm-uservice
              containerPort: 9000
              protocol: TCP
          volumeMounts:
            - mountPath: /tmp/home
              name: local-dir
            - mountPath: /rhoai-ca
              name: odh-ca-bundle
          resources: {}
      volumes:
        - emptyDir:
            sizeLimit: 5Gi
          name: local-dir
        - name: odh-ca-bundle
          secret:
            defaultMode: 420
            secretName: rhoai-ca-bundle
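
One detail of the Deployment above is worth spelling out: the `image.openshift.io/triggers` annotation selects the container by the literal name `codegen`, while the container itself is named from `.Release.Name`. The two line up when the chart is installed as `helm install codegen ...`, as in the README. The sketch below parses the annotation value and makes that dependency explicit; the helper names are hypothetical and the regular expression is only an illustration, not how OpenShift evaluates the field path.

```python
import json
import re

# Annotation value as it appears in the Deployment metadata.
ANNOTATION = r'[{"from":{"kind":"ImageStreamTag","name":"llm-tgi:latest"},"fieldPath":"spec.template.spec.containers[?(@.name==\"codegen\")].image"}]'


def trigger_container_names(annotation):
    """Return the container names referenced by each trigger's fieldPath."""
    names = []
    for trigger in json.loads(annotation):
        match = re.search(r'@\.name=="([^"]+)"', trigger["fieldPath"])
        if match:
            names.append(match.group(1))
    return names


def trigger_matches_release(annotation, release_name):
    """True when the trigger targets the container named after the release."""
    return release_name in trigger_container_names(annotation)


print(trigger_container_names(ANNOTATION))
print(trigger_matches_release(ANNOTATION, "codegen"))
```

Under this reading, installing the chart under a different release name would leave the trigger pointing at a container name that does not exist in the pod template.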
@@ -0,0 +1,12 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

---
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: llm-tgi
  namespace: {{ .Release.Namespace }}
spec:
  lookupPolicy:
    local: true
@@ -0,0 +1,34 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: batch/v1
kind: Job
metadata:
  name: create-rhoai-istio-rootca-cert-secret
spec:
  template:
    spec:
      containers:
        - image: {{ .Values.job.image }}
          command:
            - /bin/bash
            - -c
            - |
              oc wait --for=condition=ReconcileComplete=True dsc/rhods-datasciencecluster --timeout=-1s
              oc wait --for condition=Ready=True knativeserving -n knative-serving knative-serving --timeout=-1s
              '/tmp/create-rhoai-istio-rootca-cert-secret.sh'
          name: create-rhoai-istio-rootca-cert-secret
          volumeMounts:
            - mountPath: /tmp/create-rhoai-istio-rootca-cert-secret.sh
              name: create-rhoai-istio-rootca-cert-secret
              subPath: create-rhoai-istio-rootca-cert-secret.sh
      volumes:
        - name: create-rhoai-istio-rootca-cert-secret
          configMap:
            name: create-rhoai-istio-rootca-cert-secret
            defaultMode: 0755
      dnsPolicy: ClusterFirst
      restartPolicy: Never
      serviceAccount: {{ .Values.serviceAccountName }}
      serviceAccountName: {{ .Values.serviceAccountName }}
      terminationGracePeriodSeconds: 400
@@ -0,0 +1,15 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeadmin-rhoai-cluster-admin-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: 'kube:admin'