Migration Suggestion #959

Closed
23 changes: 23 additions & 0 deletions helm-charts/agent/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
19 changes: 19 additions & 0 deletions helm-charts/agent/Chart.yaml
@@ -0,0 +1,19 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: agent
description: The Helm chart for deploying agent microservice
type: application
version: 0-latest
# The agent microservice server version
appVersion: "v1.0"
dependencies:
  - name: tgi
    version: 0-latest
    repository: file://../tgi
    condition: tgi.enabled
  - name: vllm
    version: 0-latest
    repository: file://../vllm
    condition: vllm.enabled
46 changes: 46 additions & 0 deletions helm-charts/agent/README.md
@@ -0,0 +1,46 @@
# agent

Helm chart for deploying Agent microservice.

The agent depends on an LLM service. Set `llm_endpoint_url` to the URL of that LLM endpoint.

## Deploy

### Use external LLM endpoint

helm install agent oci://ghcr.io/opea-project/charts/agent --set llm_endpoint_url=${YOUR_LLM_ENDPOINT}

### Deploy with tgi

helm install agent oci://ghcr.io/opea-project/charts/agent --set tgi.enabled=True

### Deploy with vllm

helm install agent oci://ghcr.io/opea-project/charts/agent --set vllm.enabled=True
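
The three modes above differ mainly in where `llm_endpoint_url` ends up pointing. The following is a minimal Python sketch of the fallback rule as it appears in this chart's `templates/configmap.yaml`: an explicit `llm_endpoint_url` wins, otherwise the value defaults to `http://<release>-tgi`. The function name `resolve_llm_endpoint` is hypothetical and exists only for illustration; it is not part of the chart.

```python
def resolve_llm_endpoint(release_name: str, llm_endpoint_url: str = "") -> str:
    """Mirror the llm_endpoint_url branch of templates/configmap.yaml.

    Hypothetical helper for illustration only; the chart itself does this
    with a Go template `if`/`else`, not with Python.
    """
    if llm_endpoint_url:
        # An explicitly configured endpoint is used verbatim.
        return llm_endpoint_url
    # Otherwise the template falls back to the tgi service of the release.
    return f"http://{release_name}-tgi"
```

With the release name `agent` used in the commands above, an empty value resolves to `http://agent-tgi`. As the template in this PR is written, the fallback names the tgi service regardless of which subchart is enabled, so a vllm deployment may need `llm_endpoint_url` set explicitly.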

## Verify

To verify the installation, run the command `kubectl get pod` to make sure all pods are running.

Then run the command `kubectl port-forward svc/agent 9090:9090` to expose the agent service for access.

Open another terminal and run the following command to verify that the service is working:

```console
curl http://localhost:9090/v1/chat/completions \
-X POST \
-H 'Content-Type: application/json' \
-d '{"query":"What is OPEA?"}'
```
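
The same check can be scripted. Below is a minimal sketch using only the Python standard library, assuming the port-forward from the previous step is still active on `localhost:9090`. The helper names `build_agent_request` and `ask_agent` and the 60-second timeout are assumptions made for this example, and the response is returned as raw text because its format is not described here.

```python
import json
import urllib.request


def build_agent_request(
    query: str, base_url: str = "http://localhost:9090"
) -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_agent(
    query: str, base_url: str = "http://localhost:9090", timeout: float = 60.0
) -> str:
    """Send the request and return the raw response body as text."""
    request = build_agent_request(query, base_url)
    # urlopen raises urllib.error.URLError if the service is unreachable.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `ask_agent("What is OPEA?")` sends the request shown in the curl command. A connection error usually means the port-forward is not running or the pod is not ready yet.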

## Options

For global options, see Global Options.

| Key | Type | Default | Description |
| ------------------------------- | ------ | ------------------------ | ------------------------------- |
| global.HUGGINGFACEHUB_API_TOKEN | string | `""` | Your own Hugging Face API token |
| image.repository | string | `"opea/agent-langchain"` | |
| service.port | string | `"9090"` | |
| llm_endpoint_url | string | `""` | LLM endpoint |
| global.monitoring | bool | `false` | Service usage metrics |
1 change: 1 addition & 0 deletions helm-charts/agent/ci-gaudi-values.yaml
38 changes: 38 additions & 0 deletions helm-charts/agent/gaudi-values.yaml
@@ -0,0 +1,38 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Accelerate inferencing in heaviest components to improve performance
# by overriding their subchart values

tgi:
  enabled: true
  accelDevice: "gaudi"
  image:
    repository: ghcr.io/huggingface/tgi-gaudi
    tag: "2.0.6"
  resources:
    limits:
      habana.ai/gaudi: 4
  MAX_INPUT_LENGTH: "4096"
  MAX_TOTAL_TOKENS: "8192"
  CUDA_GRAPHS: ""
  OMPI_MCA_btl_vader_single_copy_mechanism: "none"
  PT_HPU_ENABLE_LAZY_COLLECTIVES: "true"
  ENABLE_HPU_GRAPH: "true"
  LIMIT_HPU_GRAPH: "true"
  USE_FLASH_ATTENTION: "true"
  FLASH_ATTENTION_RECOMPUTE: "true"
  extraCmdArgs: ["--sharded","true","--num-shard","4"]
  livenessProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
  readinessProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
  startupProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
    failureThreshold: 120
62 changes: 62 additions & 0 deletions helm-charts/agent/templates/_helpers.tpl
@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "agent.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "agent.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "agent.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "agent.labels" -}}
helm.sh/chart: {{ include "agent.chart" . }}
{{ include "agent.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "agent.selectorLabels" -}}
app.kubernetes.io/name: {{ include "agent.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "agent.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "agent.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
66 changes: 66 additions & 0 deletions helm-charts/agent/templates/configmap.yaml
@@ -0,0 +1,66 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "agent.fullname" . }}-config
  labels:
    {{- include "agent.labels" . | nindent 4 }}
data:
  {{- if .Values.tools }}
  tools: {{ .Values.tools | quote }}
  {{- end }}
  {{- if .Values.llm_endpoint_url }}
  llm_endpoint_url: {{ .Values.llm_endpoint_url | quote }}
  {{- else }}
  llm_endpoint_url: "http://{{ .Release.Name }}-tgi"
  {{- end }}
  # {{- if .Values.port }}
  # port: {{ .Values.port | quote }}
  # {{- end }}
  {{- if .Values.model }}
  model: {{ .Values.model | quote }}
  {{- end }}
  {{- if .Values.streaming }}
  streaming: {{ .Values.streaming | quote }}
  {{- end }}
  {{- if .Values.temperature }}
  temperature: {{ .Values.temperature | quote }}
  {{- end }}
  {{- if .Values.RETRIEVAL_TOOL_URL }}
  RETRIEVAL_TOOL_URL: {{ .Values.RETRIEVAL_TOOL_URL | quote }}
  {{- else }}
  RETRIEVAL_TOOL_URL: "http://{{ .Release.Name }}-docretriever:8889/v1/retrievaltool"
  {{- end }}
  {{- if .Values.CRAG_SERVER }}
  CRAG_SERVER: {{ .Values.CRAG_SERVER | quote }}
  {{- else }}
  CRAG_SERVER: "http://{{ .Release.Name }}-crag:8080"
  {{- end }}
  {{- if .Values.WORKER_AGENT_URL }}
  WORKER_AGENT_URL: {{ .Values.WORKER_AGENT_URL | quote }}
  {{- else }}
  WORKER_AGENT_URL: "http://{{ .Release.Name }}-worker:9095/v1/chat/completions"
  {{- end }}
  require_human_feedback: {{ .Values.require_human_feedback | quote }}
  recursion_limit: {{ .Values.recursion_limit | quote }}
  llm_engine: {{ .Values.llm_engine | quote }}
  strategy: {{ .Values.strategy | quote }}
  max_new_tokens: {{ .Values.max_new_tokens | quote }}
  {{- if .Values.OPENAI_API_KEY }}
  OPENAI_API_KEY: {{ .Values.OPENAI_API_KEY | quote }}
  {{- end }}
  HUGGINGFACEHUB_API_TOKEN: {{ .Values.global.HUGGINGFACEHUB_API_TOKEN | quote }}
  HF_HOME: "/tmp/.cache/huggingface"
  {{- if .Values.global.HF_ENDPOINT }}
  HF_ENDPOINT: {{ .Values.global.HF_ENDPOINT | quote }}
  {{- end }}
  http_proxy: {{ .Values.global.http_proxy | quote }}
  https_proxy: {{ .Values.global.https_proxy | quote }}
  {{- if and (not .Values.TGI_LLM_ENDPOINT) (or .Values.global.http_proxy .Values.global.https_proxy) }}
  no_proxy: "{{ .Release.Name }}-tgi,{{ .Values.global.no_proxy }}"
  {{- else }}
  no_proxy: {{ .Values.global.no_proxy | quote }}
  {{- end }}
  LOGFLAG: {{ .Values.LOGFLAG | quote }}
100 changes: 100 additions & 0 deletions helm-charts/agent/templates/deployment.yaml
@@ -0,0 +1,100 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "agent.fullname" . }}
  labels:
    {{- include "agent.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "agent.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "agent.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Release.Name }}
          envFrom:
            - configMapRef:
                name: {{ include "agent.fullname" . }}-config
            {{- if .Values.global.extraEnvConfig }}
            - configMapRef:
                name: {{ .Values.global.extraEnvConfig }}
                optional: true
            {{- end }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          {{- if .Values.image.pullPolicy }}
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          {{- end }}
          ports:
            - name: agent
              containerPort: 9090
              protocol: TCP
          volumeMounts:
            {{- if .Values.toolPath }}
            - mountPath: /home/user/tools
              name: tool
            {{- end }}
            - mountPath: /tmp
              name: tmp
          {{- if .Values.livenessProbe }}
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          {{- end }}
          {{- if .Values.readinessProbe }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          {{- end }}
          {{- if .Values.startupProbe }}
          startupProbe:
            {{- toYaml .Values.startupProbe | nindent 12 }}
          {{- end }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      volumes:
        {{- if .Values.toolPath }}
        - name: tool
          hostPath:
            path: {{ .Values.toolPath }}
            type: Directory
        {{- end }}
        - name: tmp
          emptyDir: {}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- if .Values.evenly_distributed }}
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              {{- include "agent.selectorLabels" . | nindent 14 }}
      {{- end }}
18 changes: 18 additions & 0 deletions helm-charts/agent/templates/service.yaml
@@ -0,0 +1,18 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: Service
metadata:
  name: {{ include "agent.fullname" . }}
  labels:
    {{- include "agent.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: 9090
      protocol: TCP
      name: agent
  selector:
    {{- include "agent.selectorLabels" . | nindent 4 }}