task: Adding maintenance operator charts [WIP] #1191

Open. Wants to merge 1 commit into base: master
3 changes: 2 additions & 1 deletion Dockerfile
@@ -63,7 +63,8 @@ RUN mkdir crds && \
     cp -r network-operator-chart/crds /workspace/crds/network-operator/ && \
     cp -r network-operator-chart/charts/sriov-network-operator/crds /workspace/crds/sriov-network-operator/ && \
     cp -r network-operator-chart/charts/node-feature-discovery/crds /workspace/crds/node-feature-discovery/ && \
-    cp -r network-operator-chart/charts/nic-configuration-operator-chart/crds /workspace/crds/nic-configuration-operator/
+    cp -r network-operator-chart/charts/nic-configuration-operator-chart/crds /workspace/crds/nic-configuration-operator/ && \
+    cp -r network-operator-chart/charts/maintenance-operator-chart/crds /workspace/crds/maintenance-operator/
 
 # Build
 ARG ARCH
4 changes: 4 additions & 0 deletions deployment/network-operator/Chart.yaml
@@ -24,3 +24,7 @@ dependencies:
     name: nic-configuration-operator-chart
     repository: ''
     version: 0.0.1
+  - condition: maintenanceOperator.enabled
+    name: maintenance-operator-chart
+    repository: ''
+    version: 0.1.1
Review comment (Member): the version should be "0.0.1".

@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
@@ -0,0 +1,6 @@
apiVersion: v2
name: maintenance-operator-chart
description: Maintenance Operator Helm Chart
type: application
version: 0.0.1
appVersion: "latest"
@@ -0,0 +1,33 @@
# maintenance-operator-chart

![Version: 0.0.1](https://img.shields.io/badge/Version-0.0.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: latest](https://img.shields.io/badge/AppVersion-latest-informational?style=flat-square)

Maintenance Operator Helm Chart

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| imagePullSecrets | list | `[]` | image pull secrets for the operator |
| metricsService | object | `{"ports":[{"name":"https","port":8443,"protocol":"TCP","targetPort":"https"}],"type":"ClusterIP"}` | metrics service configurations |
| operator.admissionController.certificates.certManager.enable | bool | `true` | use cert-manager for certificates |
| operator.admissionController.certificates.certManager.generateSelfSigned | bool | `true` | generate self-signed certificates with cert-manager |
| operator.admissionController.certificates.custom.enable | bool | `false` | enable custom certificates using secrets |
| operator.admissionController.certificates.secretNames.operator | string | `"operator-webhook-cert"` | secret name containing certificates for the operator admission controller |
| operator.admissionController.enable | bool | `true` | enable admission controller of the operator |
| operator.affinity | object | `{"nodeAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"preference":{"matchExpressions":[{"key":"node-role.kubernetes.io/master","operator":"Exists"}]},"weight":1},{"preference":{"matchExpressions":[{"key":"node-role.kubernetes.io/control-plane","operator":"Exists"}]},"weight":1}]}}` | node affinity for the operator |
| operator.image.imagePullPolicy | string | `nil` | image pull policy for the operator image |
| operator.image.repository | string | `"ghcr.io/mellanox/maintenance-operator"` | repository to use for the operator image |
| operator.image.tag | string | `nil` | image tag to use for the operator image |
| operator.nodeSelector | object | `{}` | node selector for the operator |
| operator.replicas | int | `1` | number of replicas for the operator deployment |
| operator.resources | object | `{"limits":{"cpu":"500m","memory":"128Mi"},"requests":{"cpu":"10m","memory":"64Mi"}}` | specify resource requests and limits for the operator |
| operator.serviceAccount.annotations | object | `{}` | set annotations for the operator service account |
| operator.tolerations | list | `[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master","operator":"Exists"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane","operator":"Exists"}]` | tolerations for the operator |
| operatorConfig | object | `{"logLevel":"info","maxNodeMaintenanceTimeSeconds":null,"maxParallelOperations":null,"maxUnavailable":null}` | operator configuration values. fields here correspond to fields in MaintenanceOperatorConfig CR |
| operatorConfig.logLevel | string | `"info"` | log level configuration |
| operatorConfig.maxNodeMaintenanceTimeSeconds | string | `nil` | max time for node maintenance |
| operatorConfig.maxParallelOperations | string | `nil` | max number of parallel operations |
| operatorConfig.maxUnavailable | string | `nil` | max number of unavailable nodes |
| webhookService | object | `{"ports":[{"port":443,"protocol":"TCP","targetPort":9443}],"type":"ClusterIP"}` | webhook service configurations |
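
As a rough illustration of how the values above could be set from the parent chart, here is a minimal values override sketch. It assumes the subchart is addressed by its dependency name (`maintenance-operator-chart`, since the dependency entry shows no alias) and that `maintenanceOperator.enabled` is the toggle named by the `condition` in the parent `Chart.yaml`. The image tag and the `operatorConfig` numbers are hypothetical and only demonstrate the accepted shapes.

```yaml
# Sketch of a values override for the parent chart (not taken from this PR).
maintenanceOperator:
  enabled: true              # toggle referenced by the dependency `condition`

maintenance-operator-chart:  # assumption: subchart values are keyed by the dependency name
  operator:
    replicas: 1
    image:
      repository: ghcr.io/mellanox/maintenance-operator
      tag: v0.0.1            # hypothetical tag; the chart default is unset
  operatorConfig:
    logLevel: debug          # the CRD allows debug, info, or error
    maxParallelOperations: 2 # absolute number of nodes
    maxUnavailable: "10%"    # percentage of total cluster nodes
    maxNodeMaintenanceTimeSeconds: 1200
```

Leaving a key under `operatorConfig` unset keeps the chart default of `null` listed in the table.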

@@ -0,0 +1,89 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.15.0
  name: maintenanceoperatorconfigs.maintenance.nvidia.com
spec:
  group: maintenance.nvidia.com
  names:
    kind: MaintenanceOperatorConfig
    listKind: MaintenanceOperatorConfigList
    plural: maintenanceoperatorconfigs
    singular: maintenanceoperatorconfig
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        description: MaintenanceOperatorConfig is the Schema for the maintenanceoperatorconfigs
          API
        properties:
          apiVersion:
            description: |-
              APIVersion defines the versioned schema of this representation of an object.
              Servers should convert recognized schemas to the latest internal value, and
              may reject unrecognized values.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
            type: string
          kind:
            description: |-
              Kind is a string value representing the REST resource this object represents.
              Servers may infer this from the endpoint the client submits requests to.
              Cannot be updated.
              In CamelCase.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
            type: string
          metadata:
            type: object
          spec:
            description: MaintenanceOperatorConfigSpec defines the desired state of
              MaintenanceOperatorConfig
            properties:
              logLevel:
                default: info
                description: LogLevel is the operator logging level
                enum:
                - debug
                - info
                - error
                type: string
              maxNodeMaintenanceTimeSeconds:
                default: 1600
                description: |-
                  MaxNodeMaintenanceTimeSeconds is the time from when a NodeMaintenance is marked as ready (phase: Ready)
                  until the NodeMaintenance is considered stale and removed by the operator.
                  It should be less than the idle time of any autoscaler that is running.
                  Defaults to 1600 seconds (about 26m40s; a full 30m would be 1800 seconds).
                format: int32
                minimum: 0
                type: integer
              maxParallelOperations:
                anyOf:
                - type: integer
                - type: string
                default: 1
                description: |-
                  MaxParallelOperations indicates the maximal number of nodes that can undergo maintenance
                  at a given time. 0 means no limit.
                  value can be an absolute number (ex: 5) or a percentage of total nodes in the cluster (ex: 10%).
                  absolute number is calculated from percentage by rounding up.
                  defaults to 1. The actual number of nodes that can undergo maintenance may be lower depending
                  on the value of MaintenanceOperatorConfigSpec.MaxUnavailable.
                x-kubernetes-int-or-string: true
              maxUnavailable:
                anyOf:
                - type: integer
                - type: string
                description: |-
                  MaxUnavailable is the maximum number of nodes that can become unavailable in the cluster.
                  value can be an absolute number (ex: 5) or a percentage of total nodes in the cluster (ex: 10%).
                  absolute number is calculated from percentage by rounding up.
                  by default, unset.
                  new nodes will not be processed if the number of unavailable nodes would exceed this value.
                x-kubernetes-int-or-string: true
            type: object
        type: object
    served: true
    storage: true
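
For reference, a custom resource accepted by this schema might look like the sketch below. The group, version, kind, and spec field names come from the CRD above; the resource name, namespace, and all spec values are hypothetical and chosen only to show the int-or-string fields and the round-up rule for percentages.

```yaml
# Sketch of a MaintenanceOperatorConfig resource; name and namespace are hypothetical.
apiVersion: maintenance.nvidia.com/v1alpha1
kind: MaintenanceOperatorConfig
metadata:
  name: default                # hypothetical name
  namespace: network-operator  # hypothetical namespace (the CRD is namespace-scoped)
spec:
  logLevel: debug              # one of: debug, info, error
  # int32 seconds, minimum 0; 1200 seconds is 20 minutes
  maxNodeMaintenanceTimeSeconds: 1200
  # absolute form: at most 2 nodes under maintenance at once (0 would mean no limit)
  maxParallelOperations: 2
  # percentage form: in a 25-node cluster, 10% is 2.5 nodes, which rounds up to 3
  maxUnavailable: "10%"
```

Quoting the percentage keeps it a YAML string, which matches the `type: string` branch of the `anyOf`, while a bare number matches the integer branch.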