feat(anynines-klutch): initial commit #1070

Draft · wants to merge 3 commits into base: main
25 changes: 25 additions & 0 deletions charts/anynines-klutch/.helmignore
@@ -0,0 +1,25 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
README.md.gotmpl
.helmignore
6 changes: 6 additions & 0 deletions charts/anynines-klutch/Chart.lock
@@ -0,0 +1,6 @@
dependencies:
- name: common
  repository: oci://ghcr.io/teutonet/teutonet-helm-charts
  version: 1.2.0
digest: sha256:62ef92fb03b60b1bf481b96b8b856f3b3156c10cc50a50e3604c8b679ef71497
generated: "2024-07-29T11:41:05.632726065+02:00"
20 changes: 20 additions & 0 deletions charts/anynines-klutch/Chart.yaml
@@ -0,0 +1,20 @@
apiVersion: v2
name: anynines-klutch
type: application
version: 0.1.0
icon: https://docs.k8s.anynines.com/img/favicon.ico
maintainers:
- name: cwrau
  email: [email protected]
- name: marvinWolff
  email: [email protected]
- name: tasches
  email: [email protected]
sources:
- https://docs.k8s.anynines.com/docs/platform-operator/central-management-cluster-setup
home: https://teuto.net
description: Installs the anynines klutch platform
dependencies:
- name: common
  version: 1.2.0
  repository: oci://ghcr.io/teutonet/teutonet-helm-charts
313 changes: 313 additions & 0 deletions charts/anynines-klutch/README.md.gotmpl
@@ -0,0 +1,313 @@
[modeline]: # ( vim: set ft=markdown: )
{{ template "chart.header" . }}
{{ template "chart.deprecationWarning" . }}

{{ template "chart.badgesSection" . }}

{{ template "chart.description" . }}

{{ template "chart.homepageLine" . }}

{{ template "chart.maintainersSection" . }}

## Cluster bootstrap

```sh
# always be git 😁
git init

# create empty cluster HelmRelease;
flux create helmrelease --export base-cluster -n flux-system --source HelmRepository/teuto-net.flux-system --chart base-cluster --chart-version 5.x.x > cluster.yaml

# maybe use the following name for your cluster;
kubectl get node -o jsonpath='{.items[0].metadata.annotations.cluster\.x-k8s\.io/cluster-name}'

# configure according to your needs, at least `.global.clusterName` is needed
# additionally, you should add your git repo to `.flux.gitRepositories`, see [the documentation](https://github.com/teutonet/teutonet-helm-charts/tree/main/charts/base-cluster#81--property-base-cluster-configuration--flux--gitrepositories)
# make sure to use the correct url format, see [the documentation](https://github.com/teutonet/teutonet-helm-charts/tree/main/charts/base-cluster#81112-property-base-cluster-configuration--flux--gitrepositories--additionalproperties--allof--item-0--oneof--item-1)
vi cluster.yaml

# create HelmRelease for flux to manage itself
kubectl create namespace flux-system --dry-run=client -o yaml > flux.yaml
flux create source helm --url https://fluxcd-community.github.io/helm-charts flux -n flux-system --export >> flux.yaml
flux create helmrelease --export flux -n flux-system --source HelmRepository/flux.flux-system --chart flux2 --chart-version 2.x.x >> flux.yaml

# add, commit and push resources
git add cluster.yaml flux.yaml
git commit cluster.yaml flux.yaml
git push

# after this you should be on the KUBECONFIG for the cluster
# we explicitly do not use `flux bootstrap` or `flux install` as this creates kustomization stuff and installs flux manually
kubectl apply --server-side -f flux.yaml # ignore the errors about missing CRDs
helm install -n flux-system flux flux2 --repo https://fluxcd-community.github.io/helm-charts --version 2.x.x --atomic

# manual initial installation of the chart, afterwards the chart takes over
# after the installation finished, follow the on-screen instructions to configure your flux, distribute KUBECONFIGs, ...
helm install -n flux-system base-cluster oci://ghcr.io/teutonet/teutonet-helm-charts/base-cluster --version 4.x.x --atomic --values <(cat cluster.yaml | yq -y .spec.values)

# you can use this command to get the instructions again
# e.g. when adding users, gitRepositories, ...
helm -n flux-system get notes base-cluster
```

> ⚠️ Due to various reasons, it's not possible to cleanly uninstall this
via a normal `kubectl delete`, `helm uninstall` or via flux deletion.
[See the corresponding issue](https://github.com/teutonet/teutonet-helm-charts/issues/28)

## Cluster components

### Component [backup](#backup)

[velero](https://velero.io) takes care of backing up your PVCs.

### Component [cert-manager](#certManager)

[cert-manager](https://cert-manager.io) takes care of creating SSL certificates
for your Ingresses (and [other needs](https://cert-manager.io/docs/usage))

1. set `.certManager.email` to your email for the Let's Encrypt account to enable
certificates

To create wildcard certificates, you need to enable a [DNS Provider](#component-dns)

Then you can just create a [`Certificate`](https://cert-manager.io/docs/usage/certificate)
resource.
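
A minimal sketch (names, namespace and domain are placeholders; the issuer name
matches the annotations listed in the [TLS](#tls) section below):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example # placeholder name
  namespace: default
spec:
  secretName: wildcard-example-tls
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-production # issuer name taken from the TLS section below
  dnsNames:
    - "*.example.com" # wildcard names need an enabled DNS provider, see below
```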

### Component [descheduler](#descheduler)

The [descheduler](https://github.com/kubernetes-sigs/descheduler) runs periodically
and tries to average the load across the nodes by deleting pods on fuller nodes
so the kube-scheduler can, hopefully, schedule them on nodes with more space.

Additionally, the descheduler also tries to reconcile `topologySpreadConstraints`
and affinities.

If the cluster is _semi_ underspecced or the individual applications have imperfect
resource requests, the descheduler might lead to periodic restarts of random pods.

In that case you should disable the descheduler.

### Component [dns](#dns)

The [external-dns](https://github.com/kubernetes-sigs/external-dns) creates, updates,
deletes and syncs DNS records for your Ingresses.

1. set `.dns.provider.<provider>` to your implementation:
   - cloudflare: `.dns.provider.cloudflare.apiToken`
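
A minimal values sketch for cloudflare (the token is a placeholder; ideally manage
it via a secret mechanism such as SOPS):

```yaml
dns:
  provider:
    cloudflare:
      apiToken: <cloudflare-api-token> # placeholder, use a real token
```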

If you need a different provider than cloudflare, please open a ticket for one of
the [supported ones](https://github.com/kubernetes-sigs/external-dns#status-of-providers)
which is also supported by [cert-manager](https://cert-manager.io/docs/configuration/acme/dns01/#supported-dns01-providers)

### Component [ingress](#ingress)

The included [`nginx` ingress-controller](https://docs.nginx.com/nginx-ingress-controller)
only works for Ingresses with `ingressClassName: nginx`.

#### TLS

1. add `kubernetes.io/tls-acme: "true"` to your Ingress's annotations (see the
   sketch below)
   - additionally, although not advised unless you know what you're doing,
     you can explicitly choose the issuer by using one of these annotations:
     - `cert-manager.io/cluster-issuer: letsencrypt-staging`
     - `cert-manager.io/cluster-issuer: letsencrypt-production`
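
A minimal sketch of an Ingress using this annotation (names and the host are
placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app # placeholder
  annotations:
    kubernetes.io/tls-acme: "true"
    # optional, see above:
    # cert-manager.io/cluster-issuer: letsencrypt-staging
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com # placeholder
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app # placeholder
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: my-app-tls # cert-manager stores the certificate here
```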

#### IP Address

If you want to make sure that, in the event of a catastrophic failure, you keep the
same IP address, you should roll this out, get the assigned IP
(`kubectl -n ingress-nginx get svc ingress-nginx-controller -o jsonpath='{.status.loadBalancer.ingress}'`)
and set `.ingress.IP=<ip>` in the values. This makes sure the IP is kept in your
project (may incur cost!), which means you can reuse it later or after recovery.
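
In values form (the address is a placeholder for the one returned by the command
above):

```yaml
ingress:
  IP: 198.51.100.10 # placeholder, use the assigned address from the command above
```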

### Component [flux](#flux)

[Flux](https://fluxcd.io) is used to deploy resources to your cluster and to
keep them in sync.

Flux can also auto-update images and HelmReleases.

You can create any number of gitRepository connections, with SSH (recommended)
or HTTPS checkout, with or without SOPS, and so on.
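
A hypothetical sketch of such a values entry; the repository name is a placeholder
and the exact keys should be checked against the schema documentation linked in
the [bootstrap section](#cluster-bootstrap):

```yaml
flux:
  gitRepositories:
    cluster-config: # hypothetical repository name
      url: ssh://git@github.com/example/cluster-config.git # see the linked docs for the expected URL format
```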

### Component [kyverno](#kyverno)

You can optionally enable [kyverno](https://kyverno.io), which is a policy
system, allowing you to specify in-depth policies to prevent or force certain
things in your cluster.

### Component [monitoring](#monitoring)

#### Sub-Component [prometheus](#monitoring_prometheus)

[Prometheus](https://prometheus.io) takes care of scraping metrics and alerting.

#### Sub-Component [grafana](#monitoring_grafana)

[Grafana](https://grafana.com) is used to create dashboards to visualize your
metrics and the health of your cluster and applications.

#### Sub-Component [loki](#monitoring_loki)

[Loki](https://grafana.com/oss/loki) collects logs from across the cluster and
gives you a centralized, non-CLI view of them, as well as the ability to create
alerts based on them.

#### Sub-Component [metrics-server](#monitoring_metricsServer)

[Metrics Server](https://github.com/kubernetes-sigs/metrics-server) implements
the [kubernetes Metrics API](https://github.com/kubernetes/metrics) to allow
for [Horizontal Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale),
[Vertical Autoscaling](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler)
and to allow `kubectl top` and tools like [k9s](https://k9scli.io) to show
resource usage for your pods and nodes.

#### Sub-Component [securityScanning](#monitoring_securityScanning)

The included [trivy](https://aquasecurity.github.io/trivy-operator) scans the
running workload in your cluster for CVEs and creates
[Custom Resources](https://aquasecurity.github.io/trivy-operator/v0.12.1/docs/crds)
to present the results.

#### Sub-Component [tracing](#monitoring_tracing)

The included [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
collects traces via otlp-grpc on every node via the `open-telemetry-collector-opentelemetry-collector.monitoring` service.
These traces are then sent to [Grafana Tempo](https://grafana.com/oss/tempo/),
which is included as a datasource in Grafana by default.

##### Usage Example

In your deployment/statefulset/daemonset/... add the following config:

```yaml
spec:
  template:
    spec:
      containers:
        - env:
            - name: OTEL_HOST # change this to your framework's environment variable
              value: open-telemetry-collector-opentelemetry-collector.monitoring
            - name: OTEL_PORT
              value: "4317"
```

The supported protocols are:

- jaeger
  - grpc: 14250
  - thrift_http: 14268
  - thrift_compact: 6831
- otlp
  - grpc: 4317
  - http: 4318
- zipkin: 9411

### Component [storage](#storage)

The included [NFS Ganesha server and external provisioner](https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner)
provides rudimentary support for RWX (`ReadWriteMany`) volumes if needed.

> ⚠️ This is _not_ highly available, and the software itself _does not_ support
it. You should only use this if there is no other choice, and make sure your
cloud provider knows about this, because a node rotation _will_ result in
downtime for all attached applications!
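
If you do need such a volume, a sketch of an RWX PVC (the `storageClassName` is an
assumption, check `kubectl get storageclass` for the class actually provided here):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data # placeholder name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs # assumption, use the class provided by this component
  resources:
    requests:
      storage: 10Gi
```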

### Component [rbac](#rbac)

This chart gives you the ability to create serviceAccounts, roles, roleBindings,
[namespaces](#miscellaneous) and KUBECONFIG files with a hopefully
easy-to-understand DSL.

After configuring your stuff, you can fetch the KUBECONFIGs with the help of the
output of `helm -n flux-system get notes base-cluster`.

### Miscellaneous

- You can create [`HelmRepository`s](#global); `.global.helmRepositories.<name>.url=<url>`
- You can create [cluster-wide certificates](#global_certificates); `.global.certificates.<name>.dnsNames=[<domain>]`
- You can create [namespaces](#global_namespaces); `.global.namespaces.<name>={}`
- You can create [cluster-wide imageCredentials](#global); `.global.imageCredentials.<name>.{host,username,password}`
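
Put together, a sketch of these values (all names, domains and credentials are
placeholders):

```yaml
global:
  helmRepositories:
    my-repo:
      url: https://charts.example.com
  certificates:
    wildcard:
      dnsNames:
        - "*.example.com"
  namespaces:
    my-namespace: {}
  imageCredentials:
    my-registry:
      host: registry.example.com
      username: puller
      password: <password>
```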

{{ template "chart.sourcesSection" . }}

{{ template "chart.requirementsSection" . }}

This helm chart requires [flux v2 to be installed](https://fluxcd.io/docs/installation),
see [bootstrap](#cluster-bootstrap)

The various components are automatically updated to the latest minor and patch version.

This excludes:

- descheduler, as its version is bound to the k8s version and they have not
  released 1.0.0 yet

## Migration

### 0.x.x -> 1.0.0

- The field `.dns.email` moves to `.certManager.email`.
- The field `.dns.provider.cloudflare.email` is removed, as only `apiToken`s are
supported anyways.
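
In values form, a sketch of the change (email and token are placeholders):

```yaml
# before (0.x.x)
dns:
  email: admin@example.com
  provider:
    cloudflare:
      email: admin@example.com # removed in 1.0.0
      apiToken: <token>

# after (1.0.0)
certManager:
  email: admin@example.com
dns:
  provider:
    cloudflare:
      apiToken: <token>
```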

### 1.x.x -> 2.0.0

⚠️ Skip this migration!

- Flux is now a direct dependency
- You should add the following labels and annotations to all resources of flux (see
  the sketch after this list):
  - `.metadata.labels["app.kubernetes.io/managed-by"]="Helm"`
  - `.metadata.annotations["meta.helm.sh/release-name"]="base-cluster"`
  - `.metadata.annotations["meta.helm.sh/release-namespace"]="flux-system"`
- If you have problems when applying / `helm upgrade`ing the new CRDs, like
  `cannot patch "alerts.notification.toolkit.fluxcd.io" with kind
  CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io
  "alerts.notification.toolkit.fluxcd.io" is invalid: status.storedVersions[1]:
  Invalid value: "v1beta2": must appear in spec.versions`, you can replace
  those CRDs (`kubectl replace --force -f -`).
  - ⚠️ make sure to only replace CRDs you're not actively using! This is
    a destructive operation. If all your resources are in flux, you can also
    try to turn off flux before the replacement; flux _should_ then resync and
    reconcile all resources.
- remove your manually managed flux resources
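
A sketch of how those labels/annotations could be applied, assuming the flux
resources live in `flux-system` (resource kinds and names depend on your
installation):

```sh
# example for one flux resource, repeat for all flux-managed resources
kubectl -n flux-system label deployment source-controller \
  app.kubernetes.io/managed-by=Helm
kubectl -n flux-system annotate deployment source-controller \
  meta.helm.sh/release-name=base-cluster \
  meta.helm.sh/release-namespace=flux-system
```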

### 2.x.x -> 3.0.0

- Flux is removed as a direct dependency

The flux chart is way too unstable, cannot be used for an installation, ...

We are sorry 😥

You're gonna have to install flux yourself again

### 3.x.x -> 4.0.0

The storageClasses are going to be removed from this chart; as preparation, they
are left in the cluster on upgrade.

The new [t8s-cluster](../t8s-cluster) is going to provide these, so the end user
can ignore this change.

### 4.x.x -> 5.0.0

The condition under which velero gets deployed has changed: velero will not be
deployed if you have not configured its backupstoragelocation. This change is
necessary because, in the current version of velero, this value is mandatory.
Please move your existing backupstoragelocation configuration to the base-cluster
chart if you haven't already.

### 5.x.x -> 6.0.0

The kyverno 2.x.x -> 3.x.x upgrade cannot be done without manual intervention, see
https://artifacthub.io/packages/helm/kyverno/kyverno#option-1---uninstallation-and-reinstallation

So you have to back up your resources and delete the kyverno HelmReleases before the
upgrade; they will be recreated in version 6.
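
A sketch of that procedure (the HelmRelease name and namespace are assumptions,
check your cluster for the actual ones):

```sh
# back up your kyverno policies
kubectl get clusterpolicies -o yaml > kyverno-policies-backup.yaml

# delete the kyverno HelmRelease before upgrading, it gets recreated in version 6
flux delete helmrelease -n flux-system kyverno
```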

This also makes kyverno HA, so be aware that kyverno will need more resources in
your cluster.

{{ .Files.Get "values.md" }}
4 changes: 4 additions & 0 deletions charts/anynines-klutch/ci/backupmanager-values.yaml
@@ -0,0 +1,4 @@
backupManager:
  url: https://sb.test.com:3000
  username: admin
  password: admin
20 changes: 20 additions & 0 deletions charts/anynines-klutch/ci/basic-values.yaml
@@ -0,0 +1,20 @@
dataservices:
  postgresql:
    url: https://pg.sb.a9s.cwrau.wtf
    username: admin
    password: fvGeQ8AIclWj1tN8ViiHghtLROQX9c

backupManager:
  url: https://backups.a9s.cwrau.wtf
  username: admin
  password: FkVFcrX8Fow4ELuwkVhjeURKeocX8I

oidc:
  ingress:
    host: oidc.a9s.cwrau.wtf

ingress:
  host: klutch.a9s.cwrau.wtf

kubernetes:
  externalAddress: https://212.15.214.137:6443