So far we used helm and skaffold, but always with our own helm chart. In many cases you won't have to write one, though: a helm chart for the software you want to use is often already out there.
Often there are multiple helm charts available for the same software, and you will need to choose one. Choose wisely:
- Take one that is still maintained (recent releases, activity on GitHub).
- If possible, take one from the authors of the software. If not, take one from a reputable organisation.
- When taking a helm chart from GitHub, don't just take the master branch, but take a released version.
- Have a look at the open issues: the helm chart might have a security issue or might be incompatible with your version of kubernetes.
Helm charts can be published in git repos. To install them, clone the git repo, go to the right folder, look at the values you want to override and install away. This is the same process we used for our own helm chart.
There is actually another way to install helm charts: using helm repositories. A helm repository is just a collection of zipped-up helm charts somewhere on the web, together with one file that should always be present: index.yaml. This file lists all the helm charts in that repository.
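For example, you can fetch a repository's index.yaml directly and register the repository with your local helm client. The well-known Bitnami repository is used here purely as an example:

# a helm repository is just files served over HTTP; index.yaml lists its charts
curl -s https://charts.bitnami.com/bitnami/index.yaml | head -n 20
# register the repository locally, refresh its index and search it
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo bitnami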
Exercise: head over to https://github.com/Kapernikov/cvat-helm and try to install CVAT from the helm chart. Follow the instructions on the website.
Some small tasks:
- CVAT is a web application, just like the one from our previous exercise. A web application needs a public-facing URL, and since our cluster is running locally, we need to fake this using the /etc/hosts trick we did before. Can you set up CVAT in a way that you can access it from your browser?
- The CVAT helm chart has some configuration options. Can you change them after already having installed CVAT?
- We don't need CVAT anymore, can you cleanly uninstall it? (if you get stuck on these last two tasks, see the hint after this list)
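Hint: changing values on a release that is already installed, and removing it cleanly, generally look like this. The release name cvat and the values file are assumptions; adapt them to how you installed the chart:

helm upgrade cvat ./cvat-helm -f my-values.yaml   # re-render the chart with changed values
helm uninstall cvat                               # cleanly remove everything the release created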
Helm charts are nice, but they have some caveats:
- The helm installer runs once to install everything, and then it stops. If you want to change values after installation, you need to perform an upgrade.
- Not everything in kubernetes is modifiable. For instance, some storage providers do not support resizing of volumes, and some fields of pods/jobs/deployments are immutable after creation: to change them you have to recreate the resource. Helm has no solution for this.
So while helm is very simple, for complicated things you sometimes want more. And this "more" is called a kubernetes operator. Let's break down what an operator is:
The kubernetes API is extensible, and a custom resource definition (CRD) is the way to do this. Let's try to create a simple one. Suppose we want to create a new object type AppUser. An AppUser would have a lastName, a firstName and an emailAddress.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: appusers.kapernikov.com
spec:
  group: kapernikov.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                lastName:
                  type: string
                firstName:
                  type: string
                emailAddress:
                  type: string
  scope: Namespaced
  names:
    plural: appusers
    singular: appuser
    kind: AppUser
    shortNames:
      - au
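To try this out, save the CRD above to a file and apply it. The file name appuser-crd.yaml is just an example:

kubectl apply -f appuser-crd.yaml
# the new resource type is now known to the cluster
kubectl get crd appusers.kapernikov.com
kubectl api-resources | grep appuser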
Now that we have this definition, we can create an AppUser like so:
apiVersion: kapernikov.com/v1
kind: AppUser
metadata:
  name: john
spec:
  lastName: Doe
  firstName: John
  emailAddress: john.doe@example.com # placeholder address
When we create an AppUser, kubernetes reacts as we expect: it stores the AppUser (in etcd), and you can access it using the kubectl command or with k9s, just like you would any other kubernetes resource.
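For example, assuming the manifest above was saved as john.yaml:

kubectl apply -f john.yaml
kubectl get appusers
# the shortName defined in the CRD works too
kubectl get au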
But for the rest, kubernetes does exactly nothing. It doesn't create a real user in your application, it doesn't launch a job or anything else. It just sits there.
When creating a real application, we would probably want to do something useful with an AppUser, for instance react whenever a user is created, updated or deleted.
Kubernetes makes this easy by providing an API that exposes not only REST endpoints but also events. We can access this API using any programming language that has kubernetes bindings, or simply using kubectl:
kubectl get appuser --watch
The --watch flag will make kubectl wait for events and print them to the screen. Now imagine a program listening for AppUser events and doing something useful. This program would be called a controller.
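To make this concrete, here is a toy "controller" as a shell pipeline. It is only a sketch: it assumes jq is installed, and echoing a message is of course not a real reconciliation:

# print every AppUser event and pretend to react to it
kubectl get appusers --watch --output-watch-events -o json \
  | jq --unbuffered -r '"\(.type) \(.object.metadata.name)"' \
  | while read -r event name; do
      echo "would reconcile AppUser ${name} after ${event}"
    done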
And, unlike our kubectl example, this program could just run in a kubernetes pod (why not). Of course, this pod would need special permissions to access AppUser objects in kubernetes (by default a pod cannot).
Creating these permissions would involve creating a Role, a RoleBinding and a ServiceAccount, but we're not going to deal with this in detail here.
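For reference only, a minimal sketch of those three objects could look like this (all names are made up, and the Role only grants read access):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: appuser-controller
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: appuser-reader
rules:
  - apiGroups: ["kapernikov.com"]
    resources: ["appusers"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: appuser-controller-binding
subjects:
  - kind: ServiceAccount
    name: appuser-controller
roleRef:
  kind: Role
  name: appuser-reader
  apiGroup: rbac.authorization.k8s.io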
Controllers typically use a reconcile loop. The reconcile loop will compare the current state of the cluster with the desired state of the cluster, and will create, update or delete resources as needed. They will do this every time an event comes in (this event could be a change to a custom resource, but it could also be a change of state in the cluster, eg a pod terminating). While being very resilient, reconcile loops typically are not very efficient. This is one of the reasons why an idle kubernetes cluster already uses quite a lot of resources.
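Stripped to its essence, one iteration of such a reconcile loop could look like this sketch. The myapp command line tool that manages users inside our application is hypothetical:

# desired state: the AppUser objects stored in the cluster
desired=$(kubectl get appusers -o jsonpath='{.items[*].metadata.name}')
# actual state: the users that currently exist in our application
actual=$(myapp list-users)
# create what is missing
for u in $desired; do
  echo "$actual" | tr ' ' '\n' | grep -qx "$u" || myapp create-user "$u"
done
# remove what should no longer exist
for u in $actual; do
  echo "$desired" | tr ' ' '\n' | grep -qx "$u" || myapp delete-user "$u"
done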
Now that we have seen both controllers and custom resource definitions, we can dive into the operator pattern. An operator is basically:
- one or more custom resource definitions
- some piece of software that reacts to these custom resource definitions by creating, updating or deleting resources
- this software itself is containerized and deployable to kubernetes (installing an operator could be installing the helm chart of a certain operator)
You will find lots of operators, both commercial and open source.
Let's now experiment with the zalando postgres operator. It can be installed as a helm chart, and we already know how to do that. First we need to add the helm repository:
helm repo add zalando-pgo https://opensource.zalando.com/postgres-operator/charts/postgres-operator/
# list all available helm charts in this repo
helm search repo zalando-pgo
Now we can actually install it. Let's create a namespace zalando-pgo first:
kubectl create namespace zalando-pgo
helm install -n zalando-pgo postgres-operator zalando-pgo/postgres-operator
Now the operator is installed.
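You can verify that the operator pod is running, and that it registered its own custom resource definitions:

kubectl get pods -n zalando-pgo
kubectl get crd | grep zalan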
Now that the zalando postgres operator is running, we can use it to create a postgres cluster:
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: kapernikov-pg-cluster
spec:
teamId: "kapernikov"
volume:
size: 3Gi
numberOfInstances: 2
users:
admin:
- superuser
- createdb
testuser: []
databases:
ourdatabase: testuser
postgresql:
version: "14"
parameters:
shared_buffers: "32MB"
max_connections: "40"
archive_mode: "on"
archive_timeout: 1800s
archive_command: /bin/true
# log_statement: "all"
Let's take the above YAML, change shared_buffers to 40MB, and see what happens when we apply the update.
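Assuming you saved the manifest as pg-cluster.yaml, the workflow could look like this (the cluster-name label is the operator's default and may differ if the operator was configured otherwise):

kubectl apply -f pg-cluster.yaml
kubectl get pods -l cluster-name=kapernikov-pg-cluster --watch
# now edit shared_buffers to "40MB" in pg-cluster.yaml and re-apply
kubectl apply -f pg-cluster.yaml
# the operator notices the difference and reconciles the running cluster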