
Install the a8s Control Plane

Prerequisites

Configure the Backup Store

a8s supports taking backups of data service instances (DSIs). Currently, the backups are stored in an AWS S3 bucket, so before installing a8s you must create an AWS S3 bucket that a8s will use to store backups (see the official S3 documentation).
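
For example, with the AWS CLI a bucket can be created as follows (bucket name and region are placeholders; for us-east-1, omit the --create-bucket-configuration flag):

# create the S3 bucket that a8s will use for backups
aws s3api create-bucket \
    --bucket <bucket-name> \
    --region <region> \
    --create-bucket-configuration LocationConstraint=<region>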

Then, create a secret access key for the bucket. This is the key the a8s control plane will use to interact with the bucket.

After you've created the access key, you must place the information about the S3 bucket in a few files, as shown in the following commands. When you execute the commands to install a8s, the contents of these files will be used to populate the ConfigMaps and Secrets that the a8s control plane reads in order to upload backups to and download them from S3. To encrypt the backups you also have to configure an encryption password; you can do so by writing your desired encryption password to the deploy/a8s/backup-config/encryption-password file. You MUST use the file names shown in the subsequent commands.

# create file that stores the ID of the key
echo -n <bucket-access-key-id> > deploy/a8s/backup-config/access-key-id

# create file that stores the secret value of the key
echo -n <bucket-secret-access-key> > deploy/a8s/backup-config/secret-access-key

# create file that stores password for backup encryption
echo -n <encryption-password> > deploy/a8s/backup-config/encryption-password

# create file with other information about the bucket
cp deploy/a8s/backup-config/backup-store-config.yaml.template deploy/a8s/backup-config/backup-store-config.yaml 

Then, use an editor to open deploy/a8s/backup-config/backup-store-config.yaml and replace the value:

  • of the container field with the name of the S3 bucket
  • of the region field with the name of the region where the bucket is located
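
After editing, the two values in backup-store-config.yaml might look like this (illustrative values only; the surrounding structure comes from the shipped template and stays unchanged):

container: my-a8s-backups    # name of the S3 bucket created above
region: eu-central-1         # region the bucket was created in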

All the created files are gitignored so you don't have to worry about committing them by mistake (since they contain private data).

Configure Images

The images the framework uses to create the Data Service Instances can be configured. A ConfigMap with the default values is provided at deploy/a8s/manifests/postgresql-images.yaml. If you need to use different images, or want to mirror them to an internal registry, you can edit the config or override it via Kustomize (see the sketch below the table).

Currently the following images can be configured:

Key                Description
spiloImage         Image of spilo, which provides PostgreSQL and Patroni for HA
backupAgentImage   Image of the a9s backup agent, which performs logical backups and restores
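
If you prefer not to edit the shipped file, a small Kustomize overlay can patch the values instead. The following is a sketch only: the ConfigMap name postgresql-images and the overlay paths are assumptions, so check deploy/a8s/manifests/postgresql-images.yaml for the actual metadata before using it, and point the image values at your mirror:

# kustomization.yaml of a custom overlay (sketch; requires a reasonably recent kustomize)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - <path-to-this-repo>/deploy/a8s/manifests   # adjust to your checkout
patches:
  - target:
      kind: ConfigMap
      name: postgresql-images                  # assumed name, verify in postgresql-images.yaml
    patch: |-
      - op: replace
        path: /data/spiloImage
        value: registry.example.com/mirror/spilo:<tag>
      - op: replace
        path: /data/backupAgentImage
        value: registry.example.com/mirror/backup-agent:<tag>

You would then run kubectl apply --kustomize against this overlay directory instead of deploy/a8s/manifests.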

Please note: The images will change over time as we upgrade our framework components. If you have changed the defaults, your values should be updated whenever we update the images.

Also, your changes will be overwritten when deploying with the OLM and during updates. If you need to edit the ConfigMap, reapply it after you have deployed or updated the framework. In this case you might also want to disable automatic updates.

WARNING: Issues may occur when running spilo with the MobilityDB PostgreSQL extension on ARM-based systems. The latest spilo images natively support both amd64 and arm64 architectures, which is not the case for MobilityDB.

Install the a8s Control Plane

The a8s Control Plane can be deployed with the help of the static manifests you can find under /deploy/a8s/manifests or with the help of the Operator Lifecycle Manager (OLM).

While the manifest method is easy to use, it does not come with automatic updates or lifecycle management of the framework, so we encourage you to use the OLM.

Using Static Manifests

Install the cert-manager

The a8s framework relies on the cert-manager to generate TLS certificates, so you will first have to install it on your cluster.

Please check the cert-manager cloud compatibility page to ensure your Kubernetes cluster meets all the requirements to run the cert-manager.

In general, the cert-manager supports a multitude of installation options; for this guide we only describe how to set up a basic deployment, since that suffices for a8s. For a production-grade deployment, please consult the documentation.

To set up a basic deployment, use:

kubectl apply --kustomize deploy/cert-manager

This will install all the cert-manager components that a8s needs. Be aware that it might take some time for the components to get up and running (we've experienced roughly 80 seconds on a 3-node EKS cluster); if you install a8s before that has happened, things won't work.
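
One way to wait for this is to block until the cert-manager Deployments report as available (a sketch; it assumes the components run in the conventional cert-manager namespace, adjust if your setup uses a different one):

kubectl wait --namespace cert-manager \
    --for=condition=Available deployment --all \
    --timeout=180s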

Currently, the few parts of a8s that require TLS use self-signed certificates. If instead you want to set up a proper Certificate Authority, please check out the configuration pages, where you can find instructions on how to do that.

Install the Control Plane with Manifests

Just run:

kubectl apply --kustomize deploy/a8s/manifests

This command will create the Kubernetes resources that make up the a8s control plane in the correct order.

More precisely, it will:

  1. Create a namespace called a8s-system. The a8s framework components postgresql-controller-manager, a8s-backup-controller-manager and service-binding-controller-manager will run in the a8s-system namespace.

  2. Register multiple CustomResourceDefinitions (CRDs)

  3. Create three deployments, one for each a8s framework component

  4. Create multiple ClusterRoles and ClusterRoleBindings:

    • <component_name>-manager-role and <component_name>-manager-rolebinding:
      provide the a8s framework components with access to Kubernetes resources
    • <component_name>-metrics-reader:
      provides access to the metrics endpoint
    • <component_name>-proxy-role and <component_name>-proxy-rolebinding:
      used to authenticate and secure access to the metrics endpoint
    • postgresql-spilo-role:
      gives spilo the required permissions to access Kubernetes resources
  5. Create one Role and RoleBinding for each a8s framework component:

    • <component_name>-leader-election-role and <component_name>-leader-election-rolebinding:
      used for communication between multiple controllers of the same type.
      Note: The a8s framework is not yet HA-ready, so this Role is currently not actively used.
  6. Generate and apply multiple ConfigMaps (e.g. a8s-backup-store-config) and Secrets (e.g. a8s-backup-storage-credentials) that the a8s framework needs in order to function properly.
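
Once the command has gone through, you can, for example, verify that the CRDs were registered; they all live under anynines.com API groups:

kubectl get crd | grep anynines.com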

It might take some time for the a8s control plane to get up and running. To know when that happens, you can run the following command and wait until it shows that all deployments are ready (value 1/1 under the READY column):

watch kubectl get deployment --namespace a8s-system

The output of the command should be similar to:

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
a8s-backup-controller-manager        1/1     1            1           105s
service-binding-controller-manager   1/1     1            1           105s
postgresql-controller-manager        1/1     1            1           105s
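
Alternatively, you can block until all three deployments report as available, for example:

kubectl wait --namespace a8s-system \
    --for=condition=Available deployment --all \
    --timeout=300s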

Uninstall the Control Plane with Manifests

To uninstall the control plane, use:

kubectl delete --kustomize deploy/a8s/manifests

This will delete the Kubernetes resources that make up the a8s control plane and their associated CRDs.

To uninstall the cert-manager components, use:

kubectl delete --kustomize deploy/cert-manager

Using the OLM

Install the OLM

If you have the operator-sdk CLI already installed, you can use

operator-sdk olm install

to install the OLM components to your cluster. Alternatively, you can follow the official instructions.
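
You can check that the OLM components are up and running with:

operator-sdk olm status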

Install the Control Plane with OLM

To install the a8s control plane, use:

kubectl apply --kustomize deploy/a8s/olm

to apply all necessary OLM resources. In more detail, this will create:

  • the namespace a8s-system.
  • a CatalogSource referencing the a8s catalog which contains references to our operators
  • an OperatorGroup a8s-operators linked to the a8s-system namespace, which can be used to adjust general permissions for all operators in that group. You can find more information on that subject here.
  • a Subscription to the a8s postgresql-operator. A Subscription indicates your desire to have the operator installed on the cluster; the OLM will then fetch the bundle of the PostgreSQL operator and its dependencies, which include the a8s-backup-manager and a8s-service-binding-controller. These bundles contain the instructions for the OLM to create the same resources as explained in the manifest section.

Additionally, the kustomization creates the Secret and ConfigMap for the backup bucket configuration.
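
To follow the progress of the installation, you can, for example, watch the Subscriptions and ClusterServiceVersions until each CSV reports the Succeeded phase (the exact CSV names depend on the versions the catalog resolves):

kubectl get subscriptions,clusterserviceversions --namespace a8s-system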

Uninstalling the Control Plane with OLM

To uninstall the control plane, use:

kubectl delete --kustomize deploy/a8s/olm

This will delete the backup credentials, the subscriptions, and therefore also the control plane deployments; it does not delete the CRDs from the cluster. The OLM keeps the CRDs because deleting them would also delete the CRs, and with them all instances as well as all backup objects. The OLM documentation therefore states that such a step should only be taken deliberately by a user. You can delete the CRDs using:

kubectl delete crd recoveries.backups.anynines.com\
    backups.backups.anynines.com\
    postgresqls.postgresql.anynines.com\
    servicebindings.servicebindings.anynines.com

(Optional) Install the Logging Infrastructure

This repo also comes with YAML manifests that you can use to optionally install components to collect and visualize logs of the provisioned data service instances.

More precisely, these are:

  1. A Fluent Bit daemonset where each node-local daemon collects the logs of the Pods on its node.
  2. A FluentD aggregator that collects and aggregates the logs from the Fluent Bit daemonset.
  3. OpenSearch and OpenSearch Dashboards to query and visualize the logs.

To install them, simply run:

kubectl apply --kustomize deploy/logging

To wait for all components to be up and running, run:

watch kubectl get pod --namespace a8s-system --selector=a8s.anynines/logging

and wait until you see that all the pods (3 + the number of worker nodes of your Kubernetes cluster) are running (value 1/1 under the READY column):

NAME                                         READY   STATUS    RESTARTS   AGE
a8s-fluentd-aggregator-0                     1/1     Running   0          6m20s
a8s-opensearch-cluster-0                     1/1     Running   0          6m20s
a8s-opensearch-dashboards-648cb7d4f4-6xmq8   1/1     Running   0          6m20s
fluent-bit-jqfgl                             1/1     Running   0          6m20s
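
To take a look at the logs, you can, for example, port-forward OpenSearch Dashboards to your machine (a sketch; it assumes a Deployment named a8s-opensearch-dashboards, in line with the Pod name above, and the default Dashboards port 5601):

kubectl port-forward --namespace a8s-system deployment/a8s-opensearch-dashboards 5601:5601

Then open http://localhost:5601 in your browser.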

Virtual Memory Usage

OpenSearch relies heavily on virtual memory (i.e. mmap). When applying the logging framework, you might have to adjust the mmap limit on your nodes; otherwise the OpenSearch pods will fail with the error message:

ERROR: [1] bootstrap checks failed [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

If you are running the framework on something like minikube or kind using Docker, this does not apply. Otherwise, you can find out more on how to adjust the virtual memory in the OpenSearch documentation on this topic.
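
On nodes where you do have privileged access, the limit can usually be raised with sysctl; how to persist the setting depends on your node image, so treat the following as a sketch:

# raise the limit immediately (does not survive a reboot)
sudo sysctl -w vm.max_map_count=262144

# persist the setting across reboots (path may differ per distribution)
echo "vm.max_map_count=262144" | sudo tee /etc/sysctl.d/99-opensearch.conf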

Disabling Virtual Memory Usage

Important Note: The documentation explicitly warns you not to use this setup for production-grade workloads.

If you just want to experiment with the framework, or you do not have the privileged access to the nodes that is needed to adjust the virtual memory, you can disable mmap usage in the OpenSearch configuration. To do so, set the allow_mmap flag in the OpenSearch configuration, located in deploy/logging/dashboard/config/opensearch.yaml, to false by appending:

node:
  store:
    allow_mmap: false

After applying the change, restart the a8s-opensearch-cluster-0 pod using

kubectl delete pod a8s-opensearch-cluster-0 -n a8s-system

OpenSearch should then work without issues.

Uninstall the Logging Infrastructure

Run:

kubectl delete --kustomize deploy/logging

(Optional) Install the Metrics Infrastructure

Just as for logging, this repo includes YAML manifests that can be used to optionally install components to collect and visualize metrics of the provisioned data service instances. As of now we do not isolate tenants, which means that everyone with access to the Prometheus instance can see both the system metrics of the Kubernetes control plane and all metrics that are scraped from data service instances.

These include:

  1. A cluster-level Prometheus deployment that scrapes both the Kubernetes system metrics and the data service instances.
  2. A Grafana dashboard to query and visualize the metrics.

To install them, simply run:

kubectl apply --recursive --filename deploy/metrics/

To wait for all components to be up and running, run:

watch kubectl get pod --namespace a8s-system --selector=a8s.anynines/metrics

and wait until you see that the Prometheus and Grafana Pods are running (value 1/1 under the READY column):

NAME                                    READY   STATUS    RESTARTS   AGE
grafana-64c89f57f7-v7wlp                1/1     Running   0          111s
prometheus-deployment-87cc8fb88-225v4   1/1     Running   0          111s
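
To open the dashboards, you can port-forward Grafana (and, if needed, Prometheus) to your machine. This is a sketch that assumes Deployments named grafana and prometheus-deployment, in line with the Pod names above, and the components' default ports (3000 for Grafana, 9090 for Prometheus):

kubectl port-forward --namespace a8s-system deployment/grafana 3000:3000
kubectl port-forward --namespace a8s-system deployment/prometheus-deployment 9090:9090

Grafana is then reachable at http://localhost:3000.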

Uninstall the Metrics Infrastructure

Run:

kubectl delete --recursive --filename deploy/metrics/