Implement per node Affinity & Tolerations #102
base: main
Conversation
@glebiller Thank you for the PR. I will review it this weekend and let you know on Monday morning. Thanks for your contribution.
@glebiller apologies that I could not reply earlier. I understand that you would like the option to apply specific Affinity & Tolerations rules to specific StatefulSets.

Kubegres can replace a Primary Postgres StatefulSet with a Replica StatefulSet, and it can also remove a Replica Postgres StatefulSet identified as unavailable, replacing it with a new Replica StatefulSet. This failover behaviour results in the last instance index being incremented when names are assigned to newly created StatefulSets.

For example, let's say we have a cluster of Postgres with 3 StatefulSets named: mypostgres-1, mypostgres-2 and mypostgres-3. A name "[postgres clustername]-[integer]" has the instance index [integer]; in mypostgres-1 the instance index is 1. If the Primary StatefulSet mypostgres-1 becomes unavailable, it is replaced by the Replica StatefulSet mypostgres-2, which is promoted as the new Primary, and a new Replica StatefulSet is created. After the failover, the cluster will contain: mypostgres-2, mypostgres-3 and mypostgres-4. The instance index 1 no longer exists once the failover is completed.

If a Replica StatefulSet is unavailable, let's say mypostgres-3, then it is replaced by a new Replica StatefulSet. After the failover, the cluster will contain: mypostgres-2, mypostgres-4 and mypostgres-5. The instance index 3 no longer exists once the failover is completed.

Correct me if I am wrong, it seems like the array
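For reference, the first failover scenario described above, laid out step by step (this restates the comment's own example, nothing new):

```
Before failover:  mypostgres-1 (Primary), mypostgres-2 (Replica), mypostgres-3 (Replica)
mypostgres-1 becomes unavailable:
  - mypostgres-2 is promoted to Primary
  - a new Replica, mypostgres-4, is created
After failover:   mypostgres-2 (Primary), mypostgres-3 (Replica), mypostgres-4 (Replica)
```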
@alex-arica No worries about the delay :) I understood the behavior you described before, and the Failover unit test was working because it was re-creating the missing instance instead of incrementing the instance index to create a new StatefulSet. I just committed the second part of the changes, which does the following:

That change should also help with #88, since it will keep the names of the StatefulSets organized, in addition to making it possible to change the Tolerations & Affinities of each node separately.
Thank you for the update @glebiller. I will review it as soon as I can and let you know.
Use the Kubernetes recommended label "app.kubernetes.io/instance" to store the StatefulSet instance. Replace the int32 index variable with the name of the StatefulSet instance defined in `nodeSets`.
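As a rough illustration of that commit message: each StatefulSet would carry the recommended label, with the instance name from `nodeSets` as its value. The StatefulSet name and the instance value `node-a` below are invented placeholders, not taken from the PR:

```yaml
# Hypothetical sketch: a StatefulSet labelled with its nodeSets entry name.
# The label key comes from the commit message; "node-a" is an assumed
# placeholder for the instance name defined in spec.nodeSets.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mypostgres-node-a            # assumed naming, for illustration only
  labels:
    app.kubernetes.io/instance: node-a
```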
Hi, will this be merged somewhere in the future?
Currently, the Affinity & Tolerations are set uniformly for all the StatefulSets. This prevents forcing StatefulSets onto a particular region or a particular node.

This PR adds a new field `spec.nodeSets` that is mutually exclusive with `spec.replicas`. This new field allows setting an Affinity or Tolerations for a particular node while keeping a default in `spec.scheduler`.

Added the required tests to cover the use cases.
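For illustration, a rough sketch of what the proposed spec might look like. The field names `spec.nodeSets` and `spec.scheduler` come from this PR; the shape of a `nodeSets` entry, the node names, and the scheduling values are assumptions, not taken from the code:

```yaml
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: mypostgres
spec:
  # Default scheduling rules, applied when a nodeSet does not override them.
  scheduler:
    tolerations:
      - key: "node-role"            # placeholder toleration, not from the PR
        operator: "Equal"
        value: "postgres"
        effect: "NoSchedule"
  # Proposed per-node configuration; mutually exclusive with spec.replicas.
  nodeSets:
    - name: node-a                  # assumed entry shape: name + overrides
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values: ["zone-a"]
    - name: node-b                  # pins this instance to a different zone
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values: ["zone-b"]
```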
NB: I am considering simplifying the logic so that the field `replicas` will generate a number of empty nodes in `nodeSets` to match its value. This will allow simplifying the reconciliation.
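A sketch of the equivalence that simplification would imply; the auto-generated instance names are a guess, not from the PR:

```yaml
# Hypothetical: with the simplification, a spec using replicas...
spec:
  replicas: 3
---
# ...would be treated internally as three empty nodeSets entries:
spec:
  nodeSets:
    - name: "1"   # assumed auto-generated instance names
      # no affinity/tolerations overrides: defaults from spec.scheduler apply
    - name: "2"
    - name: "3"
```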