
Cassandra cannot become ready after configuration change #695

Open
kos-team opened this issue Aug 29, 2024 · 1 comment
Labels
bug Something isn't working

Comments


kos-team commented Aug 29, 2024

What happened?

We tried to change the configuration of an existing Cassandra cluster by changing cassandra-yaml.num_tokens from 16 to 8.
The operator proceeds to update the StatefulSets of the racks, changing the arguments of the Cassandra Pods.
However, the restarted Pods are stuck in an Unready state: the readiness probe keeps returning 500 errors.

What did you expect to happen?

The operator should be able to apply the Cassandra configuration change correctly.
A plain restart of the Pod is not the right procedure for changing num_tokens in Cassandra. To change num_tokens properly, the operator needs to decommission the node and then let it rejoin the cluster with the updated num_tokens configuration.
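For reference, the manual procedure we have in mind looks roughly like this (a sketch only; it requires a running cluster, and the pod name assumes cass-operator's default `<clusterName>-<dcName>-<rackName>-sts-N` naming, which may differ):

```shell
# 1. Stream this node's token ranges to the rest of the cluster and remove it
#    from the ring.
kubectl exec development-test-cluster-rack1-sts-0 -c cassandra -- \
  nodetool decommission

# 2. Apply the new num_tokens (in cass-operator's case, via the
#    CassandraDatacenter spec), clear the node's old data directory so it can
#    bootstrap fresh, and let the pod restart and rejoin the ring with the
#    new token count.
```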

How can we reproduce it (as minimally and precisely as possible)?

  1. Deploy the cass-operator
  2. Deploy CassandraDB with the following CR yaml
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: test-cluster
spec:
  clusterName: development
  serverType: cassandra
  serverVersion: "4.1.2"
  managementApiAuth:
    insecure: {}
  size: 3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
  racks:
    - name: rack1
  config:
    jvm-server-options:
      initial_heap_size: "1G"
      max_heap_size: "1G"
    cassandra-yaml:
      num_tokens: 16
      authenticator: PasswordAuthenticator
      authorizer: CassandraAuthorizer
      role_manager: CassandraRoleManager
  3. Change the cassandra-yaml.num_tokens from 16 to 8 in config:
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: test-cluster
spec:
  clusterName: development
  serverType: cassandra
  serverVersion: "4.1.2"
  managementApiAuth:
    insecure: {}
  size: 3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
  racks:
    - name: rack1
  config:
    jvm-server-options:
      initial_heap_size: "1G"
      max_heap_size: "1G"
    cassandra-yaml:
      num_tokens: 8
      authenticator: PasswordAuthenticator
      authorizer: CassandraAuthorizer
      role_manager: CassandraRoleManager
  users:
  - secretName: demo-secret
    superuser: true
  4. Observe that the restarted Pods keep returning 500 errors to the readiness probe
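The stuck state can be observed like this (a sketch; pod names assume cass-operator's default naming for the CR above and may differ):

```shell
# The pods restart after the spec change but never report Ready.
kubectl get pods -l cassandra.datastax.com/datacenter=test-cluster

# kubelet records the repeated readiness probe failures (HTTP 500) as events.
kubectl describe pod development-test-cluster-rack1-sts-0 | grep -i unhealthy
```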

cass-operator version

1.22.0

Kubernetes version

1.29.1

Method of installation

Helm

Anything else we need to know?

No response

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: CASS-2

@kos-team kos-team added the bug Something isn't working label Aug 29, 2024
burmanm (Contributor) commented Aug 30, 2024

Modifying num_tokens is not allowed by Cassandra itself on an existing node, so this is not a cass-operator limitation.

The proper way to change num_tokens would be to create a separate datacenter, since running different token counts on different nodes is not really recommended.

Automated decommissioning on configuration change is not currently on our radar as a feature, as it could cause unintended data loss and availability issues.
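For illustration, the separate-datacenter approach could be sketched as below (hypothetical names and a minimal spec; this is not a verified migration plan, and a real migration would also involve rebuilding the new DC from the old one and decommissioning the old DC afterwards):

```shell
# Add a second datacenter to the same logical Cassandra cluster with the new
# token count, then migrate traffic and decommission the old datacenter.
cat <<'EOF' | kubectl apply -f -
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: test-cluster-dc2        # hypothetical second DC
spec:
  clusterName: development      # same logical cluster as test-cluster
  serverType: cassandra
  serverVersion: "4.1.2"
  size: 3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
  config:
    cassandra-yaml:
      num_tokens: 8             # the new token count lives only in the new DC
EOF
```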
