MongoDB replicaset with PS or PSS setup could not have arbiter added due to crashloop for goal state #1615

Open
KarooolisZi opened this issue Sep 9, 2024 · 10 comments

KarooolisZi commented Sep 9, 2024

@nammn Hello, I am not able to add an arbiter to a 2-member replica set (the same happens with a 3-member replica set). After the change, the arbiter pod is created, but the operator logs an error about pods not reaching the goal state. After some time, the replica set members go down because their readiness probes fail.
What did you do to encounter the bug?
Steps to reproduce the behavior:

  1. Changed spec.arbiters: 0 to spec.arbiters: 1 in my MongoDBCommunity CR manifest database.yaml (see the sketch after this list).
  2. Ran kubectl apply -f database.yaml.
  3. Checked the MongoDB community operator logs and observed debug messages stating that none of the pods had reached the goal state.
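
For reference, the same change can be applied as a one-line patch instead of editing the manifest; the resource name and namespace below match the dumps later in this issue, so adjust them to your environment:

# equivalent to changing spec.arbiters in database.yaml and re-applying it
kubectl patch mdbc mongodb -n mongodb-surplus --type merge -p '{"spec":{"arbiters":1}}'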

What did you expect?
The arbiter to be added to the 2-member replica set, with both members and the arbiter reaching the goal state.

What happened instead?
Neither the members nor the arbiter could reach the goal state, and the replica set got stuck in a crash loop. The previously working members also fail after not reaching the goal state for some time.

Operator Information

  • Operator version: 0.10.0
  • MongoDB image: docker.io/mongo:6.0.17

Kubernetes Cluster Information

  • Distribution: AWS EKS
  • Kubernetes version: 1.28

If possible, please include:

  • The operator logs
2024-09-09T11:35:35.491Z	DEBUG	scram/scram.go:102	Credentials have not changed, using credentials stored in: secret/dms-user-scram-scram-credentials
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-0' hasn't reached the goal state yet (goal: 30, agent: 29)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-1' hasn't reached the goal state yet (goal: 30, agent: 29)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-2' hasn't reached the goal state yet (goal: 30, agent: 29)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-arb-0' hasn't reached the goal state yet (goal: 30, agent: -1)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/replica_set_port_manager.go:122	No port change required	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/replica_set_port_manager.go:40	Calculated process port map: map[mongodb-0:27017 mongodb-1:27017 mongodb-2:27017 mongodb-arb-0:27017]	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	controllers/replica_set_controller.go:505	AutomationConfigMembersThisReconciliation	{"mdb.AutomationConfigMembersThisReconciliation()": 3}
2024-09-09T11:35:35.492Z	DEBUG	controllers/replica_set_controller.go:358	Waiting for agents to reach version 30	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-0' hasn't reached the goal state yet (goal: 30, agent: 29)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-1' hasn't reached the goal state yet (goal: 30, agent: 29)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	DEBUG	agent/agent_readiness.go:111	The Agent in the Pod 'mongodb-2' hasn't reached the goal state yet (goal: 30, agent: 29)	{"ReplicaSet": "mongodb-surplus/mongodb"}
2024-09-09T11:35:35.492Z	INFO	controllers/mongodb_status_options.go:110	ReplicaSet is not yet ready, retrying in 10 seconds
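The pattern worth noting in the log above: the three data-bearing agents sit one version behind the goal (agent: 29 vs goal: 30), while the arbiter has never achieved any version (agent: -1). The per-pod value comes from the agent health file that the readiness probe reads, so it can be checked directly; a sketch, assuming grep is available in the agent image:

# lastGoalVersionAchieved should match the goal version (30) from the operator log
kubectl exec mongodb-arb-0 -c mongodb-agent -n mongodb-surplus -- \
  grep -o '"lastGoalVersionAchieved":[^,]*' /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json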
  • Below we assume that your replicaset database pods are named mongo-<>. For instance:
❯ k get pods
NAME      READY   STATUS    RESTARTS   AGE
mongo-0   2/2     Running   0          19h
mongo-1   2/2     Running   0          19h
                                                                                     
❯ k get mdbc
NAME    PHASE     VERSION
mongo   Running   4.4.0
  • yaml definitions of your MongoDB Deployment(s):
    • kubectl get mdbc -oyaml
apiVersion: v1
items:
- apiVersion: mongodbcommunity.mongodb.com/v1
  kind: MongoDBCommunity
  metadata:
    annotations:
      mongodb.com/v1.lastAppliedMongoDBVersion: 6.0.17
    creationTimestamp: "2024-01-03T07:47:03Z"
    generation: 48
    labels:
      k8slens-edit-resource-version: v1
    name: mongodb
    namespace: mongodb-<SENSITIVE>
    resourceVersion: "391080428"
    uid: 8dbc92a1-061b-4ebb-a2be-d1b5dd6d696b
  spec:
    additionalMongodConfig:
      storage.wiredTiger.engineConfig.journalCompressor: zlib
    arbiters: 1
    members: 3
    security:
      authentication:
        ignoreUnknownUsers: true
        modes:
        - SCRAM
    statefulSet:
      spec:
        template:
          spec:
            affinity:
              nodeAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  nodeSelectorTerms:
                  - matchExpressions:
                    - key: NodeGroup
                      operator: In
                      values:
                      - <SENSITIVE>
              podAntiAffinity:
                preferredDuringSchedulingIgnoredDuringExecution:
                - podAffinityTerm:
                    labelSelector:
                      matchExpressions:
                      - key: app
                        operator: In
                        values:
                        - mongodb-<SENSITIVE>
                    topologyKey: kubernetes.io/hostname
                  weight: 100
            containers:
            - name: mongod
              resources:
                limits:
                  cpu: "1"
                  memory: 2Gi
                requests:
                  cpu: 500m
                  memory: 1Gi
        volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 70G
            storageClassName: ebs-sc
        - metadata:
            name: logs-volume
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 10G
            storageClassName: ebs-sc
    type: ReplicaSet
    version: 6.0.17
  status:
    currentMongoDBMembers: 3
    currentStatefulSetReplicas: 3
    message: ReplicaSet is not yet ready, retrying in 10 seconds
    mongoUri: mongodb://mongodb-0.mongodb-svc.mongodb-surplus.svc.cluster.local:27017,mongodb-1.mongodb-svc.mongodb-surplus.svc.cluster.local:27017,mongodb-2.mongodb-svc.mongodb-surplus.svc.cluster.local:27017/?replicaSet=mongodb
    phase: Pending
    version: 6.0.17
kind: List
metadata:
  resourceVersion: ""
  • yaml definitions of your kubernetes objects like the statefulset(s), pods (we need to see the state of the containers):
    • kubectl get sts -oyaml
apiVersion: v1
items:
- apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    creationTimestamp: "2024-09-06T13:47:18Z"
    generation: 4
    name: mongodb
    namespace: mongodb-XXX
    ownerReferences:
    - apiVersion: mongodbcommunity.mongodb.com/v1
      blockOwnerDeletion: true
      controller: true
      kind: MongoDBCommunity
      name: mongodb
      uid: 8dbc92a1-061b-4ebb-a2be-d1b5dd6d696b
    resourceVersion: "391084063"
    uid: 25fa6a25-5016-4a16-af39-8c6907338a49
  spec:
    persistentVolumeClaimRetentionPolicy:
      whenDeleted: Retain
      whenScaled: Retain
    podManagementPolicy: OrderedReady
    replicas: 3
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: mongodb-XXX
    serviceName: mongodb-XXX
    template:
      metadata:
        annotations:
          kubectl.kubernetes.io/restartedAt: "2024-09-09T07:49:13Z"
        creationTimestamp: null
        labels:
          app: mongodb-XXX
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: NodeGroup
                  operator: In
                  values:
                  - XXX
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                    - mongodb-XXX
                topologyKey: kubernetes.io/hostname
              weight: 100
        containers:
        - args:
          - ""
          command:
          - /bin/sh
          - -c
          - "\nif [ -e \"/hooks/version-upgrade\" ]; then\n\t#run post-start hook
            to handle version changes (if exists)\n    /hooks/version-upgrade\nfi\n\n#
            wait for config and keyfile to be created by the agent\nwhile ! [ -f /data/automation-mongod.conf
            -a -f /var/lib/mongodb-mms-automation/authentication/keyfile ]; do sleep
            3 ; done ; sleep 2 ;\n\n# start mongod with this configuration\nexec mongod
            -f /data/automation-mongod.conf;\n\n"
          env:
          - name: AGENT_STATUS_FILEPATH
            value: /healthstatus/agent-health-status.json
          image: docker.io/mongo:6.0.17
          imagePullPolicy: IfNotPresent
          name: mongod
          resources:
            limits:
              cpu: "1"
              memory: 2Gi
            requests:
              cpu: 500m
              memory: 1Gi
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /data
            name: data-volume
          - mountPath: /healthstatus
            name: healthstatus
          - mountPath: /hooks
            name: hooks
          - mountPath: /var/log/mongodb-mms-automation
            name: logs-volume
          - mountPath: /var/lib/mongodb-mms-automation/authentication
            name: mongodb-keyfile
          - mountPath: /tmp
            name: tmp
        - command:
          - /bin/bash
          - -c
          - |-
            current_uid=$(id -u)
            declare -r current_uid
            if ! grep -q "${current_uid}" /etc/passwd ; then
            sed -e "s/^mongodb:/builder:/" /etc/passwd > /tmp/passwd
            echo "mongodb:x:$(id -u):$(id -g):,,,:/:/bin/bash" >> /tmp/passwd
            export NSS_WRAPPER_PASSWD=/tmp/passwd
            export LD_PRELOAD=libnss_wrapper.so
            export NSS_WRAPPER_GROUP=/etc/group
            fi
            agent/mongodb-agent -healthCheckFilePath=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json -serveStatusPort=5000 -cluster=/var/lib/automation/config/cluster-config.json -skipMongoStart -noDaemonize -useLocalMongoDbTools -logFile /var/log/mongodb-mms-automation/automation-agent.log -logLevel INFO -maxLogFileDurationHrs 24
          env:
          - name: AGENT_STATUS_FILEPATH
            value: /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
          - name: AUTOMATION_CONFIG_MAP
            value: mongodb-config
          - name: HEADLESS_AGENT
            value: "true"
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          image: quay.io/mongodb/mongodb-agent:12.0.15.7646-1
          imagePullPolicy: Always
          name: mongodb-agent
          readinessProbe:
            exec:
              command:
              - /opt/scripts/readinessprobe
            failureThreshold: 40
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /opt/scripts
            name: agent-scripts
          - mountPath: /var/lib/automation/config
            name: automation-config
            readOnly: true
          - mountPath: /data
            name: data-volume
          - mountPath: /var/log/mongodb-mms-automation/healthstatus
            name: healthstatus
          - mountPath: /var/log/mongodb-mms-automation
            name: logs-volume
          - mountPath: /var/lib/mongodb-mms-automation/authentication
            name: mongodb-keyfile
          - mountPath: /tmp
            name: tmp
        dnsPolicy: ClusterFirst
        initContainers:
        - command:
          - cp
          - version-upgrade-hook
          - /hooks/version-upgrade
          image: quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.6
          imagePullPolicy: Always
          name: mongod-posthook
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /hooks
            name: hooks
        - command:
          - cp
          - /probes/readinessprobe
          - /opt/scripts/readinessprobe
          image: quay.io/mongodb/mongodb-kubernetes-readinessprobe:1.0.12
          imagePullPolicy: Always
          name: mongodb-agent-readinessprobe
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /opt/scripts
            name: agent-scripts
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          fsGroup: 2000
          runAsNonRoot: true
          runAsUser: 2000
        serviceAccount: mongodb-database
        serviceAccountName: mongodb-database
        terminationGracePeriodSeconds: 30
        volumes:
        - emptyDir: {}
          name: agent-scripts
        - name: automation-config
          secret:
            defaultMode: 416
            secretName: mongodb-config
        - emptyDir: {}
          name: healthstatus
        - emptyDir: {}
          name: hooks
        - emptyDir: {}
          name: mongodb-keyfile
        - emptyDir: {}
          name: tmp
    updateStrategy:
      type: RollingUpdate
    volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: data-volume
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 70G
        storageClassName: ebs-sc
        volumeMode: Filesystem
      status:
        phase: Pending
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: logs-volume
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10G
        storageClassName: ebs-sc
        volumeMode: Filesystem
      status:
        phase: Pending
  status:
    availableReplicas: 0
    collisionCount: 0
    currentReplicas: 3
    currentRevision: mongodb-6847cc6f7
    observedGeneration: 4
    replicas: 3
    updateRevision: mongodb-6847cc6f7
    updatedReplicas: 3
- apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    creationTimestamp: "2024-09-06T13:47:10Z"
    generation: 8
    name: mongodb-arb
    namespace: mongodb-XXX
    ownerReferences:
    - apiVersion: mongodbcommunity.mongodb.com/v1
      blockOwnerDeletion: true
      controller: true
      kind: MongoDBCommunity
      name: mongodb
      uid: 8dbc92a1-061b-4ebb-a2be-d1b5dd6d696b
    resourceVersion: "391081887"
    uid: 1267641d-8cfa-4d13-ae06-a79c5255facc
  spec:
    persistentVolumeClaimRetentionPolicy:
      whenDeleted: Retain
      whenScaled: Retain
    podManagementPolicy: OrderedReady
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: mongodb-XXX
    serviceName: mongodb-XXX
    template:
      metadata:
        annotations:
          kubectl.kubernetes.io/restartedAt: "2024-09-09T07:45:41Z"
        creationTimestamp: null
        labels:
          app: mongodb-XXX
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: NodeGroup
                  operator: In
                  values:
                  - XXX
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                    - mongodb-XXX
                topologyKey: kubernetes.io/hostname
              weight: 100
        containers:
        - args:
          - ""
          command:
          - /bin/sh
          - -c
          - "\nif [ -e \"/hooks/version-upgrade\" ]; then\n\t#run post-start hook
            to handle version changes (if exists)\n    /hooks/version-upgrade\nfi\n\n#
            wait for config and keyfile to be created by the agent\nwhile ! [ -f /data/automation-mongod.conf
            -a -f /var/lib/mongodb-mms-automation/authentication/keyfile ]; do sleep
            3 ; done ; sleep 2 ;\n\n# start mongod with this configuration\nexec mongod
            -f /data/automation-mongod.conf;\n\n"
          env:
          - name: AGENT_STATUS_FILEPATH
            value: /healthstatus/agent-health-status.json
          image: docker.io/mongo:6.0.17
          imagePullPolicy: IfNotPresent
          name: mongod
          resources:
            limits:
              cpu: "1"
              memory: 2Gi
            requests:
              cpu: 500m
              memory: 1Gi
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /data
            name: data-volume
          - mountPath: /healthstatus
            name: healthstatus
          - mountPath: /hooks
            name: hooks
          - mountPath: /var/log/mongodb-mms-automation
            name: logs-volume
          - mountPath: /var/lib/mongodb-mms-automation/authentication
            name: mongodb-keyfile
          - mountPath: /tmp
            name: tmp
        - command:
          - /bin/bash
          - -c
          - |-
            current_uid=$(id -u)
            declare -r current_uid
            if ! grep -q "${current_uid}" /etc/passwd ; then
            sed -e "s/^mongodb:/builder:/" /etc/passwd > /tmp/passwd
            echo "mongodb:x:$(id -u):$(id -g):,,,:/:/bin/bash" >> /tmp/passwd
            export NSS_WRAPPER_PASSWD=/tmp/passwd
            export LD_PRELOAD=libnss_wrapper.so
            export NSS_WRAPPER_GROUP=/etc/group
            fi
            agent/mongodb-agent -healthCheckFilePath=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json -serveStatusPort=5000 -cluster=/var/lib/automation/config/cluster-config.json -skipMongoStart -noDaemonize -useLocalMongoDbTools -logFile /var/log/mongodb-mms-automation/automation-agent.log -logLevel INFO -maxLogFileDurationHrs 24
          env:
          - name: AGENT_STATUS_FILEPATH
            value: /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
          - name: AUTOMATION_CONFIG_MAP
            value: mongodb-config
          - name: HEADLESS_AGENT
            value: "true"
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          image: quay.io/mongodb/mongodb-agent:12.0.15.7646-1
          imagePullPolicy: Always
          name: mongodb-agent
          readinessProbe:
            exec:
              command:
              - /opt/scripts/readinessprobe
            failureThreshold: 40
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /opt/scripts
            name: agent-scripts
          - mountPath: /var/lib/automation/config
            name: automation-config
            readOnly: true
          - mountPath: /data
            name: data-volume
          - mountPath: /var/log/mongodb-mms-automation/healthstatus
            name: healthstatus
          - mountPath: /var/log/mongodb-mms-automation
            name: logs-volume
          - mountPath: /var/lib/mongodb-mms-automation/authentication
            name: mongodb-keyfile
          - mountPath: /tmp
            name: tmp
        dnsPolicy: ClusterFirst
        initContainers:
        - command:
          - cp
          - version-upgrade-hook
          - /hooks/version-upgrade
          image: quay.io/mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.6
          imagePullPolicy: Always
          name: mongod-posthook
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /hooks
            name: hooks
        - command:
          - cp
          - /probes/readinessprobe
          - /opt/scripts/readinessprobe
          image: quay.io/mongodb/mongodb-kubernetes-readinessprobe:1.0.12
          imagePullPolicy: Always
          name: mongodb-agent-readinessprobe
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /opt/scripts
            name: agent-scripts
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          fsGroup: 2000
          runAsNonRoot: true
          runAsUser: 2000
        serviceAccount: XXX
        serviceAccountName: XXX
        terminationGracePeriodSeconds: 30
        volumes:
        - emptyDir: {}
          name: agent-scripts
        - name: automation-config
          secret:
            defaultMode: 416
            secretName: mongodb-config
        - emptyDir: {}
          name: healthstatus
        - emptyDir: {}
          name: hooks
        - emptyDir: {}
          name: mongodb-keyfile
        - emptyDir: {}
          name: tmp
    updateStrategy:
      type: RollingUpdate
    volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: data-volume
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 70G
        storageClassName: ebs-sc
        volumeMode: Filesystem
      status:
        phase: Pending
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: logs-volume
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10G
        storageClassName: ebs-sc
        volumeMode: Filesystem
      status:
        phase: Pending
  status:
    availableReplicas: 1
    collisionCount: 0
    currentReplicas: 1
    currentRevision: mongodb-arb-5f6bc75bb8
    observedGeneration: 8
    readyReplicas: 1
    replicas: 1
    updateRevision: mongodb-arb-5f6bc75bb8
    updatedReplicas: 1
kind: List
metadata:
  resourceVersion: ""
  • The agent clusterconfig of the faulty members:
    • kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/lib/automation/config/cluster-config.json
{"version":32,"processes":[{"name":"mongodb-0","disabled":false,"hostname":"mongodb-0.mongodb-svc.mongodb-xxx.svc.cluster.local","args2_6":{"net":{"port":27017},"repl
ication":{"replSetName":"mongodb"},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"6.0","processTy
pe":"mongod","version":"6.0.17","authSchemaVersion":5},{"name":"mongodb-1","disabled":false,"hostname":"mongodb-1.mongodb-svc.mongodb-xxx.svc.cluster.local","args2_6"
:{"net":{"port":27017},"replication":{"replSetName":"mongodb"},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"journalCompressor":"zlib"}}}},"featureCompatibil
ityVersion":"6.0","processType":"mongod","version":"6.0.17","authSchemaVersion":5},{"name":"mongodb-2","disabled":false,"hostname":"mongodb-2.mongodb-svc.mongodb-xxx.
svc.cluster.local","args2_6":{"net":{"port":27017},"replication":{"replSetName":"mongodb"},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"journalCompressor":"
zlib"}}}},"featureCompatibilityVersion":"6.0","processType":"mongod","version":"6.0.17","authSchemaVersion":5},{"name":"mongodb-arb-0","disabled":false,"hostname":"mongod
b-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local","args2_6":{"net":{"port":27017},"replication":{"replSetName":"mongodb"},"storage":{"dbPath":"/data","wiredTiger":{"
engineConfig":{"journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"6.0","processType":"mongod","version":"6.0.17","authSchemaVersion":5}],"replicaSets":[{"_id":
"mongodb","members":[{"_id":0,"host":"mongodb-0","arbiterOnly":false,"votes":1,"priority":1},{"_id":1,"host":"mongodb-1","arbiterOnly":false,"votes":1,"priority":1},{"_id
":2,"host":"mongodb-2","arbiterOnly":false,"votes":1,"priority":1},{"_id":100,"host":"mongodb-arb-0","arbiterOnly":true,"votes":1,"priority":1}],"protocolVersion":"1","nu
mberArbiters":1}],"auth":{"usersWanted":[{"mechanisms":[],"roles":[{"role":"clusterAdmin","db":"admin"},{"role":"userAdminAnyDatabase","db":"admin"}],"user":"admin-user",
"db":"admin","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"WioeMJQXT8w9Coif5Fq3gqV1OfeDi64Bvq/maw==","serverKey":"iK8SJLwCzmk95+mUePC
3wrpGw29Tfx9vN+ZCKSMPKMM=","storedKey":"Z33GU9ix2W++nlnkFBIbYP7kEATZ/6sDVQqdhEd+tT0="},"scramSha1Creds":{"iterationCount":10000,"salt":"Q9mmbNXpyLDRtYmoln1xgA==","serverK
ey":"AmaNP+YmbrNf23l8URaZAZKKOz0=","storedKey":"0d8SscAfTMph+2aW416TXB1/UZw="}},{"mechanisms":[],"roles":[{"role":"readWrite","db":"xxx"}],"user":"xxx-prod-user","db":"ad
min","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"GOkTsWgdrct5KSSQtTHC20myEJM76v5OMEGXOA==","serverKey":"ksqF9YIWnI50+bQJhjl0/zA1a0H
0UpcNnzxnFEjciV4=","storedKey":"GyxjpwCp9hTHsK5CX2ObkIs73NP2zL1VrwbCQTLDvGE="},"scramSha1Creds":{"iterationCount":10000,"salt":"4RAcRyAxnRCQQhcHWRDA2w==","serverKey":"59K
Q8PQV/rS4zxSuVca/tQbDNWw=","storedKey":"v68O4bx8u7/RNIks1WBvmXIJ+H8="}},{"mechanisms":[],"roles":[{"role":"readWrite","db":"xxx"}],"user":"xxx-prod-user","db":"admin","au
thenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"vYX0jOTF0NvPPmpQm1oz/b7v1/sAnFOMWdm5Pg==","serverKey":"MWuSeUedUk33YD57g/pVw+kV89vQK8OmTib
RLl2hR0U=","storedKey":"ffXuxQ5HTf0FcH2FdcNvKigWSO/TgPdF0elXk9iYX3E="},"scramSha1Creds":{"iterationCount":10000,"salt":"zV90H0Z2XJ8sCiupCSK3PQ==","serverKey":"IwdxN4BVrGS
qyLPXDhrbZKFsbtc=","storedKey":"e5XxJTwdueUyUyJd3ioqQFuEKbc="}},{"mechanisms":[],"roles":[{"role":"readWrite","db":"xxx"},{"role":"readWrite","db":"xxx"}],"user":"xxx-
user","db":"admin","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"0HwrZJa9FIMCy5r4erCq7o2gb/RSaHofCV1XMw==","serverKey":"4FTg/fstci+8W
6BZE4jfyXpLJr9/f4zsuDiKrLnBcgg=","storedKey":"lThVu1E2tv14Q7H58DYYNK1jlqXaIZDCp/Omp44wR1A="},"scramSha1Creds":{"iterationCount":10000,"salt":"gDxOqOLC16/e/WvhWSGDdA==","s
erverKey":"q6SKd30cOY+PFQnqRFMpdgmNTFA=","storedKey":"9swpmpNkjLofRRRprZWGrCBoolk="}},{"mechanisms":[],"roles":[{"role":"dbAdmin","db":"admin"},{"role":"userAdminAnyDatab
ase","db":"admin"},{"role":"readWrite","db":"xxx"},{"role":"readWrite","db":"xxx"},{"role":"readWrite","db":"local"}],"user":"xxx-user","db":"admin","authenticationRe
strictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"m40BqXb1jbVT4NIN9NjAdbTdQp84O6KtEbRjgA==","serverKey":"hztnbCDXJs0zBcwtaJsLquRtEgCHDykKj04SaQ3eLn8=","st
oredKey":"cftguOpTPNr4QYvL2XtrRlybPzi96CgzGoZ91EVZg2g="},"scramSha1Creds":{"iterationCount":10000,"salt":"SofiWm+P4s3RwvvIxflOOQ==","serverKey":"73knk0VrQPm6PWSYM5PFYwUK1
lA=","storedKey":"wnYGbRIv2qPtcpv3j4r+lUX8x/4="}},{"mechanisms":[],"roles":[{"role":"read","db":"cps"},{"role":"read","db":"xxx"},{"role":"changestreamrole","db":"admin
"}],"user":"xxx-user","db":"admin","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"wzzV+4JFqwIGw1mAzRucb1oiIcYYR/gdcwZyJw==","serverKey
":"Ty105G/oxXhrv9UwgIqqXHO7ZxM5LYW9T/mta7uiQYo=","storedKey":"63E/7kQy2g4O/MUd0a62q8pQNBtITkJ74dUagrRESO4="},"scramSha1Creds":{"iterationCount":10000,"salt":"R/PWGnO94tyg
xdNjIavdkQ==","serverKey":"Gr8vtcLPyb/pR/h8GRivOrY6/sE=","storedKey":"kwWqu/dnDzNTDOEu03TcgeDWxPY="}}],"disabled":false,"authoritativeSet":false,"autoAuthMechanisms":["SC
RAM-SHA-256"],"autoAuthMechanism":"SCRAM-SHA-256","deploymentAuthMechanisms":["SCRAM-SHA-256"],"autoUser":"mms-automation","key":"8tQDoV1eZdKJvpc7cA8rtu939Glj1IgsL9CNE1nf
7SuZMFw8Te47PmhA9Z1NPi27cRw5+bs16kenEAPP82V7v51Xcv5Q9xPZKUxltKlc3t9cfq2Q7Il42DJsrjhQUhne5lKNghWLRHPSFVb8IHbuImgPcvu7mPz6VsYClu6Lno5ewW3ziIvilIW/2xvpxqG0qz4jvz5/cmtTWeNn7V
JzNOYwYurWdFfvdUDL+Z+kQqcbsa95SSYA8217h6aKE2guOwlVpK0VZBYCPACg+ID1dARawAHG7xCA92lFttymLfgu8kbUXeW6RxBsgwz5iuOXjiIrm8XpWhpWHLJNplf5YaGsqBMIbRlH3tAXGv6auqLaiGup3+kQXDJNwC7J
uaa5F0FGXg+PdQPMOH4xv2SZy0zGHh988CaEtXhBVWiQ06FhnNWyxziLCl8BGJpCbD2bsGiiUBcUHvxkCybARhguLYdnS60+tlJcMIr3rpt7MTgRuHhwki0gX1KcVmEe+tPeg57RdqcVcEEqqHYwc4Ghkk/PF/10BlsO0NiUZm
JxZqow7ffSRZHtZ/VKW2og6CZp2V3BaYZmzYwHn5XFFRCDNUu8mbwvtySQVSlVVY4GbKRkgepYsWrYGc20yPH7Hzni9b8N0zCmX8HPy5icn8+jf4z7BRw=","keyfile":"/var/lib/mongodb-mms-automation/authent
ication/keyfile","keyfileWindows":"%SystemDrive%\\MMSAutomation\\versions\\keyfile","autoPwd":"eX-rXNR2PB_ytwagyylk"},"tls":{"CAFilePath":"","clientCertificateMode":"OPTI
ONAL"},"mongoDbVersions":[{"name":"6.0.17","builds":[{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"rhel","minOsVersion":"","maxOsVersion":
"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linu
x","url":"","gitVersion":"","architecture":"aarch64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","arc
hitecture":"aarch64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]}]}],"backupVersions":[],"monitoringVersions":[],"options":{"downloadBase":"/var/lib/
mongodb-mms-automation"}}
  • The agent health status of the faulty members:
    • kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
{"statuses":{"mongodb-arb-0":{"IsInGoalState":false,"LastMongoUpTime":1725903583,"ExpectedToBeUp":true,"ReplicationStatus":-1}},"mmsStatus":{"mongodb-arb-0":{"name":"mong
odb-arb-0","lastGoalVersionAchieved":-1,"plans":[{"automationConfigVersion":32,"started":"2024-09-09T17:37:58.413084367Z","completed":null,"moves":[{"move":"Start","moveD
oc":"Start the process","steps":[{"step":"StartFresh","stepDoc":"Start a mongo instance  (start fresh)","isWaitStep":false,"started":"2024-09-09T17:37:58.413103457Z","com
pleted":"2024-09-09T17:38:02.701441838Z","result":"success"}]},{"move":"WaitRsInit","moveDoc":"Wait for the replica set to be initialized by another member","steps":[{"st
ep":"WaitRsInit","stepDoc":"Wait for the replica set to be initialized by another member","isWaitStep":true,"started":"2024-09-09T17:38:02.701493829Z","completed":null,"r
esult":"wait"}]},{"move":"WaitFeatureCompatibilityVersionCorrect","moveDoc":"Wait for featureCompatibilityVersion to be right","steps":[{"step":"WaitFeatureCompatibilityV
ersionCorrect","stepDoc":"Wait for featureCompatibilityVersion to be right","isWaitStep":true,"started":null,"completed":null,"result":""}]}]}],"errorCode":0,"errorString
":""}}}
  • The verbose agent logs of the faulty members:
    • kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/automation-agent-verbose.log
[2024-09-09T17:40:52.041+0000] [.info] [src/director/director.go:tracef:806] <mongodb-arb-0> [17:40:52.041] because
[All the following are true:
    ['currentState.Up' = true]
    ['currentState.CanRsInit' = false]
    ['desiredState.ReplSetConf' != <nil> ('desiredState.ReplSetConf' = ReplSetConfig{id=mongodb,version=0,commitmentStatus=false,configsvr=false,protocolVersion=1,forceProtocolVersion=false,writeConcernMajorityJournalDefault=,members={id:0,HostPort:mongodb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:1,HostPort:mongodb-1.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:2,HostPort:mongodb-2.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:100,HostPort:mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:truePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},settings=map[]})]
    ['currentState.ReplSetConf' = <nil>]
]
[2024-09-09T17:40:52.041+0000] [.info] [src/director/director.go:planAndExecute:575] <mongodb-arb-0> [17:40:52.041] Step=WaitRsInit as part of Move=WaitRsInit in plan failed : <mongodb-arb-0> [17:40:52.041] Postcondition not yet met for step WaitRsInit because
['currentState.ReplSetConf' = <nil>].
 Recomputing a plan...
[2024-09-09T17:40:52.362+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] <hardwareMetricsCollector> [17:40:52.362] Failed to fetch replStatus for mongodb-arb-0 : <hardwareMetricsCollector> [17:40:52.362] Error executing WithClientFor() for cp=mongodb-arb-0.mongodb-svc.mongodb-surplus.svc.cluster.local:27017 (local=false) connectMode=SingleConnect : <hardwareMetricsCollector> [17:40:52.362] Error running command for runCommandWithTimeout(dbName=admin, cmd=[{replSetGetStatus 1}]) : result={} identityUsed=__system@local[[MONGODB-CR/SCRAM-SHA-1 SCRAM-SHA-256]][668] : (NotYetInitialized) no replset config has been received
[2024-09-09T17:40:52.678+0000] [.info] [src/config/config.go:ReadClusterConfig:433] [17:40:52.678] Retrieving cluster config from /var/lib/automation/config/cluster-config.json...
[2024-09-09T17:40:52.678+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [17:40:52.678] clusterConfig unchanged
[2024-09-09T17:40:53.072+0000] [.info] [src/mongoclientservice/mongoclientservice.go:func1:1619] [17:40:53.072] Testing auth with username __system db=local to mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) connectMode=SingleConnect ipversion=0 tls=false
[2024-09-09T17:40:53.081+0000] [.info] [src/mongoctl/processctl.go:GetKeyHashes:2080] <mongodb-arb-0> [17:40:53.081] Able to successfully auth to mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) using desired auth key
[2024-09-09T17:40:53.108+0000] [.info] [src/mongoctl/processctl.go:Update:3555] <mongodb-arb-0> [17:40:53.108] <DB_WRITE> Updated with query map[] and update [{$set [{agentFeatures [StateCache]} {nextVersion 32}]}] and upsert=true on local.clustermanager
[2024-09-09T17:40:53.125+0000] [.info] [src/director/director.go:computePlan:278] <mongodb-arb-0> [17:40:53.125] ... process has a plan : WaitRsInit,WaitFeatureCompatibilityVersionCorrect
[2024-09-09T17:40:53.125+0000] [.info] [src/director/director.go:tracef:806] <mongodb-arb-0> [17:40:53.125] Running step: 'WaitRsInit' of move 'WaitRsInit'
[2024-09-09T17:40:53.125+0000] [.info] [src/director/director.go:tracef:806] <mongodb-arb-0> [17:40:53.125] because
[All the following are true:
    ['currentState.Up' = true]
    ['currentState.CanRsInit' = false]
    ['desiredState.ReplSetConf' != <nil> ('desiredState.ReplSetConf' = ReplSetConfig{id=mongodb,version=0,commitmentStatus=false,configsvr=false,protocolVersion=1,forceProtocolVersion=false,writeConcernMajorityJournalDefault=,members={id:0,HostPort:mongodb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:1,HostPort:mongodb-1.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:2,HostPort:mongodb-2.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:100,HostPort:mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017,ArbiterOnly:truePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},settings=map[]})]
    ['currentState.ReplSetConf' = <nil>]
]
[2024-09-09T17:40:53.125+0000] [.info] [src/director/director.go:planAndExecute:575] <mongodb-arb-0> [17:40:53.125] Step=WaitRsInit as part of Move=WaitRsInit in plan failed : <mongodb-arb-0> [17:40:53.125] Postcondition not yet met for step WaitRsInit because
['currentState.ReplSetConf' = <nil>].
 Recomputing a plan...
[2024-09-09T17:40:53.364+0000] [.warn] [metrics/collector/util.go:getPingStatus:84] <hardwareMetricsCollector> [17:40:53.364] Failed to fetch replStatus for mongodb-arb-0 : <hardwareMetricsCollector> [17:40:53.364] Error executing WithClientFor() for cp=mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) connectMode=SingleConnect : <hardwareMetricsCollector> [17:40:53.364] Error running command for runCommandWithTimeout(dbName=admin, cmd=[{replSetGetStatus 1}]) : result={} identityUsed=__system@local[[MONGODB-CR/SCRAM-SHA-1 SCRAM-SHA-256]][668] : (NotYetInitialized) no replset config has been received
[2024-09-09T17:40:53.718+0000] [.info] [src/config/config.go:ReadClusterConfig:433] [17:40:53.718] Retrieving cluster config from /var/lib/automation/config/cluster-config.json...
[2024-09-09T17:40:53.718+0000] [.info] [main/components/agent.go:LoadClusterConfig:277] [17:40:53.718] clusterConfig unchanged
[2024-09-09T17:40:54.157+0000] [.info] [src/mongoclientservice/mongoclientservice.go:func1:1619] [17:40:54.157] Testing auth with username __system db=local to mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) connectMode=SingleConnect ipversion=0 tls=false
[2024-09-09T17:40:54.166+0000] [.info] [src/mongoctl/processctl.go:GetKeyHashes:2080] <mongodb-arb-0> [17:40:54.166] Able to successfully auth to mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) using desired auth key
[2024-09-09T17:40:54.191+0000] [.info] [src/mongoctl/processctl.go:Update:3555] <mongodb-arb-0> [17:40:54.191] <DB_WRITE> Updated with query map[] and update [{$set [{agentFeatures [StateCache]} {nextVersion 32}]}] and upsert=true on local.clustermanager
[2024-09-09T17:40:54.203+0000] [.info] [src/director/director.go:computePlan:278] <mongodb-arb-0> [17:40:54.203] ... process has a plan : WaitRsInit,WaitFeatureCompatibilityVersionCorrect
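
The loop above shows the arbiter stuck in WaitRsInit: its mongod is up, but 'currentState.ReplSetConf' stays <nil>, meaning no existing member ever adds the arbiter to the live replica set configuration. One way to confirm this from a data-bearing member; a sketch that assumes you can authenticate with the admin-user credentials ('<password>' is a placeholder) and that mongosh is available in the mongod container:

# a healthy add would print mongodb-arb-0 with arbiterOnly true, matching _id 100 in cluster-config.json
kubectl exec -it mongodb-0 -c mongod -n mongodb-surplus -- \
  mongosh -u admin-user -p '<password>' --authenticationDatabase admin --quiet \
  --eval 'rs.conf().members.forEach(m => print(m._id, m.host, m.arbiterOnly))'

If mongodb-arb-0 is missing from that output while the automation config above already lists it, the agents and the live replica set disagree, which matches the goal-state mismatch in the operator logs.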
  • You might not have the verbose ones, in that case the non-verbose agent logs:
    • kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/automation-agent.log
[2024-09-09T17:37:58.248+0000] [header.info] [::0]        GitCommitId = 25bb5320d7087c7aa24eb6118df217a028238723
[2024-09-09T17:37:58.248+0000] [header.info] [::0]  AutomationVersion = 12.0.15.7646
[2024-09-09T17:37:58.248+0000] [header.info] [::0]          localhost = mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local
[2024-09-09T17:37:58.249+0000] [header.info] [::0] ErrorStateSleepTime = 10s
[2024-09-09T17:37:58.249+0000] [header.info] [::0] GoalStateSleepTime = 10s
[2024-09-09T17:37:58.249+0000] [header.info] [::0] NotGoalStateSleepTime = 1s
[2024-09-09T17:37:58.249+0000] [header.info] [::0]     PlanCutoffTime = 300000
[2024-09-09T17:37:58.249+0000] [header.info] [::0]       TracePlanner = false
[2024-09-09T17:37:58.249+0000] [header.info] [::0]               User = mongodb
[2024-09-09T17:37:58.249+0000] [header.info] [::0]         Go version = go1.18.5
[2024-09-09T17:37:58.249+0000] [header.info] [::0]         MmsBaseURL =
[2024-09-09T17:37:58.249+0000] [header.info] [::0]         MmsGroupId =
[2024-09-09T17:37:58.249+0000] [header.info] [::0]          HttpProxy =
[2024-09-09T17:37:58.249+0000] [header.info] [::0] DisableHttpKeepAlive = false
[2024-09-09T17:37:58.249+0000] [header.info] [::0]        HttpsCAFile =
[2024-09-09T17:37:58.249+0000] [header.info] [::0] TlsRequireValidMMSServerCertificates = true
[2024-09-09T17:37:58.249+0000] [header.info] [::0] TlsMMSServerClientCertificate =
[2024-09-09T17:37:58.249+0000] [header.info] [::0] KMIPProxyCertificateDir = /tmp
[2024-09-09T17:37:58.249+0000] [header.info] [::0] EnableLocalConfigurationServer = false
[2024-09-09T17:37:58.249+0000] [header.info] [::0] DialTimeoutSeconds = 40
[2024-09-09T17:37:58.249+0000] [header.info] [::0] KeepUnusedMongodbVersions = false
[2024-09-09T17:37:58.249+0000] [header.info] [::0] DisallowDowngrades = false
[2024-09-09T17:37:59.378+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:37:59.378] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:37:59.479+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:37:59.479] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:38:00.430+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:38:00.430] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:38:00.531+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:38:00.531] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:38:01.461+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:38:01.461] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:38:01.569+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:38:01.569] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:38:02.385+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:38:02.385] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
[2024-09-09T17:38:02.487+0000] [.error] [src/mongoctl/processctl.go:RunCommand:1120] <hardwareMetricsCollector> [17:38:02.487] Server at mongodb-arb-0.mongodb-svc.mongodb-xxx.svc.cluster.local:27017 (local=false) is down
nammn (Collaborator) commented Sep 9, 2024

@KarooolisZi thanks for opening the issue! We will try to have a look at it.

nammn (Collaborator) commented Sep 9, 2024

@KarooolisZi can you also please supply the agent health status file and the agent logs? (The GitHub issue template describes how to retrieve them.)

KarooolisZi (Author) commented

Hi @nammn, I have updated the issue with the required information.

KarooolisZi (Author) commented

@nammn any updates?

KarooolisZi (Author) commented

Due to the reason mentioned above, my 3-member set crashed. I can't revive it because the first node comes back as a secondary (all members had the same priority), and I can't initiate an election because the primary is never recreated.
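
One possible escape hatch here, independent of the operator, is MongoDB's documented forced reconfiguration from a surviving member. A sketch, assuming the admin-user credentials still work, at least one data-bearing member starts, and '<password>' is a placeholder; a forced reconfig can cause rollbacks, so use it with care:

# run against a surviving member: drop the stuck arbiter from the config and force-apply it
kubectl exec -it mongodb-0 -c mongod -n mongodb-surplus -- \
  mongosh -u admin-user -p '<password>' --authenticationDatabase admin --eval '
    cfg = rs.conf();
    cfg.members = cfg.members.filter(m => !m.arbiterOnly);  // remove the arbiter entry
    rs.reconfig(cfg, {force: true});  // force is required while the set has no primary
  '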

KarooolisZi (Author) commented

@nammn any updates?

github-actions bot commented Nov 16, 2024

This issue is being marked stale because it has been open for 60 days with no activity. Please comment if this issue is still affecting you. If there is no change, this issue will be closed in 30 days.

github-actions bot added the stale label Nov 16, 2024
KarooolisZi (Author) commented

@nammn Hello, is there any progress?

GotoUnsigned commented

Hey, I'm facing the same issue. Is there any news?

GotoUnsigned commented

@KarooolisZi Hey, I just found a way to make it work. A painful one, but it does work.
You need to delete the previously created PVCs of the MongoDB replica set and redeploy with the arbiter. After that, the arbiter is in the replica set's config.

Maybe it's only the config file of each replica set member that you need to delete? I just deleted all the PVCs and it worked (a sketch of the steps follows below).
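
For anyone attempting this workaround, here is a sketch of the steps. The PVC names follow the <claim>-<pod> pattern from the statefulset dumps above, and deleting the data-volume claims destroys the data on those volumes, so take a backup or dump first:

# delete the MongoDBCommunity resource so the operator stops reconciling, then drop the claims
kubectl delete mdbc mongodb -n mongodb-surplus
kubectl delete pvc -n mongodb-surplus \
  data-volume-mongodb-0 data-volume-mongodb-1 data-volume-mongodb-2 \
  logs-volume-mongodb-0 logs-volume-mongodb-1 logs-volume-mongodb-2

# redeploy with spec.arbiters: 1 already set in the manifest
kubectl apply -f database.yaml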
