Readiness probe failed: panic: open /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json: no such file or directory #1527

Open
prachiwaghulkar opened this issue Apr 18, 2024 · 21 comments

@prachiwaghulkar

What did you do to encounter the bug?
Applied the MongoDBCommunity CR using MongoDB image 5.0.26.
The mongodb pod goes into CrashLoopBackOff and the mongodbcommunity resource stays in the Pending phase.

mongodb-kubernetes-operator-54c9d54fbc-mch6k          1/1     Running            0             8m49s
staging-mongodb-0                                     0/2     CrashLoopBackOff   1 (3s ago)    26s
prachiwaghulkar@Prachis-MacBook-Pro ~ % oc get mongodbcommunity
NAME              PHASE     VERSION
staging-mongodb   Pending   

Pod logs give the following error:

oc logs -p staging-mongodb-0
Defaulted container "mongod" out of: mongod, mongodb-agent, mongod-posthook (init), mongodb-agent-readinessprobe (init)
exec /bin/sh: exec format error

Describing the pod shows the following errors in its events:

  Warning  BackOff                 21s (x2 over 22s)  kubelet                  Back-off restarting failed container
  Warning  Unhealthy               15s                kubelet                  Readiness probe failed: panic: open /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json: no such file or directory

goroutine 1 [running]:
main.main()
           /workspace/cmd/readiness/main.go:226 +0x191

What did you expect?
/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json should exist and this error should not occur.

What happened instead?
The file /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json doesn't exist, so the readiness probe panics. The mongodb pod is in CrashLoopBackOff.
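
The panic in the events above is raised when the probe cannot open the health file. As an illustration only, here is a minimal Python sketch of a check that treats a missing or half-written file as "not ready yet" instead of crashing. It is not the operator's actual Go code: the helper name `is_ready` and the rule "ready when every process reports IsInGoalState" are assumptions made for this sketch; only the file path and the `statuses` / `IsInGoalState` keys come from this issue.

```python
import json

# Path taken from the error message in this issue.
HEALTH_FILE = "/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json"


def is_ready(path: str = HEALTH_FILE) -> bool:
    """Hypothetical readiness rule: True only if every process is in goal state.

    A missing or unparsable file is reported as "not ready" rather than
    raising, so the caller never crashes while the agent is still starting.
    """
    try:
        with open(path, encoding="utf-8") as fh:
            health = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        # The agent has not written (or not finished writing) the file yet.
        return False
    statuses = health.get("statuses", {})
    # An empty status map also counts as "not ready".
    return bool(statuses) and all(
        status.get("IsInGoalState", False) for status in statuses.values()
    )
```

With this shape, a missing file only delays readiness; it would not by itself explain the container restarts reported above.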

Operator Information

  • Operator Version: 0.9.0
  • MongoDB Image used: 5.0.26

If possible, please include:

  • The operator logs
Running ./manager
2024-04-18T15:08:25.013Z	INFO	manager/main.go:74	Watching namespace: staging
I0418 15:08:26.063962      10 request.go:690] Waited for 1.037245763s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/imageregistry.operator.openshift.io/v1?timeout=32s
2024-04-18T15:08:28.669Z	INFO	manager/main.go:91	Registering Components.
2024-04-18T15:08:28.669Z	INFO	manager/main.go:104	Starting the Cmd.
2024-04-18T15:16:41.150Z	INFO	controllers/replica_set_controller.go:130	Reconciling MongoDB	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.151Z	DEBUG	controllers/replica_set_controller.go:132	Validating MongoDB.Spec	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.151Z	DEBUG	controllers/replica_set_controller.go:142	Ensuring the service exists	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.352Z	DEBUG	agent/replica_set_port_manager.go:122	No port change required	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.414Z	INFO	controllers/replica_set_controller.go:468	Create/Update operation succeeded	{"ReplicaSet": "staging/staging-mongodb", "operation": "created"}
2024-04-18T15:16:41.414Z	INFO	controllers/mongodb_tls.go:43	Ensuring TLS is correctly configured	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.414Z	INFO	controllers/mongodb_tls.go:86	Successfully validated TLS config	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.414Z	INFO	controllers/replica_set_controller.go:293	TLS is enabled, creating/updating CA secret	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.430Z	INFO	controllers/replica_set_controller.go:297	TLS is enabled, creating/updating TLS secret	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.438Z	DEBUG	controllers/replica_set_controller.go:400	Enabling TLS on a deployment with a StatefulSet that is not Ready, the Automation Config must be updated first	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.438Z	INFO	controllers/replica_set_controller.go:360	Creating/Updating AutomationConfig	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.438Z	DEBUG	scram/scram.go:128	No existing credentials found, generating new credentials
2024-04-18T15:16:41.438Z	DEBUG	scram/scram.go:106	Generating new credentials and storing in secret/root-scram2-scram-credentials
2024-04-18T15:16:41.561Z	DEBUG	scram/scram.go:117	Successfully generated SCRAM credentials
2024-04-18T15:16:41.561Z	DEBUG	scram/scram.go:128	No existing credentials found, generating new credentials
2024-04-18T15:16:41.561Z	DEBUG	scram/scram.go:106	Generating new credentials and storing in secret/metadata-scram2-scram-credentials
2024-04-18T15:16:41.637Z	DEBUG	scram/scram.go:117	Successfully generated SCRAM credentials
2024-04-18T15:16:41.854Z	DEBUG	agent/replica_set_port_manager.go:122	No port change required	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.854Z	DEBUG	agent/replica_set_port_manager.go:40	Calculated process port map: map[staging-mongodb-0:27017]	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.854Z	DEBUG	controllers/replica_set_controller.go:535	AutomationConfigMembersThisReconciliation	{"mdb.AutomationConfigMembersThisReconciliation()": 1}
2024-04-18T15:16:41.908Z	DEBUG	controllers/replica_set_controller.go:379	The existing StatefulSet did not have the readiness probe init container, skipping pod annotation check.	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.908Z	INFO	controllers/replica_set_controller.go:335	Creating/Updating StatefulSet	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.918Z	INFO	controllers/replica_set_controller.go:340	Creating/Updating StatefulSet for Arbiters	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.961Z	DEBUG	controllers/replica_set_controller.go:350	Ensuring StatefulSet is ready, with type: RollingUpdate	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.961Z	INFO	controllers/mongodb_status_options.go:110	ReplicaSet is not yet ready, retrying in 10 seconds
2024-04-18T15:16:41.981Z	INFO	controllers/replica_set_controller.go:130	Reconciling MongoDB	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.981Z	DEBUG	controllers/replica_set_controller.go:132	Validating MongoDB.Spec	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.981Z	DEBUG	controllers/replica_set_controller.go:142	Ensuring the service exists	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.982Z	DEBUG	agent/replica_set_port_manager.go:122	No port change required	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.986Z	INFO	controllers/replica_set_controller.go:468	Create/Update operation succeeded	{"ReplicaSet": "staging/staging-mongodb", "operation": "updated"}
2024-04-18T15:16:41.986Z	INFO	controllers/mongodb_tls.go:43	Ensuring TLS is correctly configured	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.986Z	INFO	controllers/mongodb_tls.go:86	Successfully validated TLS config	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.986Z	INFO	controllers/replica_set_controller.go:293	TLS is enabled, creating/updating CA secret	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:41.992Z	INFO	controllers/replica_set_controller.go:297	TLS is enabled, creating/updating TLS secret	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.097Z	DEBUG	controllers/replica_set_controller.go:400	Enabling TLS on a deployment with a StatefulSet that is not Ready, the Automation Config must be updated first	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.097Z	INFO	controllers/replica_set_controller.go:360	Creating/Updating AutomationConfig	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.106Z	DEBUG	scram/scram.go:101	Credentials have not changed, using credentials stored in: secret/root-scram2-scram-credentials
2024-04-18T15:16:42.114Z	DEBUG	scram/scram.go:101	Credentials have not changed, using credentials stored in: secret/metadata-scram2-scram-credentials
2024-04-18T15:16:42.114Z	DEBUG	agent/replica_set_port_manager.go:122	No port change required	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.114Z	DEBUG	agent/replica_set_port_manager.go:40	Calculated process port map: map[staging-mongodb-0:27017]	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.114Z	DEBUG	controllers/replica_set_controller.go:535	AutomationConfigMembersThisReconciliation	{"mdb.AutomationConfigMembersThisReconciliation()": 1}
2024-04-18T15:16:42.115Z	DEBUG	controllers/replica_set_controller.go:383	Waiting for agents to reach version 1	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.115Z	INFO	controllers/replica_set_controller.go:335	Creating/Updating StatefulSet	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.197Z	INFO	controllers/replica_set_controller.go:340	Creating/Updating StatefulSet for Arbiters	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.289Z	DEBUG	controllers/replica_set_controller.go:350	Ensuring StatefulSet is ready, with type: RollingUpdate	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.289Z	INFO	controllers/mongodb_status_options.go:110	ReplicaSet is not yet ready, retrying in 10 seconds
2024-04-18T15:16:42.303Z	INFO	controllers/replica_set_controller.go:130	Reconciling MongoDB	{"ReplicaSet": "staging/staging-mongodb"}
2024-04-18T15:16:42.303Z	DEBUG	controllers/replica_set_controller.go:132	Validating MongoDB.Spec	{"ReplicaSet": "staging/staging-mongodb"}
@laiminhtrung1997

laiminhtrung1997 commented Apr 19, 2024

  1. You need to configure the users field in the mongodbcommunity custom resource.
  2. You need to set readinessProbe.initialDelaySeconds to 10 on the mongod container.

@prachiwaghulkar
Author

@laiminhtrung1997 Unfortunately, the readinessProbe still fails and the pod goes into CrashLoopBackOff. I have set readinessProbe.initialDelaySeconds to 10 on the mongod container. The users field was already configured in the mongodbcommunity CR.

  Normal   Created         6m40s (x3 over 6m54s)  kubelet            Created container mongod
  Warning  Unhealthy       6m40s (x2 over 6m40s)  kubelet            Readiness probe failed:
  Normal   Started         6m39s (x3 over 6m54s)  kubelet            Started container mongod
  Warning  BackOff         111s (x25 over 6m51s)  kubelet            Back-off restarting failed container

@laiminhtrung1997

Dear @prachiwaghulkar
Could you please provide the manifest of your mdbc cr?

@prachiwaghulkar
Author

@laiminhtrung1997 Please find below the mdbc CR manifest.

apiVersion: v1
items:
- apiVersion: mongodbcommunity.mongodb.com/v1
  kind: MongoDBCommunity
  metadata:
    name: staging-mongodb
    namespace: staging
  spec:
    additionalMongodConfig:
      net.maxIncomingConnections: 900
    featureCompatibilityVersion: "5.0"
    members: 1
    security:
      authentication:
        ignoreUnknownUsers: true
        modes:
        - SCRAM
      tls:
        caConfigMapRef:
          name: staging-mongodb-cert-ca-cm
        certificateKeySecretRef:
          name: staging-mongodb-cert
        enabled: true
    statefulSet:
      spec:
        template:
          spec:
            containers:
            - image: docker-na-public.artifactory.swg-devops.com/sec-guardium-next-gen-docker-local/mongo:5.0.26
              name: mongod
              readinessProbe:
                initialDelaySeconds: 10
              resources:
                limits:
                  cpu: "4"
                  ephemeral-storage: 5Gi
                  memory: 10Gi
                requests:
                  cpu: "1"
                  ephemeral-storage: 1Gi
                  memory: 2Gi
            imagePullSecrets:
            - name: ibm-entitlement-key
            initContainers:
            - name: mongodb-agent-readinessprobe
              resources:
                limits:
                  cpu: 100m
                  memory: 500Mi
                requests:
                  cpu: 6m
                  memory: 6Mi
            - name: mongod-posthook
              resources:
                limits:
                  cpu: 100m
                  memory: 500Mi
                requests:
                  cpu: 6m
                  memory: 6Mi
        volumeClaimTemplates:
        - apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: data-volume
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
            storageClassName: rook-ceph-block
            volumeMode: Filesystem
        - apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: logs-volume
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
            storageClassName: rook-ceph-block
            volumeMode: Filesystem
    type: ReplicaSet
    users:
    - db: admin
      name: root
      passwordSecretRef:
        key: mongodbRootPassword
        name: ibm-mongodb-authsecret
      roles:
      - db: admin
        name: clusterAdmin
      - db: admin
        name: userAdminAnyDatabase
      - db: admin
        name: readWriteAnyDatabase
      scramCredentialsSecretName: root-scram2
    - db: tnt_mbr_meta
      name: metadata
      passwordSecretRef:
        key: mongodbMetadataPassword
        name: ibm-mongodb-authsecret
      roles:
      - db: tnt_mbr_meta
        name: dbOwner
      scramCredentialsSecretName: metadata-scram2
    version: 5.0.26

@laiminhtrung1997

Please also provide the log of the mongodb-agent container in the mongodb-0 pod.

@prachiwaghulkar
Author

The log of mongodb-agent:

prachiwaghulkar@Prachis-MacBook-Pro cert-request % oc logs pod/staging-mongodb-0 -c mongodb-agent 
cat: /mongodb-automation/agent-api-key/agentApiKey: No such file or directory
[2024-04-19T05:31:54.604+0000] [.debug] [util/distros/distros.go:LinuxFlavorAndVersionUncached:144] Detected linux flavor ubuntu version 20.4

@laiminhtrung1997

Hmmmm. My mdbc does not configure TLS, and MongoDB started without any errors. I have no idea. Sorry that I cannot help you.

@prachiwaghulkar
Author

prachiwaghulkar commented Apr 22, 2024

@irajdeep Could anybody from the community take a look and assist here? It is important for us to move to 5.0.26.

@nammn
Collaborator

nammn commented Apr 22, 2024

@prachiwaghulkar can you please provide the agent logs and health logs as described here?
https://github.com/mongodb/mongodb-kubernetes-operator/blob/master/.github/ISSUE_TEMPLATE/bug_report.md

That said, the error "exec /bin/sh: exec format error" looks like an architecture mismatch. Are you running an arm image on an amd node, or an amd image on an arm node? I suggest switching to an image that matches the node architecture and testing again.
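
One way to check the architecture question directly is to look at the ELF header of the /bin/sh binary copied out of the image. The following is a minimal Python sketch, not part of the operator or of any MongoDB tooling; the helper name `elf_machine` is hypothetical, and the lookup table covers only a few `e_machine` values from the ELF specification.

```python
import struct

# A few e_machine values from the ELF specification.
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86-64", 0xB7: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF binary")
    # e_ident[5] (EI_DATA): 1 means little-endian, 2 means big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at byte offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (0x{machine:x})")
```

Assuming the binary has been copied to a local file named `sh` (a hypothetical name), `elf_machine(open("sh", "rb").read(20))` returns the architecture to compare against the node's. A `ValueError` would mean the file is not an ELF binary at all, which is worth investigating in its own right.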

@prachiwaghulkar
Author

prachiwaghulkar commented Apr 22, 2024

@nammn I have used the following image: sha256:0172fb2a286d3dc9823f0e377587c0a545022bd330c817ed6b8bc231ea0643ad, which is linux/amd64. We are updating from 5.0.24 to 5.0.26. The 5.0.24 amd64 image worked fine for us.

Please find the logs below:

Agent logs:

(venv) prachiwaghulkar@Prachis-MacBook-Pro ~ % kubectl exec -it staging-mongodb-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/automation-agent.log        
[2024-04-22T06:23:30.847+0000] [header.info] [::0]        GitCommitId = 956e3386ad456471db1776d79637a38f182a6088
[2024-04-22T06:23:30.847+0000] [header.info] [::0]  AutomationVersion = 107.0.0.8465
[2024-04-22T06:23:30.847+0000] [header.info] [::0]          localhost = staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local
[2024-04-22T06:23:30.847+0000] [header.info] [::0] ErrorStateSleepTime = 10s
[2024-04-22T06:23:30.847+0000] [header.info] [::0] GoalStateSleepTime = 10s
[2024-04-22T06:23:30.847+0000] [header.info] [::0] NotGoalStateSleepTime = 1s
[2024-04-22T06:23:30.847+0000] [header.info] [::0]     PlanCutoffTime = 300000
[2024-04-22T06:23:30.847+0000] [header.info] [::0]       TracePlanner = false
[2024-04-22T06:23:30.847+0000] [header.info] [::0]               User = 2000
[2024-04-22T06:23:30.847+0000] [header.info] [::0]         Go version = go1.20.10
[2024-04-22T06:23:30.847+0000] [header.info] [::0]         MmsBaseURL = 
[2024-04-22T06:23:30.847+0000] [header.info] [::0]         MmsGroupId = 
[2024-04-22T06:23:30.847+0000] [header.info] [::0]          HttpProxy = 
[2024-04-22T06:23:30.847+0000] [header.info] [::0] DisableHttpKeepAlive = false
[2024-04-22T06:23:30.847+0000] [header.info] [::0]        HttpsCAFile = 
[2024-04-22T06:23:30.847+0000] [header.info] [::0] TlsRequireValidMMSServerCertificates = true
[2024-04-22T06:23:30.847+0000] [header.info] [::0] TlsMMSServerClientCertificate = 
[2024-04-22T06:23:30.847+0000] [header.info] [::0] KMIPProxyCertificateDir = /tmp
[2024-04-22T06:23:30.847+0000] [header.info] [::0] EnableLocalConfigurationServer = false
[2024-04-22T06:23:30.847+0000] [header.info] [::0] DialTimeoutSeconds = 40
[2024-04-22T06:23:30.847+0000] [header.info] [::0] KeepUnusedMongodbVersions = false
[2024-04-22T06:23:30.847+0000] [header.info] [::0] DisallowDowngrades = false
[2024-04-22T06:23:30.846+0000] [.error] [src/action/start.go:func1:145] [103] <staging-mongodb-0> [06:23:30.846] Error sleeping until process was up : <staging-mongodb-0> [06:23:30.846] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T06:23:30.846+0000] [.error] [src/director/director.go:executePlan:988] <staging-mongodb-0> [06:23:30.846] Failed to apply action. Result = <nil> : <staging-mongodb-0> [06:23:30.846] Error sleeping until process was up : <staging-mongodb-0> [06:23:30.846] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T06:23:30.846+0000] [.error] [src/director/director.go:planAndExecute:585] <staging-mongodb-0> [06:23:30.846] Plan execution failed on step StartFresh as part of move Start : <staging-mongodb-0> [06:23:30.846] Failed to apply action. Result = <nil> : <staging-mongodb-0> [06:23:30.846] Error sleeping until process was up : <staging-mongodb-0> [06:23:30.846] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T06:23:30.846+0000] [.error] [src/director/director.go:mainLoop:394] <staging-mongodb-0> [06:23:30.846] Failed to planAndExecute : <staging-mongodb-0> [06:23:30.846] Plan execution failed on step StartFresh as part of move Start : <staging-mongodb-0> [06:23:30.846] Failed to apply action. Result = <nil> : <staging-mongodb-0> [06:23:30.846] Error sleeping until process was up : <staging-mongodb-0> [06:23:30.846] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T06:50:56.215+0000] [header.info] [::0]        GitCommitId = 956e3386ad456471db1776d79637a38f182a6088
[2024-04-22T06:50:56.215+0000] [header.info] [::0]  AutomationVersion = 107.0.0.8465
[2024-04-22T06:50:56.215+0000] [header.info] [::0]          localhost = staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local
[2024-04-22T06:50:56.215+0000] [header.info] [::0] ErrorStateSleepTime = 10s
[2024-04-22T06:50:56.215+0000] [header.info] [::0] GoalStateSleepTime = 10s
[2024-04-22T06:50:56.215+0000] [header.info] [::0] NotGoalStateSleepTime = 1s
[2024-04-22T06:50:56.215+0000] [header.info] [::0]     PlanCutoffTime = 300000
[2024-04-22T06:50:56.215+0000] [header.info] [::0]       TracePlanner = false
[2024-04-22T06:50:56.215+0000] [header.info] [::0]               User = 2000
[2024-04-22T06:50:56.215+0000] [header.info] [::0]         Go version = go1.20.10
[2024-04-22T06:50:56.215+0000] [header.info] [::0]         MmsBaseURL = 
[2024-04-22T06:50:56.215+0000] [header.info] [::0]         MmsGroupId = 
[2024-04-22T06:50:56.215+0000] [header.info] [::0]          HttpProxy = 
[2024-04-22T06:50:56.215+0000] [header.info] [::0] DisableHttpKeepAlive = false
[2024-04-22T06:50:56.215+0000] [header.info] [::0]        HttpsCAFile = 
[2024-04-22T06:50:56.215+0000] [header.info] [::0] TlsRequireValidMMSServerCertificates = true
[2024-04-22T06:50:56.215+0000] [header.info] [::0] TlsMMSServerClientCertificate = 
[2024-04-22T06:50:56.215+0000] [header.info] [::0] KMIPProxyCertificateDir = /tmp
[2024-04-22T06:50:56.215+0000] [header.info] [::0] EnableLocalConfigurationServer = false
[2024-04-22T06:50:56.215+0000] [header.info] [::0] DialTimeoutSeconds = 40
[2024-04-22T06:50:56.215+0000] [header.info] [::0] KeepUnusedMongodbVersions = false
[2024-04-22T06:50:56.215+0000] [header.info] [::0] DisallowDowngrades = false
[2024-04-22T06:50:56.253+0000] [.error] [main/components/agent.go:ApplyClusterConfig:358] [06:50:56.253] Log path absent for process=state.ProcessConfigName=staging-mongodb-0ProcessType=mongodVersion=5.0.26FullVersion={"trueName":"5.0.26","gitVersion":"","modules":[],"major":5,"minor":0,"patch":26}Disabled=falseManualMode=falseNumCores=0CpuAffinity=[]LogRotate={"sizeThresholdMB":0,"timeThresholdHrs":0,"numUncompressed":0,"numTotal":0,"percentOfDiskspace":0,"includeAuditLogsWithMongoDBLogs":false}AuditLogRotate=<nil>LastResync="0001-01-01T00:00:00Z"LastThirdPartyRestoreResync="0001-01-01T00:00:00Z"LastCompact="0001-01-01T00:00:00Z"LastKmipMasterKeyRotation="0001-01-01T00:00:00Z"LastRestart="0001-01-01T00:00:00Z"Hostname=staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.localAlias=Cluster=AuthSchemaVersion=5FeatureCompatibilityVersion=5.0Kerberos=<nil>Args={"net":{"bindIp":"0.0.0.0","maxIncomingConnections":900,"port":27017,"tls":{"CAFile":"/var/lib/tls/ca/4971db0032afff31ab1235e283ef9ab7c9a4a483d630427923d253a41152cf13.pem","allowConnectionsWithoutCertificates":true,"certificateKeyFile":"/var/lib/tls/server/f131542c6e26217a9f960431d5177cd904c1a5661fd08482f4a194e836baa228.pem","mode":"requireTLS"}},"replication":{"replSetName":"staging-mongodb"},"setParameter":{"authenticationMechanisms":"SCRAM-SHA-256"},"storage":{"dbPath":"/data"}}ProcessAuthInfo={"UsersWanted":[{"user":"root","db":"admin","authenticationRestrictions":[],"scramSha1Creds":{"iterationCount":10000,"salt":"ztQzG8EXxgOT8qSloG6LfA==","storedKey":"+p4exhjiiYZoOIHahzi414ZINBs=","serverKey":"iV4qydBHQksSjyzTXDidhvn/9iY="},"scramSha256Creds":{"iterationCount":15000,"salt":"CcGExKTDjHefywe7CtG1VdfnqA9clT12VRz6MA==","storedKey":"JLVuWDSmdtJNNXRNVKf6Jw7MsofcbJP9G0N03N66Yb0=","serverKey":"d8R15D/XS9YVXwwDb6NjHBMCoYIrIxeUYU7PAK8tw7k="},"roles":[{"role":"clusterAdmin","db":"admin","minFcv":""},{"role":"readWriteAnyDatabase","db":"admin","minFcv":""},{"role":"userAdminAnyDatabase","db":"admin","minFcv":""}],
"inheritedRoles":null,"mechanisms":[],"scope":null},{"user":"metadata","db":"tnt_mbr_meta","authenticationRestrictions":[],"scramSha1Creds":{"iterationCount":10000,"salt":"iTSe6nHUP2rYsv8XvRgnnA==","storedKey":"sh4Q4pq/+EnduxDhyLEaY6bix3Y=","serverKey":"sL7I88TKpiWOJcD1X2MJHxBIIAg="},"scramSha256Creds":{"iterationCount":15000,"salt":"1zfBNBYr0OXWlPpMdZsoark+HcMfxoX0MltBpQ==","storedKey":"uBZBpVzBawhgY1wp8p52UlTzAtpkOc3UEgKC7JGPwbU=","serverKey":"EacvAm/pNKMyUobWrb0aL8+Og3BJ/W174YVhLMn8SWU="},"roles":[{"role":"dbOwner","db":"tnt_mbr_meta","minFcv":""}],"inheritedRoles":null,"mechanisms":[],"scope":null}],"UsersDeleted":null,"Roles":null,"DesiredKey":"[ZpnXOsjiRNY-REDACTED-G04AKp5vLX0]","DesiredNewKey":"[ZpnXOsjiRNY-REDACTED-G04AKp5vLX0]","DesiredKeyHash":"KU8dVQoozhHdkGTMBh4UjQbqYTRFiyc9/juP3AbNnho=","DesiredNewKeyHash":null,"KeyfileHashes":["KU8dVQoozhHdkGTMBh4UjQbqYTRFiyc9/juP3AbNnho="],"UsingAuth":true}IsConfigServer=falseIsShardServer=falseIsInReplSet=trueIsStandalone=falseIsArbiter=falseDownloadBase=FullySyncRsTags=falseReplicaSetId=staging-mongodbBackupRestoreUrl=<redacted>, 
BackupRestoreUrlV3=BackupParallelRestoreUrl=BackupParallelRestoreNumChunks=0BackupParallelRestoreNumWorkers=0BackupThirdPartyRestoreBaseUrl=BackupRestoreRsVersion=0BackupRestoreElectionTerm=0BackupRestoreCheckpointTimestamp=<nil>BackupRestoreCertificateValidationHostname=BackupRestoreSystemUsersUUID=BackupRestoreSystemRolesUUID=BackupRestoreBalancerSettings=nullBackupRestoreConfigSettingsUUID=BackupShardIdRestoreMaps=[]DirectAttachVerificationKey=DirectAttachSourceClusterName=DirectAttachShouldFilterByFileList=falseConfigPath=StorageEngine=BackupRestoreOplogBaseUrl=BackupRestoreOplog=<nil>BackupRestoreDesiredTime=<nil>BackupRestoreSourceRsId=BackupRestoreFilterList=<nil>BackupRestoreFilteredFileListUrl=BackupRestoreJobId=BackupRestoreVerificationKey=BackupRestoreSourceGroupId=PitRestoreType=BackupThirdPartyOplogStoreType=EncryptionProviderType=KMIPProxyPort=0KMIPProxyDisabled=falseTemporaryPort=0CredentialsVersion=0Repair=nullRealtimeConfig=<nil>DataExplorerConfig=<nil>DefaultRWConcern=<nil>LdapCaPath=ConfigServers=[]RestartIntervalTimeMs=<nil>ClusterWideConfiguration=ProfilingConfig=<nil>RegionBaseUrl=RegionBaseRealtimeUrl=RegionBaseAgentUrl=StepDownPrimaryForResync=falsekey=<nil>keyLock=null. log destination=
[2024-04-22T06:50:56.254+0000] [.error] [src/main/cm.go:mainLoop:520] [06:50:56.254] Error applying desired cluster configs : [06:50:56.253] Log path absent for process=state.ProcessConfigName=staging-mongodb-0ProcessType=mongodVersion=5.0.26FullVersion={"trueName":"5.0.26","gitVersion":"","modules":[],"major":5,"minor":0,"patch":26}Disabled=falseManualMode=falseNumCores=0CpuAffinity=[]LogRotate={"sizeThresholdMB":0,"timeThresholdHrs":0,"numUncompressed":0,"numTotal":0,"percentOfDiskspace":0,"includeAuditLogsWithMongoDBLogs":false}AuditLogRotate=<nil>LastResync="0001-01-01T00:00:00Z"LastThirdPartyRestoreResync="0001-01-01T00:00:00Z"LastCompact="0001-01-01T00:00:00Z"LastKmipMasterKeyRotation="0001-01-01T00:00:00Z"LastRestart="0001-01-01T00:00:00Z"Hostname=staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.localAlias=Cluster=AuthSchemaVersion=5FeatureCompatibilityVersion=5.0Kerberos=<nil>Args={"net":{"bindIp":"0.0.0.0","maxIncomingConnections":900,"port":27017,"tls":{"CAFile":"/var/lib/tls/ca/4971db0032afff31ab1235e283ef9ab7c9a4a483d630427923d253a41152cf13.pem","allowConnectionsWithoutCertificates":true,"certificateKeyFile":"/var/lib/tls/server/f131542c6e26217a9f960431d5177cd904c1a5661fd08482f4a194e836baa228.pem","mode":"requireTLS"}},"replication":{"replSetName":"staging-mongodb"},"setParameter":{"authenticationMechanisms":"SCRAM-SHA-256"},"storage":{"dbPath":"/data"}}ProcessAuthInfo={"UsersWanted":[{"user":"root","db":"admin","authenticationRestrictions":[],"scramSha1Creds":{"iterationCount":10000,"salt":"ztQzG8EXxgOT8qSloG6LfA==","storedKey":"+p4exhjiiYZoOIHahzi414ZINBs=","serverKey":"iV4qydBHQksSjyzTXDidhvn/9iY="},"scramSha256Creds":{"iterationCount":15000,"salt":"CcGExKTDjHefywe7CtG1VdfnqA9clT12VRz6MA==","storedKey":"JLVuWDSmdtJNNXRNVKf6Jw7MsofcbJP9G0N03N66Yb0=","serverKey":"d8R15D/XS9YVXwwDb6NjHBMCoYIrIxeUYU7PAK8tw7k="},"roles":[{"role":"clusterAdmin","db":"admin","minFcv":""},{"role":"readWriteAnyDatabase","db":"admin","minFcv":""},{"role":"userAdminAnyD
atabase","db":"admin","minFcv":""}],"inheritedRoles":null,"mechanisms":[],"scope":null},{"user":"metadata","db":"tnt_mbr_meta","authenticationRestrictions":[],"scramSha1Creds":{"iterationCount":10000,"salt":"iTSe6nHUP2rYsv8XvRgnnA==","storedKey":"sh4Q4pq/+EnduxDhyLEaY6bix3Y=","serverKey":"sL7I88TKpiWOJcD1X2MJHxBIIAg="},"scramSha256Creds":{"iterationCount":15000,"salt":"1zfBNBYr0OXWlPpMdZsoark+HcMfxoX0MltBpQ==","storedKey":"uBZBpVzBawhgY1wp8p52UlTzAtpkOc3UEgKC7JGPwbU=","serverKey":"EacvAm/pNKMyUobWrb0aL8+Og3BJ/W174YVhLMn8SWU="},"roles":[{"role":"dbOwner","db":"tnt_mbr_meta","minFcv":""}],"inheritedRoles":null,"mechanisms":[],"scope":null}],"UsersDeleted":null,"Roles":null,"DesiredKey":"[ZpnXOsjiRNY-REDACTED-G04AKp5vLX0]","DesiredNewKey":"[ZpnXOsjiRNY-REDACTED-G04AKp5vLX0]","DesiredKeyHash":"KU8dVQoozhHdkGTMBh4UjQbqYTRFiyc9/juP3AbNnho=","DesiredNewKeyHash":null,"KeyfileHashes":["KU8dVQoozhHdkGTMBh4UjQbqYTRFiyc9/juP3AbNnho="],"UsingAuth":true}IsConfigServer=falseIsShardServer=falseIsInReplSet=trueIsStandalone=falseIsArbiter=falseDownloadBase=FullySyncRsTags=falseReplicaSetId=staging-mongodbBackupRestoreUrl=<redacted>, 
BackupRestoreUrlV3=BackupParallelRestoreUrl=BackupParallelRestoreNumChunks=0BackupParallelRestoreNumWorkers=0BackupThirdPartyRestoreBaseUrl=BackupRestoreRsVersion=0BackupRestoreElectionTerm=0BackupRestoreCheckpointTimestamp=<nil>BackupRestoreCertificateValidationHostname=BackupRestoreSystemUsersUUID=BackupRestoreSystemRolesUUID=BackupRestoreBalancerSettings=nullBackupRestoreConfigSettingsUUID=BackupShardIdRestoreMaps=[]DirectAttachVerificationKey=DirectAttachSourceClusterName=DirectAttachShouldFilterByFileList=falseConfigPath=StorageEngine=BackupRestoreOplogBaseUrl=BackupRestoreOplog=<nil>BackupRestoreDesiredTime=<nil>BackupRestoreSourceRsId=BackupRestoreFilterList=<nil>BackupRestoreFilteredFileListUrl=BackupRestoreJobId=BackupRestoreVerificationKey=BackupRestoreSourceGroupId=PitRestoreType=BackupThirdPartyOplogStoreType=EncryptionProviderType=KMIPProxyPort=0KMIPProxyDisabled=falseTemporaryPort=0CredentialsVersion=0Repair=nullRealtimeConfig=<nil>DataExplorerConfig=<nil>DefaultRWConcern=<nil>LdapCaPath=ConfigServers=[]RestartIntervalTimeMs=<nil>ClusterWideConfiguration=ProfilingConfig=<nil>RegionBaseUrl=RegionBaseRealtimeUrl=RegionBaseAgentUrl=StepDownPrimaryForResync=falsekey=<nil>keyLock=null. log destination=
[2024-04-22T07:22:36.561+0000] [.error] [src/action/start.go:sleepUntilProcessUp:267] <staging-mongodb-0> [07:22:36.561] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:22:36.561+0000] [.error] [src/action/start.go:func1:145] [103] <staging-mongodb-0> [07:22:36.561] Error sleeping until process was up : <staging-mongodb-0> [07:22:36.561] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:22:36.561+0000] [.error] [src/director/director.go:executePlan:988] <staging-mongodb-0> [07:22:36.561] Failed to apply action. Result = <nil> : <staging-mongodb-0> [07:22:36.561] Error sleeping until process was up : <staging-mongodb-0> [07:22:36.561] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:22:36.561+0000] [.error] [src/director/director.go:planAndExecute:585] <staging-mongodb-0> [07:22:36.561] Plan execution failed on step StartFresh as part of move Start : <staging-mongodb-0> [07:22:36.561] Failed to apply action. Result = <nil> : <staging-mongodb-0> [07:22:36.561] Error sleeping until process was up : <staging-mongodb-0> [07:22:36.561] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:22:36.561+0000] [.error] [src/director/director.go:mainLoop:394] <staging-mongodb-0> [07:22:36.561] Failed to planAndExecute : <staging-mongodb-0> [07:22:36.561] Plan execution failed on step StartFresh as part of move Start : <staging-mongodb-0> [07:22:36.561] Failed to apply action. Result = <nil> : <staging-mongodb-0> [07:22:36.561] Error sleeping until process was up : <staging-mongodb-0> [07:22:36.561] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:54:17.873+0000] [.error] [src/action/start.go:sleepUntilProcessUp:267] <staging-mongodb-0> [07:54:17.873] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:54:17.873+0000] [.error] [src/action/start.go:func1:145] [103] <staging-mongodb-0> [07:54:17.873] Error sleeping until process was up : <staging-mongodb-0> [07:54:17.873] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:54:17.873+0000] [.error] [src/director/director.go:executePlan:988] <staging-mongodb-0> [07:54:17.873] Failed to apply action. Result = <nil> : <staging-mongodb-0> [07:54:17.873] Error sleeping until process was up : <staging-mongodb-0> [07:54:17.873] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:54:17.873+0000] [.error] [src/director/director.go:planAndExecute:585] <staging-mongodb-0> [07:54:17.873] Plan execution failed on step StartFresh as part of move Start : <staging-mongodb-0> [07:54:17.873] Failed to apply action. Result = <nil> : <staging-mongodb-0> [07:54:17.873] Error sleeping until process was up : <staging-mongodb-0> [07:54:17.873] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s
[2024-04-22T07:54:17.873+0000] [.error] [src/director/director.go:mainLoop:394] <staging-mongodb-0> [07:54:17.873] Failed to planAndExecute : <staging-mongodb-0> [07:54:17.873] Plan execution failed on step StartFresh as part of move Start : <staging-mongodb-0> [07:54:17.873] Failed to apply action. Result = <nil> : <staging-mongodb-0> [07:54:17.873] Error sleeping until process was up : <staging-mongodb-0> [07:54:17.873] Process staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017 (local=false) has not come up despite waiting for 30m0s

Health logs:

(venv) prachiwaghulkar@Prachis-MacBook-Pro ~ % kubectl exec -it staging-mongodb-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
{"statuses":{"staging-mongodb-0":{"IsInGoalState":false,"LastMongoUpTime":0,"ExpectedToBeUp":true,"ReplicationStatus":-1}},"mmsStatus":{"staging-mongodb-0":{"name":"staging-mongodb-0","lastGoalVersionAchieved":-1,"plans":[{"automationConfigVersion":1,"started":"2024-04-22T06:50:56.381946486Z","completed":null,"moves":[{"move":"Start","moveDoc":"Start the process","steps":[{"step":"StartFresh","stepDoc":"Start a mongo instance  (start fresh)","isWaitStep":false,"started":"2024-04-22T06:50:56.381976336Z","completed":null,"result":"error"}]},{"move":"WaitAllRsMembersUp","moveDoc":"Wait until all members of this process' repl set are up","steps":[{"step":"WaitAllRsMembersUp","stepDoc":"Wait until all members of this process' repl set are up","isWaitStep":true,"started":null,"completed":null,"result":""}]},{"move":"RsInit","moveDoc":"Initialize a replica set including the current MongoDB process","steps":[{"step":"RsInit","stepDoc":"Initialize a replica set","isWaitStep":false,"started":null,"completed":null,"result":""}]},{"move":"WaitFeatureCompatibilityVersionCorrect","moveDoc":"Wait for featureCompatibilityVersion to be right","steps":[{"step":"WaitFeatureCompatibilityVersionCorrect","stepDoc":"Wait for featureCompatibilityVersion to be right","isWaitStep":true,"started":null,"completed":null,"result":""}]}]}],"errorCode":0,"errorString":""}}}%  
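
The health status above is dense, so here is a minimal sketch for pulling out the plan steps whose result is "error". It assumes the flat step objects shown above and only needs grep and sed. The function name failed_steps is hypothetical; jq would be more robust where it is installed.

```shell
# failed_steps: read agent-health-status.json on stdin and print the name of
# every plan step whose "result" is "error".
# Assumption: step objects are flat (no nested braces), as in the output above.
failed_steps() {
  grep -oE '"step":"[^"]*"[^{}]*"result":"error"' |
    sed -E 's/^"step":"([^"]*)".*/\1/'
}

# Example with a trimmed-down copy of the structure shown above.
printf '%s\n' '{"steps":[{"step":"StartFresh","completed":null,"result":"error"},{"step":"RsInit","completed":null,"result":""}]}' |
  failed_steps
```

Piping the output of the kubectl exec command above into failed_steps should name StartFresh for this pod, which matches the "Plan execution failed on step StartFresh" lines in the agent log.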

@prachiwaghulkar
Author

prachiwaghulkar commented Apr 23, 2024

@nammn Were you able to check the issue?

FYI, these are the mongodb-agent and readinessprobe images that I am using.

      - image: mongodb/mongodb-agent
        mediaType: application/vnd.docker.distribution.manifest.v2
        digest: sha256:a208e80f79bb7fe954d9a9a1444bb482dee2e86e5e5ae89dbf240395c4a158b3
        tag: 107.0.0.8465-1
        platform:
          architecture: amd64
          os: linux
        registries:
          - host: quay.io
      - image: mongodb/mongodb-kubernetes-operator-version-upgrade-post-start-hook
        mediaType: application/vnd.docker.distribution.manifest.v2
        digest: sha256:08495e1331a1691878e449d971129ed17858a20a7b69bb74d2e84f057cfcc098
        tag: 1.0.8
        platform:
          architecture: amd64
          os: linux
        registries:
          - host: quay.io
      - image: mongodb/mongodb-kubernetes-operator
        mediaType: application/vnd.docker.distribution.manifest.v2
        digest: sha256:0aa26010be99caaf8a7dfd9cba81e326261ed99a69ac68b54aa8af3a104970bc
        tag: 0.9.0
        platform:
          architecture: amd64
          os: linux
        registries:
          - host: quay.io
      - image: mongodb/mongodb-kubernetes-readinessprobe
        mediaType: application/vnd.docker.distribution.manifest.v2
        digest: sha256:e84438c5394be7223de27478eb9066204d62e6ecd233d3d4e4c11d3da486a7b5
        tag: 1.0.17
        platform:
          architecture: amd64
          os: linux
        registries:
          - host: quay.io

@prachiwaghulkar
Author

prachiwaghulkar commented Apr 24, 2024

@irajdeep @nammn Could you or somebody else take a look and assist here, please? MongoDB worked fine for us up to 5.0.24. I tested MongoDB 5.0.25 today and it failed with the same logs that I shared above. In short, we have been encountering this issue since release 5.0.25.

@sebt3

sebt3 commented May 13, 2024

Having the exact same issue here. Fresh new instance.

apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  creationTimestamp: "2024-05-13T09:11:50Z"
  generation: 1
  name: wildduck-wildduck-mongo
  namespace: solidite-mail
  resourceVersion: "192255239"
  uid: 7cfab3a7-30ac-434a-b65d-31b638229bde
spec:
  additionalMongodConfig:
    storage.wiredTiger.engineConfig.cacheSizeGB: 1
  members: 1
  security:
    authentication:
      ignoreUnknownUsers: true
      modes:
      - SCRAM
  statefulSet:
    spec:
      template:
        metadata:
          annotations:
            k8up.io/backupcommand: sh -c 'mongodump --username=$MONGODB_USER --password=$MONGODB_PASSWORD
              mongodb://localhost/$MONGODB_NAME --archive'
            k8up.io/file-extension: .archive
        spec:
          containers:
          - env:
            - name: MONGODB_NAME
              value: wildduck
            - name: MONGODB_USER
              value: wildduck
            - name: MONGODB_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: password
                  name: wildduck-wildduck-mongo
            imagePullPolicy: IfNotPresent
            name: mongod
            resources:
              limits:
                cpu: "1"
                memory: 1100M
              requests:
                cpu: "0.3"
                memory: 400M
  type: ReplicaSet
  users:
  - db: wildduck
    name: wildduck
    passwordSecretRef:
      name: wildduck-wildduck-mongo
    roles:
    - db: wildduck
      name: readWrite
    scramCredentialsSecretName: wildduck-wildduck-mongo-scram
  version: 6.0.13
status:
  currentMongoDBMembers: 0
  currentStatefulSetReplicas: 0
  message: ReplicaSet is not yet ready, retrying in 10 seconds
  mongoUri: ""
  phase: Pending

Describing the pod shows the following errors:

  Warning  Unhealthy  21m (x3 over 21m)  kubelet            Readiness probe failed: panic: open /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json: no such file or directory

goroutine 1 [running]:
main.main()
           /workspace/cmd/readiness/main.go:226 +0x191
  Warning  BackOff    11m (x46 over 21m)   kubelet  Back-off restarting failed container mongod in pod wildduck-wildduck-mongo-0_solidite-mail(ba2270f0-ecf0-4468-b01f-b7a5df538b4b)
  Warning  Unhealthy  76s (x223 over 21m)  kubelet  Readiness probe failed:

The pod logs contain nothing relevant.

@nammn
Collaborator

nammn commented Jun 13, 2024

@prachiwaghulkar can you verify that the MongoDB image you are using is indeed compatible and working? Looking at the agent log, the agent seems to wait forever, and mongod and the related service are not up and running.

Can you somehow get a debug container running and try to access that service?

staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local:27017
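
A minimal sketch of such a check, assuming the debug container ships bash and the coreutils timeout command (neither is guaranteed in every image). The function name port_open is hypothetical. It only tests whether a TCP connection to the host and port can be opened; it does not authenticate or speak the MongoDB protocol.

```shell
# port_open HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Assumption: bash (for its /dev/tcp redirection) and "timeout" are available.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage, from a pod in the same cluster:
#   port_open staging-mongodb-0.staging-mongodb-svc.staging.svc.cluster.local 27017 \
#     && echo reachable || echo unreachable
```

A failure here only says that nothing accepted the connection within the timeout. That is consistent with mongod not having started, but it could also point to DNS or network policy problems.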

@saksham1gupta

saksham1gupta commented Jun 26, 2024

Facing the same issue: the MongoDB instance's readiness probe is failing for the mongodb-agent container.
MongoDB Community operator version: community-operator-0.9.0
OpenShift version: 4.14.25

apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb-devops-test
  namespace: di-devops
spec:
  additionalConnectionStringConfig:
    readPreference: primary
  additionalMongodConfig:
    storage.wiredTiger.engineConfig.journalCompressor: zlib
  members: 3
  security:
    authentication:
      ignoreUnknownUsers: true
      modes:
        - SCRAM
  statefulSet:
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/name: mongodb
      template:
        metadata:
          labels:
            app.kubernetes.io/name: mongodb
        spec:
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - podAffinityTerm:
                    labelSelector:
                      matchExpressions:
                        - key: app.kubernetes.io/name
                          operator: In
                          values:
                            - mongodb
                    topologyKey: kubernetes.io/hostname
                  weight: 100
          containers:
            - name: mongod
              resources:
                limits:
                  cpu: '0.2'
                  memory: 250M
                requests:
                  cpu: '0.2'
                  memory: 200M
            - name: mongodb-agent
              readinessProbe:
                failureThreshold: 40
                initialDelaySeconds: 5
                timeout: 30
              resources:
                limits:
                  cpu: '0.2'
                  memory: 250M
                requests:
                  cpu: '0.2'
                  memory: 200M
          initContainers:
            - name: mongodb-agent-readinessprobe
              resources:
                limits:
                  cpu: '2'
                  memory: 200M
                requests:
                  cpu: '1'
                  memory: 100M
  type: ReplicaSet
  users:
    - additionalConnectionStringConfig:
        readPreference: secondary
      db: didevops
      name: didevops
      passwordSecretRef:
        name: my-user-password
      roles:
        - db: didevops
          name: clusterAdmin
        - db: didevops
          name: userAdminAnyDatabase
        - db: didveops
          name: readWriteAnyDatabase
      scramCredentialsSecretName: my-scram
  version: 6.0.5
status:
  currentMongoDBMembers: 3
  currentStatefulSetReplicas: 3
  message: 'ReplicaSet is not yet ready, retrying in 10 seconds'
  mongoUri: 'mongodb://mongodb-devops-test-0.mongodb-devops-test-svc.di-devops.svc.cluster.local:27017,mongodb-devops-test-1.mongodb-devops-test-svc.di-devops.svc.cluster.local:27017,mongodb-devops-test-2.mongodb-devops-test-svc.di-devops.svc.cluster.local:27017/?replicaSet=mongodb-devops-test&readPreference=primary'
  phase: Pending
  version: 6.0.5

@veebkolm

Ensure that your node exposes a CPU model with AVX support, which Mongo requires. In my case I had neither exposed the CPU flag nor used host CPU model passthrough, so Mongo failed to start.

@saksham1gupta

Ensure that your node exposes a CPU model with AVX support, which Mongo requires. In my case I had neither exposed the CPU flag nor used host CPU model passthrough, so Mongo failed to start.

How can I check from inside an OpenShift pod that the node exposes the correct CPU model? Is there any documentation or a command that shows whether AVX is supported?

@shubham-cmyk

Any update on this? I am facing the same issue.

@veebkolm

veebkolm commented Sep 9, 2024

In my case we had to pass the host CPU model through from Proxmox. Cloud providers should already pass the correct model.
lscpu | grep avx will show you whether your CPU supports AVX or not.
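
Where lscpu is not installed (for example inside a slim container), the same information can be read from /proc/cpuinfo. A container shares the node's kernel, so it sees the same CPU flags as the node. Below is a minimal sketch with a hypothetical helper name, has_avx, assuming an x86 Linux system whose cpuinfo has a "flags" line.

```shell
# has_avx: read cpuinfo-style text on stdin; succeed if the first "flags" line
# lists the plain "avx" flag (avx2 or avx512f alone do not count).
has_avx() {
  grep -m1 -E '^flags[[:space:]]*:' | grep -qw avx
}

# Report for the current machine when /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
  if has_avx < /proc/cpuinfo; then
    echo "AVX: yes"
  else
    echo "AVX: no"
  fi
fi
```

The word match (-w) matters because a plain grep for avx would also match avx2 and avx512 entries.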

@CloudFocused

For those on this thread, @veebkolm was absolutely right: it was a CPU flag for me. I am running on Proxmox. To fix this, I shelled into the host and passed the CPU features through to the Kubernetes nodes.

TO RESOLVE (For me):
cd /etc/pve/qemu-server/
nano .conf
edit the cpu line to be cpu:host
save
restart the VM.
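
The edit step above can also be scripted. This is a rough sketch only, not a Proxmox tool: the function name with_cpu_host is hypothetical, it assumes the VM config uses "key: value" lines, and it assumes the file has no snapshot sections. It reads a config on stdin and writes the modified copy to stdout, so nothing is changed in place.

```shell
# with_cpu_host: copy a VM config from stdin to stdout, replacing the first
# "cpu:" line with "cpu: host", or appending that line if none exists.
# Assumption: a plain "key: value" config without snapshot sections.
with_cpu_host() {
  awk '
    /^cpu:/ && !seen { print "cpu: host"; seen = 1; next }
    { print }
    END { if (!seen) print "cpu: host" }
  '
}

# Example on an inline sample config.
printf 'cores: 4\ncpu: kvm64\nmemory: 8192\n' | with_cpu_host
```

Review the output before writing it back over the real config file, and keep a backup of the original.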

Now all is right in the mongo world.

Contributor

github-actions bot commented Dec 7, 2024

This issue is being marked stale because it has been open for 60 days with no activity. Please comment if this issue is still affecting you. If there is no change, this issue will be closed in 30 days.

@github-actions github-actions bot added the stale label Dec 7, 2024