i have a question about nodes #2

Open
jalpino-talo opened this issue Aug 27, 2022 · 0 comments

Hello, and first of all, thank you very much for the code; it is an excellent project. I followed the README to deploy EKS, and in principle everything works: I can connect to the cluster and list the nodes. However, the nodes show a NotReady status and the pods do not start.

From the logs I have been reading, the problem seems to be related to the VPC not connecting, possibly because of permissions. Here are the logs I have:

I0827 04:29:02.250306 9 node_lifecycle_controller.go:868] Node ip-10-0-82-117.us-east-2.compute.internal is NotReady as of 2022-08-27 04:29:02.250300267 +0000 UTC m=+440.443833299. Adding it to the Taint queue.

W0827 04:26:05.630675 10 authorization.go:193] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.

I0827 04:26:05.636204 10 aws.go:1297] Zone not specified in configuration file; querying AWS metadata service

E0827 04:26:05.678379 10 leaderelection.go:330] error retrieving resource lock kube-system/cloud-controller-manager: Get "https://172.16.48.18:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cloud-controller-manager?timeout=5s": dial tcp 172.16.48.18:443: connect: connection refused

I0827 04:33:18.463511 11 factory.go:209] "Unable to schedule pod; no fit; waiting" pod="kube-system/coredns-5948f55769-fd6lq" err="0/2 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate."

I0827 04:36:18.468766 11 factory.go:209] "Unable to schedule pod; no fit; waiting" pod="kube-system/coredns-5948f55769-fd6lq" err="0/2 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate."
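
The scheduler message above means that every node carries a taint for which the pod has no matching toleration. As a rough illustration of that matching rule, here is a simplified Python sketch (not the scheduler's actual code). The class and function names, the node names, the `NoSchedule` effect, and the example tolerations are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str = ""
    effect: str = "NoSchedule"  # assumed effect; the log line does not show it


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches any taint effect


def tolerates(tol, taint):
    """Return True if a single toleration matches a single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def schedulable_nodes(nodes, tolerations):
    """Names of nodes whose taints are all tolerated by the pod."""
    return [
        name
        for name, taints in nodes.items()
        if all(any(tolerates(tol, t) for tol in tolerations) for t in taints)
    ]


# Two hypothetical nodes, both tainted as not-ready, as in the log message.
not_ready = Taint(key="node.kubernetes.io/not-ready")
nodes = {"node-a": [not_ready], "node-b": [not_ready]}

# A pod whose only toleration targets an unrelated (hypothetical) key.
unrelated = [Toleration(key="example.com/other", operator="Exists")]
print(schedulable_nodes(nodes, unrelated))  # []

# A pod that explicitly tolerates the not-ready taint.
tolerant = [Toleration(key="node.kubernetes.io/not-ready", operator="Exists")]
print(schedulable_nodes(nodes, tolerant))  # ['node-a', 'node-b']
```

In this model the pod only becomes schedulable once the taint is removed, which normally happens when the node reports Ready, so the scheduling message is a symptom of the NotReady state rather than a separate problem.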

I0827 01:43:17.749524 11 factory.go:205] "Unable to schedule pod; no nodes are registered to the cluster; waiting" pod="kube-system/coredns-5db97b446d-p5ksx"

"Removed node in listed group from NodeTree" node="10.240.79.157" zone=""

E0827 01:19:12.931004 11 tagging_controller.go:221] Error in getting instanceID for node 10.240.79.157, error: Invalid format for AWS instance ()

W0827 01:19:11.701642 10 actual_state_of_world.go:539] Failed to update statusUpdateNeeded field in actual state of world: Failed to set statusUpdateNeeded to needed true, because nodeName="10.240.79.157" does not exist

For me, this is the most relevant error.

I0827 01:16:10.065973 10 node_lifecycle_controller.go:868] Node ip-10-0-113-236.us-east-2.compute.internal is NotReady as of 2022-08-27 01:16:10.065968407 +0000 UTC m=+30464.742746419. Adding it to the Taint queue.
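
For anyone triaging the excerpts above: the lines follow the standard klog header layout (severity letter, MMDD date, time, thread ID, `file:line]`, message). Below is a minimal Python sketch, not part of this project, that parses such lines and counts them by severity and source location. The names `parse_klog` and `summarize` are hypothetical helpers introduced only for illustration, and the sample lines are copied from this issue.

```python
import re
from collections import Counter

# klog header: severity letter, MMDD, hh:mm:ss.micros, thread id, file:line] message
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<src>[^\s:]+:\d+)\] (?P<msg>.*)$"
)

SEVERITY = {"I": "info", "W": "warning", "E": "error", "F": "fatal"}


def parse_klog(line):
    """Parse one klog-formatted line into a dict, or return None if it does not match."""
    m = KLOG_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["sev"] = SEVERITY[rec["sev"]]
    return rec


def summarize(lines):
    """Count parsed lines by (severity, source file:line)."""
    counts = Counter()
    for line in lines:
        rec = parse_klog(line)
        if rec is not None:
            counts[(rec["sev"], rec["src"])] += 1
    return counts


sample = [
    "W0827 04:26:05.630675 10 authorization.go:193] No authorization-kubeconfig provided, "
    "so SubjectAccessReview of authorization tokens won't work.",
    "I0827 04:26:05.636204 10 aws.go:1297] Zone not specified in configuration file; "
    "querying AWS metadata service",
    "E0827 01:19:12.931004 11 tagging_controller.go:221] Error in getting instanceID "
    "for node 10.240.79.157, error: Invalid format for AWS instance ()",
]

# Print only the error-level entries with their source location.
for (sev, src), n in summarize(sample).items():
    if sev == "error":
        print(sev, src, n)
```

Filtering on error-level lines first (here, the `tagging_controller.go` and `leaderelection.go` entries) can help separate root causes from the informational scheduler messages that repeat while the nodes stay NotReady.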

I should also mention that I had to add the key pair resource and add it to the group myself, because the one referenced in the project fails with an error saying it does not exist. Perhaps that is related to the correct permissions not being granted.

I also have a script that enables the cluster logs in CloudWatch, which I can contribute in a PR.

I also added code to manage the cluster with an SSO user, which I can contribute in a PR as well.

Thanks,
