gke-cluster TF task creates a cluster with an empty node pool #8

Open
nikkatalnikov opened this issue Sep 19, 2021 · 5 comments

Comments

@nikkatalnikov

nikkatalnikov commented Sep 19, 2021

Hi there,

When trying to create a GKE cluster, I see the following log at the end:

google_container_cluster.gke-cluster: Creation complete after 6m10s [id=projects/***/locations/europe-west1-b/clusters/chainlink]

However, the cluster nodes are automatically deleted after the seemingly successful creation.
The final result is like this:

[Screenshot: 2021-09-19 at 17 13 52]

After that I got:

│ Error: namespaces "chainlink" not found
│ 
│   with kubernetes_secret.password-credentials,
│   on chainlink-node.tf line 85, in resource "kubernetes_secret" "password-credentials":
│   85: resource "kubernetes_secret" "password-credentials" {
│ 
╵
╷
│ Error: Failed to create deployment: namespaces "chainlink" not found
│ 
│   with kubernetes_deployment.chainlink-node,
│   on chainlink-node.tf line 99, in resource "kubernetes_deployment" "chainlink-node":
│   99: resource "kubernetes_deployment" "chainlink-node" {
│ 

I tried both master and feature/tf-upgrade versions.

Could it be an IAM issue?

Thank you!

@nikkatalnikov
Author

The tooltip in the UI says:

The number of nodes is estimated by the number of Compute VM instances because the Kubernetes control plane did not respond, possibly due to a pending upgrade or missing IAM permissions.

The number of nodes in a cluster should match the number of Compute VM instances, except for:
A temporary skew during resize or upgrade
Uncommon configurations in which nodes or instances were manipulated directly with Kubernetes and/or Compute APIs

@nikkatalnikov
Author

OK, it seems that adding a few appropriate depends_on entries fixes the problem.
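
For readers hitting the same "namespaces "chainlink" not found" error, here is a minimal sketch of the kind of depends_on wiring this refers to. It is not the exact change made in the repository. The resource names kubernetes_namespace.chainlink and google_container_node_pool.main-nodes, and the secret's metadata name, are assumptions. Only google_container_cluster.gke-cluster, kubernetes_secret.password-credentials and the "chainlink" namespace appear in the logs above. It is a fragment to merge into an existing configuration, not a standalone file.

```hcl
# Sketch under assumptions: resource names marked "assumed" are hypothetical.
resource "kubernetes_namespace" "chainlink" {
  metadata {
    name = "chainlink"
  }

  # Do not call the Kubernetes API until the cluster and its nodes exist.
  depends_on = [
    google_container_cluster.gke-cluster,
    google_container_node_pool.main-nodes, # assumed node pool resource name
  ]
}

resource "kubernetes_secret" "password-credentials" {
  metadata {
    name = "password-credentials" # assumed secret name
    # Referencing the namespace resource already creates an implicit dependency.
    namespace = kubernetes_namespace.chainlink.metadata[0].name
  }

  # Secret contents omitted in this sketch.
  data = {}

  # Explicit ordering: create the namespace before anything placed inside it.
  depends_on = [kubernetes_namespace.chainlink]
}
```

The same depends_on entry would apply to kubernetes_deployment.chainlink-node, which failed with the same missing-namespace error.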

With the Chainlink image v0.9.10, the pods fail to start:

Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  20m                 default-scheduler  0/3 nodes are available: 3 node(s) had taint {node.kubernetes.io/network-unavailable: }, that the pod didn't tolerate.
  Normal   Scheduled         20m                 default-scheduler  Successfully assigned chainlink/chainlink-75dd5b6bdf-g8l87 to gke-chainlink-main-nodes-964c9c9f-smjb
  Warning  FailedMount       20m                 kubelet            MountVolume.SetUp failed for volume "api-volume" : failed to sync secret cache: timed out waiting for the condition
  Normal   Pulling           20m                 kubelet            Pulling image "smartcontract/chainlink:0.9.10"
  Normal   Pulled            20m                 kubelet            Successfully pulled image "smartcontract/chainlink:0.9.10" in 17.493417828s
  Normal   Created           18m (x5 over 20m)   kubelet            Created container chainlink-node
  Normal   Started           18m (x5 over 20m)   kubelet            Started container chainlink-node

@nikkatalnikov
Author

I also see error logs in the pods:

2021-09-20T00:07:16Z [FATAL] Unable to initialize ORM: pq: syntax error at or near "INCLUDE"
error running migrations
github.com/smartcontractkit/chainlink/core/store/migrations.MigrateTo
	/chainlink/core/store/migrations/migrate.go:549
github.com/smartcontractkit/chainlink/core/store/migrations.Migrate
	/chainlink/core/store/migrations/migrate.go:523
github.com/smartcontractkit/chainlink/core/store.initializeORM.func1
	/chainlink/core/store/store.go:223
github.com/smartcontractkit/chainlink/core/store/orm.(*ORM).RawDB
	/chainlink/core/store/orm/orm.go:1443

@nikkatalnikov
Author

OK, with Postgres 13.3, Chainlink 0.9.10 works.
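
Some context on why the Postgres version matters: the INCLUDE clause of CREATE INDEX was added in PostgreSQL 11, so an older server rejects the migration with the "syntax error at or near "INCLUDE"" error shown above. A minimal sketch of pinning the database image follows, assuming Postgres runs as a kubernetes_deployment inside the cluster. The resource name "postgres", the labels and the port are hypothetical, and credentials, environment variables and storage are omitted.

```hcl
# Sketch only: how this repository actually deploys Postgres is an assumption.
resource "kubernetes_deployment" "postgres" {
  metadata {
    name      = "postgres" # hypothetical name
    namespace = "chainlink"
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "postgres" }
    }

    template {
      metadata {
        labels = { app = "postgres" }
      }

      spec {
        container {
          name = "postgres"
          # Pin a release that supports INCLUDE in CREATE INDEX (PostgreSQL 11+).
          image = "postgres:13.3"

          port {
            container_port = 5432
          }
        }
      }
    }
  }
}
```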

I won't make a PR as it looks to be redundant to #7 - most things are the same.

@Pega88 feel free to close this issue once that PR is merged.

@blackramit

These are exactly the issues I ran into, Nik. I have put in a commit to Niels for an update. Also, I wasn't able to get anything above 0.9.10 to work, so that appears to be the newest Chainlink release that can be used with this method.

If you and Niels are interested, I ran into this really well done deployment using Helm from Leo Vigna at Vulcan. I'm just learning to work with K8s and I'm using Chainlink as my app platform on GKE. Really cool stuff that folks are doing, and I can't thank people like Niels and Leo enough for sharing their knowledge!
