Helm deployment: add probe detection to prevent cluster splitting caused by rolling updates #268

Open
kuaile-zc opened this issue Nov 24, 2021 · 3 comments

Comments

@kuaile-zc
Contributor

kuaile-zc commented Nov 24, 2021

Problem description:
When we deployed Nacos on K8s with the Helm chart, we found that rolling updates can cause a number of problems, such as cluster splitting.
We analyzed the cause and found the following.
Because the rolling update is not gated by any probe, there is a short period during the update in which all the pods are killed.
We added probes and tested the fix successfully in a K8s environment, so we would like to ask whether we can submit a PR to help everyone in the community avoid this problem.
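
For illustration, probes of the kind described above could look like the following fragment of the Nacos container spec. This is a minimal sketch and not the content of the PR: the health paths, port 8848, the container name, and all timing values are assumptions that would need to be adapted to the actual chart.

```yaml
# Sketch only: container name, paths, port, and thresholds are assumed values,
# not taken from the PR.
containers:
  - name: nacos
    startupProbe:
      httpGet:
        path: /nacos/v1/console/health/readiness  # assumed health path
        port: 8848                                 # assumed Nacos server port
      periodSeconds: 5
      failureThreshold: 60   # allows up to 5 s x 60 = 300 s for startup
    livenessProbe:
      httpGet:
        path: /nacos/v1/console/health/liveness   # assumed health path
        port: 8848
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3    # restart after 3 consecutive failures
```

With a startupProbe in place, a pod should not be reported as ready until the probe has succeeded, so the rolling update has a signal to wait on before it moves to the next pod.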

See:

  1. Snapshots taken while the cluster status is abnormal may cause Service metadata and Instance data synchronization problems.
  2. PR: Add startupProbe and livenessProbe #269
  3. We would like to discuss whether Nacos could add a cluster health check endpoint, in order to increase its self-healing ability.

@kuaile-zc
Contributor Author

Related issues:
#214

@kuaile-zc
Contributor Author

To see whether a new interface that reports if the current Pod is in a normal cluster state would help, we tried one and pointed a livenessProbe at it. This solved JRaft's inability to heal when all instances were killed.
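
As a rough sketch of that idea, the livenessProbe could target such an interface as shown below. The path `/nacos/v1/cluster-health` is hypothetical and only stands in for the proposed interface, which is assumed to return an error status when the node is not in a normal cluster state. The port and timing values are assumptions as well.

```yaml
# Sketch only: /nacos/v1/cluster-health is a hypothetical endpoint standing in
# for the proposed cluster health check; it is not an existing Nacos API.
livenessProbe:
  httpGet:
    path: /nacos/v1/cluster-health
    port: 8848               # assumed Nacos server port
  initialDelaySeconds: 30    # assumed grace period before the first check
  periodSeconds: 10
  failureThreshold: 6        # restart after 6 consecutive failures (about 60 s)
```

An HTTP probe counts any status from 200 up to (but not including) 400 as success, so the interface would only need to return something like 503 while the cluster state is abnormal for the kubelet to restart the container.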

@kuaile-zc
Contributor Author

@paderlol
