Cluster is considered unhealthy if some nodes are unreachable #48

Closed
slavaboiko opened this issue Dec 21, 2016 · 6 comments

Comments

@slavaboiko
Contributor

The ClusterHealth check is considered unhealthy if some nodes are unreachable even if the configured consistencyLevel can be satisfied.

Our app is healthy and able to serve clients, but cqlmigrate prevents it from starting if the cluster is considered unhealthy.

@jsravn
Contributor

jsravn commented Dec 21, 2016

@v-Boiko It should only do that check if there are new migrations. This is necessary since schema updates in Cassandra generally require all nodes to be healthy to prevent data loss (we've had issues in the past with an old node disagreeing on schema). There has already been some discussion around this, see #35 and #37.

@jsravn
Contributor

jsravn commented Dec 21, 2016

I'm okay with adding an option to ignore health on schema migrate, but it is very much do-at-your-own-risk type of thing since it can lead to a few catastrophic situations. Better to have the whole cluster up if possible when doing schema changes - although I realise that is probably not viable for large clusters (>100 nodes).
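
To make the behaviour discussed above concrete, here is a minimal sketch of a health gate that only runs when there are new migrations and that can be bypassed with an opt-out flag. It is an illustration under stated assumptions, not cqlmigrate's actual code: `HealthGateSketch`, `ClusterUnhealthyException`, `shouldApplyMigrations` and the `ignoreClusterHealth` flag are all hypothetical names, and node status is modelled as a plain map rather than anything read from a real driver.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class HealthGateSketch {

    /** Hypothetical exception; not a cqlmigrate type. */
    static class ClusterUnhealthyException extends RuntimeException {
        ClusterUnhealthyException(String message) {
            super(message);
        }
    }

    /** Addresses of the nodes reported as down, sorted for a stable message. */
    static List<String> unreachableNodes(Map<String, Boolean> nodeIsUp) {
        return nodeIsUp.entrySet().stream()
                .filter(entry -> !entry.getValue())
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    /**
     * Returns false when there is nothing to migrate, true when migrations may run.
     * Throws if migrations are pending, a node is down, and the (hypothetical)
     * ignoreClusterHealth opt-out is not set.
     */
    static boolean shouldApplyMigrations(List<String> pendingMigrations,
                                         Map<String, Boolean> nodeIsUp,
                                         boolean ignoreClusterHealth) {
        if (pendingMigrations.isEmpty()) {
            // No schema change will happen, so node health does not matter.
            return false;
        }
        List<String> down = unreachableNodes(nodeIsUp);
        if (!ignoreClusterHealth && !down.isEmpty()) {
            throw new ClusterUnhealthyException(
                    "Refusing schema change, unreachable nodes: " + down);
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Boolean> nodes = Map.of("10.0.0.1", true, "10.0.0.2", false);
        // Nothing pending: the health check is skipped entirely.
        System.out.println(shouldApplyMigrations(List.of(), nodes, false));
        // Pending migration with the opt-out set: proceeds at the caller's own risk.
        System.out.println(shouldApplyMigrations(List.of("001_add_table.cql"), nodes, true));
    }
}
```

With a pending migration, a node down and the flag left at `false`, the same call throws instead of returning, which corresponds to the startup failure reported in this issue.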

@jsravn
Contributor

jsravn commented Dec 21, 2016

#42 is also related - cqlmigrate will erroneously treat dead nodes as unhealthy.

@slavaboiko
Contributor Author

Probably the approach the library is taking right now is valid. We checked and we didn't have the required migrations in the schema_updates table, so the library actually was trying to do something.
Probably the best thing is just to sort out the cluster problems first.

Do you think we still need to acquire the lock then if the cluster is unhealthy?

@jsravn
Contributor

jsravn commented Dec 21, 2016

It isn't necessary to acquire the lock if the cluster is unhealthy, but the same thing is accomplished by ensuring we unlock if an exception is thrown.
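
The "unlock if an exception is thrown" idea can be sketched with a plain `try`/`finally`. This is only an illustration: `MigrationLock` is a hypothetical in-memory stand-in, not cqlmigrate's lock, and `runWithLock` is an assumed helper name.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class LockReleaseSketch {

    /** Hypothetical in-memory stand-in for a migration lock. */
    static class MigrationLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void acquire() {
            if (!held.compareAndSet(false, true)) {
                throw new IllegalStateException("lock already held");
            }
        }

        void release() {
            held.set(false);
        }

        boolean isHeld() {
            return held.get();
        }
    }

    /** Runs the work under the lock and always releases it, even on failure. */
    static void runWithLock(MigrationLock lock, Runnable work) {
        lock.acquire();
        try {
            work.run();
        } finally {
            // Executes on both normal return and exception, so the lock never leaks.
            lock.release();
        }
    }

    public static void main(String[] args) {
        MigrationLock lock = new MigrationLock();
        try {
            runWithLock(lock, () -> {
                // Simulates the health check failing after the lock was taken.
                throw new IllegalStateException("cluster is unhealthy");
            });
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
        System.out.println("lock held afterwards: " + lock.isHeld());
    }
}
```

Taking the lock before the health check costs one extra acquire and release on an unhealthy cluster, but it keeps the release logic in a single place.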

@jsravn
Contributor

jsravn commented Feb 15, 2018

I believe this is fixed in #53.

@jsravn jsravn closed this as completed Feb 15, 2018