Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pacemaker: remove node on delete (SOC-11240) #376

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sandonovsuse
Copy link

On node delete, crowbar doesn't execute pacemaker barclamp due to
default pacemaker proposal not having transitions attribute enabled
for delete. This change adds said feature and the delete transition.

stonith_mode = proposal["attributes"]["pacemaker"]["stonith"]["mode"]

# remove node from stonith cfg
[stonith_mode, "per_node"].each do |key|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need it for current mode AND "per_node"? I have SBD STONITH in my env and I don't have "per_node" under "stonith".
Even if we decide to keep it, it should be guarded in case the key doesn't exist.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I had in it my env, didn't think it might be optional. Thanks

@sandonovsuse sandonovsuse force-pushed the pacemaker-remove-node-on-delete-bsc-1170479 branch from 421e108 to e726e5c Compare August 27, 2020 14:57
@sandonovsuse sandonovsuse force-pushed the pacemaker-remove-node-on-delete-bsc-1170479 branch from e726e5c to 08b5f16 Compare August 27, 2020 15:30
@sandonovsuse sandonovsuse force-pushed the pacemaker-remove-node-on-delete-bsc-1170479 branch from 08b5f16 to 25d4627 Compare August 27, 2020 15:35
skazi0
skazi0 previously approved these changes Aug 27, 2020
Copy link
Member

@skazi0 skazi0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good but I'm not 100% sure if it's safe to commit from inside the transition.

@jgrassler
Copy link

Thanks for pointing out the "Forget" operation. It's behavior does not appear to be affected though: with this patch in place, all occurences of the forgotten node are purged from the Pacemaker proposal upon saving it post-forget. But the same thing happens without this patch in place. So where does this come into play/what is the expected change in behavior?

@skazi0
Copy link
Member

skazi0 commented Oct 28, 2020

@jgrassler the forgotten nodes are removed from the list by UI/CLI and this would be persisted if pacemaker proposal was saved post-forget. The problem here is that if you just "forget" and leave the cloud running, at some point the periodic chef-client runs will start to fail because there is no new founder elected. This change (together with the crowbar-core one) is supposed to make this "save & commit" step trigger automatically after "forget".

@skazi0
Copy link
Member

skazi0 commented Oct 29, 2020

@sandonovsuse I think I know why it didn't work for @jgrassler. I also had similar results in my tests. The problem is that you changed proposal schema but didn't add a migration for this. You probably tested with new proposal (after adding your changes) but it would not work for existing proposals.

This should make it usable for existing proposals:

diff --git a/chef/data_bags/crowbar/migrate/pacemaker/302_add_delete_transition.rb b/chef/data_bags/crowbar/migrate/pacemaker/302_add_delete_transition.rb
new file mode 100644
index 0000000..f2f0a4d
--- /dev/null
+++ b/chef/data_bags/crowbar/migrate/pacemaker/302_add_delete_transition.rb
@@ -0,0 +1,11 @@
+def upgrade(template_attrs, template_deployment, attrs, deployment)
+  deployment["config"]["transition_list"] = template_deployment["config"]["transition_list"]
+  deployment["config"]["transitions"] = template_deployment["config"]["transitions"]
+  return attrs, deployment
+end
+
+def downgrade(template_attrs, template_deployment, attrs, deployment)
+  deployment["config"]["transition_list"] = template_deployment["config"]["transition_list"]
+  deployment["config"]["transitions"] = template_deployment["config"]["transitions"]
+  return attrs, deployment
+end
diff --git a/chef/data_bags/crowbar/template-pacemaker.json b/chef/data_bags/crowbar/template-pacemaker.json
index 668f102..ca06cbb 100644
--- a/chef/data_bags/crowbar/template-pacemaker.json
+++ b/chef/data_bags/crowbar/template-pacemaker.json
@@ -65,7 +65,7 @@
     "pacemaker": {
       "crowbar-revision": 0,
       "crowbar-applied": false,
-      "schema-revision": 301,
+      "schema-revision": 302,
       "element_states": {
         "pacemaker-cluster-member"    : [ "readying", "ready", "applying" ],
         "hawk-server"   

On node delete, crowbar doesn't execute pacemaker barclamp due to
default pacemaker proposal not having transitions attribute enabled
for delete. Next, pacemaker needs to remove the node from the cluster
prior to being deleted from crowbar. This change adds said feature.
Migartion also added.
@sandonovsuse sandonovsuse force-pushed the pacemaker-remove-node-on-delete-bsc-1170479 branch from 25d4627 to 6e1a375 Compare October 29, 2020 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants