Update workload cluster restore instruction

jiayiwang7 committed Oct 24, 2023
1 parent 54b5a39 commit 4c8cc1f
Showing 1 changed file with 84 additions and 15 deletions.

@@ -36,7 +36,9 @@ If the cluster is no longer accessible in any means, or the infrastructure machi

1. Create a new management cluster to which you will be migrating your workload clusters later.

You can define a cluster config similar to your old management cluster, and run cluster creation of the new management cluster with the **exact same EKS Anywhere version** used to create the old management cluster.

If the original management cluster still exists with its old infrastructure running, you need to create the new management cluster with a **different cluster name** to avoid conflicts.

```sh
eksctl anywhere create cluster -f mgmt-new.yaml
```
@@ -48,9 +50,11 @@
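Before moving on, it can help to confirm the new management cluster is reachable. A minimal check, assuming the kubeconfig was written to the default `<cluster-name>/<cluster-name>-eks-a-cluster.kubeconfig` path used later in these steps:

```bash
# Assumes the default kubeconfig path produced by `eksctl anywhere create cluster`.
kubectl get nodes --kubeconfig mgmt-new/mgmt-new-eks-a-cluster.kubeconfig
```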


```bash
# If the new management cluster reuses the old cluster name, set MGMT_CLUSTER_OLD and MGMT_CLUSTER_NEW to that same name
MGMT_CLUSTER_OLD="mgmt-old"
MGMT_CLUSTER_NEW="mgmt-new"
MGMT_CLUSTER_NEW_KUBECONFIG=${MGMT_CLUSTER_NEW}/${MGMT_CLUSTER_NEW}-eks-a-cluster.kubeconfig
WORKLOAD_CLUSTER_1="w01"
WORKLOAD_CLUSTER_2="w02"
```
@@ -86,13 +90,13 @@
```bash
--to-kubeconfig ${MGMT_CLUSTER_NEW_KUBECONFIG}
```
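The fragment above is only the tail of the move command. For orientation, here is a hedged sketch of what a full `clusterctl move` invocation of this shape could look like; the `eksa-system` namespace and the old cluster's kubeconfig path are illustrative assumptions, so follow the collapsed steps above for the exact flags:

```bash
# Illustrative sketch only: move the Cluster API objects of a workload cluster
# from the old management cluster to the new one.
clusterctl move \
    --namespace eksa-system \
    --kubeconfig ${MGMT_CLUSTER_OLD}/${MGMT_CLUSTER_OLD}-eks-a-cluster.kubeconfig \
    --to-kubeconfig ${MGMT_CLUSTER_NEW_KUBECONFIG}
```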

1. (Optional) Update the cluster config file of the workload clusters if the new management cluster has a different cluster name than the original management cluster.

You can **skip this step** if the new management cluster has the same cluster name as the old management cluster.

Otherwise, update the cluster config file of each workload cluster to point to the new management cluster created in step 1. For example:

```yaml
# workload cluster w01
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
```
@@ -104,9 +108,9 @@
```yaml
...
```


```yaml
# workload cluster w02
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
```
@@ -120,7 +124,7 @@

Apart from the `managementCluster` field you updated above, make sure all the other fields in the workload cluster configs stay the same as the corresponding resources of the old workload clusters as they existed when the old management cluster failed.
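For a quick sanity check that only the management cluster reference changed, you can inspect the updated files; `w01.yaml` and `w02.yaml` are assumed file names here, so substitute your actual workload cluster config files:

```bash
# Each updated config should now reference the new management cluster.
grep -A1 'managementCluster:' w01.yaml w02.yaml
```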

Apply the updated cluster config of each workload cluster in the new management cluster.

```bash
MGMT_CLUSTER_NEW="mgmt-new"
```
@@ -161,16 +165,81 @@
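For the apply step above, the invocation typically looks like the following sketch; the config file names are assumptions carried over from the earlier examples:

```bash
# Apply each workload cluster's updated config against the new management cluster.
kubectl apply -f w01.yaml --kubeconfig ${MGMT_CLUSTER_NEW_KUBECONFIG}
kubectl apply -f w02.yaml --kubeconfig ${MGMT_CLUSTER_NEW_KUBECONFIG}
```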

As with a failed management cluster whose infrastructure components have not changed, follow the [External etcd backup and restore]({{< relref "../etcd-backup-restore/etcdbackup" >}}) to restore the workload cluster itself from the backup.
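The linked guide covers the EKS Anywhere-specific procedure for external etcd. Purely for orientation, the generic upstream etcd flow looks roughly like the sketch below; the snapshot path and data directory are assumptions, and this is not a substitute for the linked steps:

```bash
# Generic upstream etcd restore sketch, NOT the EKS Anywhere procedure.
# Restores a snapshot into a fresh data directory on an etcd member,
# which is then pointed at the restored directory and restarted.
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-snapshot-w01.db \
    --data-dir /var/lib/etcd-restored
```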

### Cluster not accessible or infrastructure components changed after etcd backup was taken

If the workload cluster is still accessible but the infrastructure machines have changed after the etcd backup was taken, you can still try restoring the cluster itself from the etcd backup. Doing so is risky, though: it can cause the node names, IPs, and other infrastructure configurations to revert to a state that is no longer valid. Restoring etcd effectively takes a cluster back in time, and all clients will experience a conflicting, parallel history. This can impact the behavior of watching components like the Kubernetes controller managers, the EKS Anywhere cluster controller manager, and the Cluster API controller managers. You may need to manually update the CAPI infrastructure objects, such as the infra VMs and machines, to use the existing or latest configurations in order to bring the workload cluster back to a healthy state.

If the original workload cluster is no longer accessible by any means, or restoring it from the outdated etcd backup does not bring it back to a healthy state, you need to create a new workload cluster under the same management cluster that managed the original workload cluster, and restore all your workload applications from the etcd backup of the original workload cluster to the new one. This way, the same management cluster keeps ownership of the new workload cluster, with all the same data as the old workload cluster. Below is an example of applying the etcd backup `etcd-snapshot-w01.db` of a failed workload cluster `w01` to a new workload cluster `w02`:
1. Create a new workload cluster to which you will be migrating your workloads and applications from the original failed workload cluster.

You can define a cluster config similar to your old workload cluster, with a different cluster name (if the old workload cluster still exists), and run cluster creation of the new workload cluster with the **exact same EKS Anywhere version** used to create the old workload cluster.

```bash
eksctl anywhere create cluster -f w02.yaml --kubeconfig $MGMT_CLUSTER_KUBECONFIG
```
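The checks in the following steps use `$WORKLOAD_CLUSTER_2_KUBECONFIG`. A reasonable definition, assuming the default kubeconfig layout shown earlier for the management cluster:

```bash
# Assumes the default <cluster-name>/<cluster-name>-eks-a-cluster.kubeconfig layout.
WORKLOAD_CLUSTER_2="w02"
WORKLOAD_CLUSTER_2_KUBECONFIG=${WORKLOAD_CLUSTER_2}/${WORKLOAD_CLUSTER_2}-eks-a-cluster.kubeconfig
```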

1. Follow the [External etcd backup and restore]({{< relref "../etcd-backup-restore/etcdbackup" >}}) to restore the old workload cluster's etcd backup `etcd-snapshot-w01.db` onto the new workload cluster `w02`. Use the restore process for your OS family:

* [BottleRocket]({{< relref "../etcd-backup-restore/bottlerocket-etcd-backup/#restore-etcd-from-backup" >}})
* [Ubuntu]({{< relref "../etcd-backup-restore/ubuntu-rhel-etcd-backup/#restore" >}})

You might notice that after restoring the original etcd backup to the new workload cluster `w02`, all the node names have the prefix `w01-*` and the nodes go into the `NotReady` state. This is because restoring etcd effectively applies the node data from the original cluster, which causes a conflicting history and can impact the behavior of watching components like the Kubelets and Kubernetes controller managers.

```bash
kubectl get nodes --kubeconfig $WORKLOAD_CLUSTER_2_KUBECONFIG
NAME                              STATUS     ROLES           AGE     VERSION
w01-bbtdd                         NotReady   control-plane   3d23h   v1.27.3-eks-6f07bbc
w01-md-0-66dbcfb56cxng8lc-8ppv5   NotReady   <none>          3d23h   v1.27.3-eks-6f07bbc
```
```bash
kubectl describe node w01-bbtdd --kubeconfig $WORKLOAD_CLUSTER_2_KUBECONFIG
Name:               w01-bbtdd
...
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Mon, 09 Oct 2023 21:55:58 +0000   Mon, 09 Oct 2023 22:34:40 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Mon, 09 Oct 2023 21:55:58 +0000   Mon, 09 Oct 2023 22:34:40 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Mon, 09 Oct 2023 21:55:58 +0000   Mon, 09 Oct 2023 22:34:40 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Mon, 09 Oct 2023 21:55:58 +0000   Mon, 09 Oct 2023 22:34:40 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
Events:
  Type    Reason          Age     From             Message
  ----    ------          ----    ----             -------
  Normal  RegisteredNode  9m32s   node-controller  Node w01-bbtdd event: Registered Node w01-bbtdd in Controller
  Normal  NodeNotReady    8m52s   node-controller  Node w01-bbtdd status is now: NodeNotReady
```
1. Restart the Kubelet on the control plane and worker nodes of the workload cluster `w02`.

In order to bring the nodes back to the `Ready` state, you need to manually restart the Kubelet on all the control plane and worker nodes. The Kubelet re-registers the node with the API server, which then updates etcd with the correct node data of the new workload cluster `w02`.
{{< tabpane >}}
{{< tab header="Ubuntu or RHEL" lang="bash" >}}
# SSH into the control plane and worker nodes. You must do this for each node.
ssh -i ${SSH_KEY} ${SSH_USERNAME}@<node IP>
sudo su
systemctl restart kubelet
{{< /tab >}}
{{< tab header="Bottlerocket" lang="bash" >}}
# SSH into the control plane and worker nodes. You must do this for each node.
ssh -i ${SSH_KEY} ${SSH_USERNAME}@<node IP>
apiclient exec admin bash
sheltie
systemctl restart kubelet
{{< /tab >}}
{{< /tabpane >}}
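After restarting, you can optionally confirm on each node that the Kubelet is active again before validating the cluster; this check is an illustrative addition, not part of the original procedure:

```bash
# Run on each node after the restart (on Bottlerocket, run it from the sheltie shell).
systemctl is-active kubelet
journalctl -u kubelet --since "5 minutes ago" --no-pager | tail -n 20
```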
1. Validate that the nodes are in `Ready` state.

```bash
kubectl get nodes --kubeconfig $WORKLOAD_CLUSTER_2_KUBECONFIG
NAME                              STATUS   ROLES    AGE     VERSION
w02-djshz                         Ready    <none>   9m7s    v1.27.3-eks-6f07bbc
w02-md-0-6bbc8dd6d4xbgcjh-wfmb6   Ready    <none>   3m55s   v1.27.3-eks-6f07bbc
```
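Once the nodes are `Ready`, a simple way to confirm that the restored applications came back with the etcd data is to list workloads across namespaces; this check is an illustrative addition:

```bash
# Application pods restored from the w01 backup should come up on the w02 nodes.
kubectl get pods -A --kubeconfig $WORKLOAD_CLUSTER_2_KUBECONFIG
```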
