Updates to EC disaster recovery + related KOTS snapshots docs updates #2916

Open
wants to merge 20 commits into
base: main

Contributor

@paigecalvert paigecalvert commented Dec 20, 2024

  • Updated the Embedded Cluster DR topic to explain the new steps to add the Backup and Restore resources: https://deploy-preview-2916--replicated-docs.netlify.app/vendor/embedded-disaster-recovery

  • Updated the table in the Velero Backup resource topic for Snapshots to better show how the fields are nested (per Ethan's request): https://deploy-preview-2916--replicated-docs.netlify.app/reference/custom-resource-backup

  • Also reorganized the Snapshots docs so that all the vendor and end user info for snapshots is grouped together in a single section under KOTS:

    Screenshot 2025-01-06 at 11 12 28 AM

    ^ As part of this reorg, I also did the following:
    * Made sure that the snapshots topics all included "Snapshots" in their titles to avoid confusion with the Embedded Cluster disaster recovery stuff if you are searching the docs
    * Condensed the vendor snapshots overview and enterprise user snapshots overview topics into a single topic (we no longer need two separate topics for this now that it's all grouped in the same sidebar section)

@replicated-ci replicated-ci added type::docs Improvements or additions to documentation type::feature labels Dec 20, 2024

netlify bot commented Dec 20, 2024

Deploy Preview for replicated-docs ready!

Name Link
🔨 Latest commit e4ae8c7
🔍 Latest deploy log https://app.netlify.com/sites/replicated-docs/deploys/677c20ac675f310008d2d9b0
😎 Deploy Preview https://deploy-preview-2916--replicated-docs.netlify.app


netlify bot commented Dec 20, 2024

Deploy Preview for replicated-docs-upgrade ready!

Name Link
🔨 Latest commit e4ae8c7
🔍 Latest deploy log https://app.netlify.com/sites/replicated-docs-upgrade/deploys/677c20ac6350ba0008b5db1e
😎 Deploy Preview https://deploy-preview-2916--replicated-docs-upgrade.netlify.app

@paigecalvert paigecalvert marked this pull request as ready for review January 2, 2025 15:42
@paigecalvert paigecalvert requested a review from a team as a code owner January 2, 2025 15:42
@@ -1,6 +1,6 @@
import NodeAgentMemLimit from "../partials/snapshots/_node-agent-mem-limit.mdx"

-# Troubleshooting Backup and Restore
+# Troubleshooting Snapshots
Contributor Author

^ Updated various topic titles so that it's easier to see at a glance which ones apply to snapshots and which ones apply to EC DR

@@ -98,74 +93,76 @@ The following Velero fields are supported for full backups, as shown in the prev
<td>(Optional) Specifies the actions to perform at different times during a backup. The only supported hook is executing a command in a container in a pod (uses the pod exec API). Supports <code>pre</code> and <code>post</code> hooks.</td>
</tr>
<tr>
-<td><code>resources</code></td>
+<td><code>hooks.resources</code></td>
Contributor Author

^ updated the table per Ethan's feedback that it was hard to see how the given fields are nested
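
To make the nesting concrete, here is a minimal sketch of where `hooks.resources` sits in a Velero Backup resource. The hook name, container name, and commands are hypothetical placeholders that are not taken from this PR, and the exact set of fields that KOTS supports should be checked against the table itself.

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup
spec:
  hooks:
    resources:
      # Hypothetical hook name, for illustration only
      - name: my-app-hook
        includedNamespaces:
          - '*'
        # Runs in the named container before the Pod is backed up
        pre:
          - exec:
              container: my-app
              command: ["/bin/sh", "-c", "echo pre-backup"]
        # Runs in the named container after the Pod is backed up
        post:
          - exec:
              container: my-app
              command: ["/bin/sh", "-c", "echo post-backup"]
```

Read top to bottom, the paths are `spec.hooks.resources[].pre[].exec` and `spec.hooks.resources[].post[].exec`, which is the nesting that the table rows name with dotted prefixes.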


1. You must specify which Pod volumes you want backed up. This is done with the `backup.velero.io/backup-volumes` annotation. For more information, see [File System Backup](https://velero.io/docs/v1.14/file-system-backup/) in the Velero documentation.
1. In a new release containing your application files, add a Velero Backup resource. In the Backup resource, use namespace-based or label-based selection to indicate the application resources that you want to be included in the backup. For more information, see [Backup API Type](https://velero.io/docs/latest/api-types/backup/) in the Velero documentation.
Contributor Author

^ clarify that both namespaces and labels can be used
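
As a rough illustration of the two selection styles named in the step above, a Backup resource might look like the following sketch. The `app: my-app` label is a hypothetical placeholder, and whether a given application needs one style, the other, or both is an assumption to verify against the Velero Backup API docs.

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup
spec:
  # Namespace-based selection: '*' includes every namespace
  includedNamespaces:
    - '*'
  # Label-based selection: only resources that carry this
  # (hypothetical) label are included
  labelSelector:
    matchLabels:
      app: my-app
```

When both fields are set, Velero applies them together, so only labeled resources in the included namespaces are backed up.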

# the name of the Backup resource that you added
backupName: backup
includedNamespaces:
- '*'
Contributor Author

@paigecalvert paigecalvert Jan 3, 2025

^ I also pulled this Restore resource from what Ethan had provided in the story.

Wasn't sure if the includedNamespaces field could be removed (according to the velero docs: "If unspecified, all namespaces are included.")
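
For context, the fragment in this hunk would sit under `spec` in a complete Restore resource. A minimal sketch, assuming the standard `velero.io/v1` API version and a hypothetical `metadata.name`:

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  # Hypothetical name, for illustration only
  name: restore
spec:
  # the name of the Backup resource that you added
  backupName: backup
  # Per the Velero docs quoted above, all namespaces are included
  # when this field is unspecified, so '*' is the explicit form
  includedNamespaces:
    - '*'
```

Keeping `includedNamespaces` explicit documents the intent, but the open question in the comment above still needs an answer from engineering.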

@@ -96,11 +124,17 @@ To enable disaster recovery for a customer:

When your customer installs with Embedded Cluster, Velero will be deployed if the **Allow Disaster Recovery** license field is enabled.

-## Configure Backup Storage and Take Backups in the Admin Console
+## Take Backups and Restore
Contributor Author

^ added this new h2 to nest the procedures for setting up backup storage and restoring from a backup. Felt nice to break up the topic into separate subsections that amount to: "configuring and enabling the feature" and "using the feature"

@@ -328,16 +328,6 @@ const sidebars = {
'vendor/packaging-air-gap-excluding-minio',
],
},
{
Contributor Author

Edited the snapshots info in the sidebar so that both the vendor info and the end user info are in the same section, which makes it all easier to find


* The disaster recovery feature flag must be enabled for your account. To get access to disaster recovery, reach out to Alex Parker at [[email protected]](mailto:[email protected]).
-* Embedded Cluster version 1.4.1 or later
+* Embedded Cluster version **X.X.X** or later
Contributor Author

^ is there a new version that we should put here now that this topic talks about the new "add a Restore resource" method?


* If the `--admin-console-port` flag was used during install to change the port for the Admin Console, note that during a restore the Admin Console port will be used from the backup and cannot be changed. For more information, see [Embedded Cluster Install Command Options](/reference/embedded-cluster-install).

-## Configure Disaster Recovery for Your Application
+## Configure Disaster Recovery
Contributor Author

^ split this section into 2 subsections: configure the Velero resources and then enable DR for customers

@@ -34,44 +36,68 @@ Embedded Cluster disaster recovery has the following limitations and known issue

[View a larger version of this image](/images/ec-version-command.png)

* You can only restore from the most recent backup.
* Any Helm extensions included in the `extensions` field of the Embedded Cluster Config are _not_ included in backups. Helm extensions are reinstalled as part of the restore process. To include Helm extensions in backups, configure the Velero Backup resource to include the extensions using namespace-based or label-based selection. For more information, see [Configure the Velero Custom Resources](#config-velero-resources) below.
Contributor Author

@paigecalvert paigecalvert Jan 6, 2025

^ add limitation that extensions aren't included

Not sure if we want to include this part:

To include Helm extensions in backups, configure the Velero Backup resource to include the extensions using namespace-based or label-based selection.

orLabelSelectors:
- matchExpressions:
# Exclude Replicated resources from the backup
- { key: kots.io/kotsadm, operator: NotIn, values: ["true"] }
Contributor Author

^ this is the example that Ethan had provided in the related eng story
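
For readers who want to see the selector fragment in place, here is a sketch of a complete Backup resource built around it. It assumes namespace-based selection of the `kotsadm` namespace; the `metadata.name` value mirrors the `backupName` used elsewhere in this PR, and the rest is the standard Velero `velero.io/v1` shape.

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup
spec:
  # Namespace-based selection of the namespace that the
  # application shares with Replicated components
  includedNamespaces:
    - kotsadm
  orLabelSelectors:
    - matchExpressions:
        # Exclude Replicated resources from the backup
        - { key: kots.io/kotsadm, operator: NotIn, values: ["true"] }
```

`NotIn` matches resources whose `kots.io/kotsadm` label is anything other than `"true"`, including resources that do not have the label at all, so unlabeled application resources are still selected.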

1. You must specify which Pod volumes you want backed up. This is done with the `backup.velero.io/backup-volumes` annotation. For more information, see [File System Backup](https://velero.io/docs/v1.14/file-system-backup/) in the Velero documentation.
:::important
If you use namespace-based selection to include all of your application resources deployed in the `kotsadm` namespace, ensure that you exclude the Replicated resources that are also deployed in the `kotsadm` namespace. Because the Embedded Cluster infrastructure components are always included in backups automatically, this avoids duplication.
:::
Contributor Author

^ give people a heads-up that they need to exclude Replicated resources if they use namespace-based selection.

I put "duplication" as the reasoning here, but there might be a better way to explain it
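
Returning to the `backup.velero.io/backup-volumes` annotation from the step in this hunk, a sketch of how it might be applied is shown here. Every name in it (the Deployment, label, image, volume, and claim) is a hypothetical placeholder; only the annotation key comes from the docs text.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Comma-separated names of the Pod volumes to back up
        backup.velero.io/backup-volumes: data
    spec:
      containers:
        - name: my-app
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        # The volume name must match the annotation value
        - name: data
          persistentVolumeClaim:
            claimName: my-app-data
```

The annotation belongs on the Pod template rather than on the Deployment's own metadata, because Velero reads it from the Pod.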

Disaster recovery for Embedded Cluster installations is implemented with Velero. For more information about Velero, see the [Velero](https://velero.io/docs/v1.14/) documentation.
The backups that your customers take from the Admin Console will include both the Embedded Cluster infrastructure and the application resources that you specify.

The Embedded Cluster infrastructure that is backed up includes components such as the KOTS Admin Console and the built-in registry that is deployed for air gap installations. No configuration is required to include Embedded Cluster infrastructure in backups. Vendors specify the application resources to include in backups by configuring a Velero Backup resource in the application release.
Contributor Author

^ updated the overview to explain what's included in the backups


# About Backing Up and Restoring with Snapshots
# About Backup and Restore with Snapshots
Contributor Author

@paigecalvert paigecalvert Jan 6, 2025

^ note: previously, we had a vendor snapshots overview and an enterprise user snapshot overview as two different topics. I grouped that content together under this single overview topic:

docs/enterprise/snapshots-understanding.mdx + docs/vendor/snapshots-overview.md → docs/vendor/snapshots-overview.mdx

Everything here was copied and pasted from that existing content


[[redirects]]
from="https://docs.replicated.com/enterprise/snapshots-understanding"
to="https://docs.replicated.com/vendor/snapshots-overview"
Contributor Author

^ redirect the deleted topic

@paigecalvert paigecalvert changed the title Updates to EC disaster recovery Updates to EC disaster recovery + related KOTS snapshots docs updates Jan 6, 2025
Labels
type::docs Improvements or additions to documentation type::feature
2 participants