From ef121befff2acbb968d51b8941da7dc5fae983f0 Mon Sep 17 00:00:00 2001
From: EnigmaCurry
Date: Tue, 15 Oct 2024 23:41:40 -0600
Subject: [PATCH] backup-volume

---
 books/portable-docker.org | 186 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 186 insertions(+)

diff --git a/books/portable-docker.org b/books/portable-docker.org
index c4303861a..932e4922a 100644
--- a/books/portable-docker.org
+++ b/books/portable-docker.org
@@ -5429,6 +5429,192 @@ index
 :EXPORT_HUGO_WEIGHT: 9002
 :END:
 
+Native systems are harder to back up than virtual systems; that's just
+the way it goes. With VMs, your hypervisor usually has a built-in
+backup feature which pauses the machine, makes a snapshot, resumes the
+machine, and continues backing up the snapshot in the background. If
+you have the luxury of running your server as a VM (e.g., [[https://www.proxmox.com/en/proxmox-virtual-environment/overview][Proxmox
+PVE]]), it makes backups so much easier.
+
+However, in this book we are dealing with a physical Raspberry Pi, not
+a virtual machine, so backups must be managed another way.
+
+*** Restic
+
+[[https://restic.net/][Restic]] is a modern open source backup tool, which has many great
+features including incremental backups, encryption, and offsite
+upload. It can maintain backups of huge sizes.
+
+You can install restic to back up any directory on a Linux host. One of
+the best ways to do that is with the script found in this blog post:
+
+[[https://blog.rymcg.tech/blog/linux/restic_backup/][Daily backups to S3 with Restic and systemd timers]]
+
+The only problem with backing up files this way is that with
+containers that are always running, you need to make sure that the
+files are flushed to disk before the backup starts, otherwise your
+backup could become corrupted. For most media storage that doesn't
+change often (photos, videos, etc.) this might not be such a big deal,
+but for database volumes it's a problem.
+
+*** Backup-Volume
+
+[[https://github.com/EnigmaCurry/backup-volume][Backup-Volume]] is another backup tool that is specifically designed
+to back up Docker volumes and upload archives to offsite storage
+(S3, SSH, Dropbox). This tool is much simpler than
+Restic, with the most important difference being that *Backup-Volume
+can only handle complete backups (no incremental storage)*. For small
+datasets, this is ideal, as each backup gets stored in a separate
+=backup-XXXX.tar.gz=, and it's easy to restore from one file. For
+larger datasets, this duplication would be prohibitively
+expensive/wasteful.
+
+Backup-Volume has a trick it can use in its favor: it can
+automatically stop containers before the backup runs and start them
+again afterward. This makes this style of backup much safer for
+write-intensive volumes (e.g., databases) and ensures that the data
+gets flushed before the backup starts.
+
+You will have to analyze your own situation and weigh the cost of
+data integrity vs. the cost of data duplication, to help decide which
+kind of backup to deploy.
+
+*** Set up Backup-Volume
+
+**** Prepare an S3 bucket offsite
+
+You may want to use your own [[/portable-docker/install-web-services/minio-s3/index.html][MinIO]] S3 service (preferably installed on
+a separate offsite server), or a third-party provider (AWS S3,
+DigitalOcean Spaces, Wasabi, etc.).
+
+You will need to provide the S3 bucket and credentials that the backup
+process will use when uploading archives:
+
+ * S3 endpoint domain, e.g., =s3.example.com=.
+ * S3 bucket name, e.g., =test=.
+ * S3 access key id, e.g., =test=.
+ * S3 secret key, e.g., =xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=.
+
+**** Configure Backup-Volume
+
+#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
+#+begin_run
+## Configures the default backup-volume instance:
+pi make backup-volume config
+#+end_run
+
+Select multiple existing volumes to back up together as one archive:
+
+#+begin_stdout
+? Select all the volumes to backup
+> [x] test1_data
+  [ ] forgejo_data
+  [x] icecast_config
+  [ ] icecast_logs
+  [ ] mosquitto_mosquitto
+  [ ] traefik_geoip_database
+v [ ] traefik_traefik
+#+end_stdout
+
+Choose the backup schedule in [[https://github.com/EnigmaCurry/d.rymcg.tech/blob/73648904e5a954e17077368c299a23a19947ab16/backup-volume/.env-dist#L23-L59][cron format]]:
+
+#+begin_stdout
+BACKUP_CRON_EXPRESSION: Enter the cron expression (eg. @daily)
+: @every 24h
+#+end_stdout
+
+#+attr_shortcode: :style tip :title Tip
+#+begin_notice
+Other example schedules:
+
+ * =@every 1h15m=
+ * =@daily=
+ * =@weekly=
+ * [[https://github.com/EnigmaCurry/d.rymcg.tech/blob/19ee7e0e5e39350b86ec6317c4e1d3765c806378/backup-volume/.env-dist#L23-L59][See more in the .env-dist file]]
+#+end_notice
+
+Choose the retention length (number of days) to keep backup archives
+before they are pruned automatically:
+
+#+begin_stdout
+BACKUP_RETENTION_DAYS: Rotate backups older than how many days? (eg. 30)
+: 30
+#+end_stdout
+
+You can choose any of the supported storage mechanisms. For demo
+purposes, choose S3:
+
+#+begin_stdout
+> Which remote storage do you want to use? s3
+
+BACKUP_AWS_ENDPOINT: Enter the S3 endpoint (e.g., s3.example.com)
+: s3.d.example.com
+BACKUP_AWS_S3_BUCKET_NAME: Enter the S3 bucket name (e.g., my-bucket)
+: backup-test-1
+BACKUP_AWS_ACCESS_KEY_ID: Enter the S3 access key id (e.g., my-access-key)
+: backup-test-1
+BACKUP_AWS_SECRET_ACCESS_KEY: Enter the S3 secret access key
+: OEuL3lMSdvdoFyVjEQTM4Trj/7VhHq7Q7cOFEpQPuxMHxsTVK3Hxne7st6Ty
+BACKUP_AWS_S3_PATH: Choose a directory inside the bucket (blank for root)
+:
+
+#+end_stdout
+
+You may optionally preserve an additional copy of the archive in a
+local volume:
+
+#+begin_stdout
+> Do you want to keep a local backup in addition to the remote one? No
+#+end_stdout
+
+**** Install
+
+#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
+#+begin_run
+## Installs the default backup instance:
+pi make backup-volume install
+#+end_run
+
+**** Instances
+
+All selected volumes are backed up to the same archive on the same
+schedule. To back up different volumes on different schedules, you
+should create more than one instance of Backup-Volume, each with its
+own separate config:
+
+#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
+#+begin_run
+## Creates a new backup instance named test:
+pi make backup-volume instance instance=test
+pi make backup-volume install instance=test
+#+end_run
+
+**** Verify backup schedule
+
+#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
+#+begin_run
+pi make backup-volume logs
+#+end_run
+
+#+begin_stdout
+backup-1 | 2024-10-16T02:37:00.263838944Z time=2024-10-16T02:37:00.262Z level=INFO msg="Successfully scheduled backup from environment with expression @daily"
+backup-1 | 2024-10-16T02:37:00.266773318Z time=2024-10-16T02:37:00.266Z level=INFO msg="The backup will start at 12:00 AM"
+#+end_stdout
+
+#+attr_shortcode: :style tip :title Tip
+#+begin_notice
+You should see a plain text log message describing when the backup
+will occur (=The backup will start at 12:00 AM=). This message is
+omitted if you use the =@every= syntax.
+#+end_notice
+
+**** Restore
+
+To restore a volume from a backup, simply untar the archive into the
+appropriate directory under =/var/lib/docker/volumes=.
+
+**** Notifications
+
 TODO
 
 ** Upgrade