Commit

backup-volume
EnigmaCurry committed Oct 16, 2024
1 parent 0c28f05 commit ef121be
Showing 1 changed file with 186 additions and 0 deletions.
186 changes: 186 additions & 0 deletions books/portable-docker.org
@@ -5429,6 +5429,192 @@ index
:EXPORT_HUGO_WEIGHT: 9002
:END:

Native systems are harder to back up than virtual systems; that's just
the way it goes. With VMs, your hypervisor usually has a built-in
backup feature which pauses the machine, makes a snapshot, resumes the
machine, and continues backing up the snapshot in the background. If
you have the luxury of running your server as a VM (e.g., [[https://www.proxmox.com/en/proxmox-virtual-environment/overview][Proxmox
PVE]]), it makes backups so much easier.

However, in this book we are dealing with a physical Raspberry Pi, not
a virtual machine, so backups must be managed another way.

*** Restic

[[https://restic.net/][Restic]] is a modern open source backup tool, which has many great
features including incremental backups, encryption, and offsite
upload. It can maintain backups of huge sizes.

You can install restic to back up any directory on a Linux host. One of
the best ways to do that is with the script found in this blog post:

[[https://blog.rymcg.tech/blog/linux/restic_backup/][Daily backups to S3 with Restic and systemd timers]]

The main problem with backing up files this way is that containers
are always running, so you need to make sure that their files are
flushed to disk before the backup starts; otherwise your backup could
be corrupted. For media storage that doesn't change often (photos,
videos, etc.) this might not be a big deal, but for database volumes
it's a problem.
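
One way to get a consistent copy with Restic is to stop the container
for the duration of the backup. The following is only a sketch of that
idea and is not part of the linked script: the function name
=backup_with_downtime= and its container and volume arguments are
hypothetical, and it assumes the restic repository is already
configured through the =RESTIC_REPOSITORY= and =RESTIC_PASSWORD=
environment variables.

#+begin_src shell
## Hypothetical helper: stop a container, back up its volume, restart it.
## Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD are already exported.
backup_with_downtime() {
  container="$1"   # name of the container to pause (assumption)
  volume="$2"      # name of the Docker volume it writes to (assumption)

  ## Stop the container so that its data is flushed and no longer changing:
  docker stop "$container" || return 1

  ## Back up the volume's data directory; remember whether restic failed:
  status=0
  restic backup "/var/lib/docker/volumes/${volume}/_data" || status=$?

  ## Always try to start the container again, even if the backup failed:
  docker start "$container" || return 1
  return "$status"
}
#+end_src

For example, =backup_with_downtime forgejo forgejo_data= would stop a
container named =forgejo=, back up its volume, and start it again. It
needs to run as root, because =/var/lib/docker/volumes= is not
readable by normal users, and the service is unavailable while the
backup runs.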

*** Backup-Volume

[[https://github.com/EnigmaCurry/backup-volume][Backup-Volume]] is another backup tool, designed specifically to
back up Docker volumes and upload the archives to offsite storage
(S3, SSH, Dropbox). This tool is much simpler than Restic, with the
most important difference being that *Backup-Volume can only make
complete backups (no incremental storage)*. For small datasets, this
is ideal, as each backup gets stored in a separate
=backup-XXXX.tar.gz=, and it's easy to restore from one file. For
larger datasets, this duplication would be prohibitively
expensive/wasteful.
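
To get a feel for the cost of that duplication, here is a rough
back-of-the-envelope estimate. All of the numbers are made up for
illustration, and it assumes every archive is about the same size:

#+begin_src shell
## Rough estimate of the storage used by full (non-incremental) backups.
## All values are hypothetical; substitute your own.
archive_size_mb=500   # size of one backup-XXXX.tar.gz
backups_per_day=1     # one backup per day
retention_days=30     # archives older than this are pruned

total_mb=$((archive_size_mb * backups_per_day * retention_days))
echo "Approximate storage needed: ${total_mb} MB"
#+end_src

With these example numbers, a 500 MB archive kept for 30 days takes
about 15000 MB (roughly 15 GB) of remote storage, even if almost
nothing changed between backups.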

Backup-Volume has one trick in its favor: it can automatically stop
containers before the backup runs and start them again afterwards.
This makes this style of backup much safer for write-intensive
volumes (e.g., databases) and ensures that the data gets flushed
before the backup starts.

You will have to analyze your own situation and weigh the cost of
data integrity vs. the cost of data duplication, to help decide which
kind of backup to deploy.

*** Setup Backup-Volume

**** Prepare an S3 bucket offsite

You may want to use your own [[/portable-docker/install-web-services/minio-s3/index.html][MinIO]] S3 service (preferably installed on
a separate offsite server), or a third-party provider (AWS S3,
DigitalOcean Spaces, Wasabi, etc.).

You will need to provide the S3 bucket and credentials that the backup
process will use when uploading archives:

* S3 Endpoint domain. e.g., =s3.example.com=.
* S3 bucket name. e.g., =test=.
* S3 access key id. e.g., =test=.
* S3 secret key. e.g., =xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=.
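
Before configuring the backup, it can be worth confirming that these
credentials actually work. The sketch below is one possible way to do
that and is not part of Backup-Volume: it assumes that the AWS CLI is
installed, that the endpoint is served over HTTPS, and the function
name =check_s3_bucket= is hypothetical.

#+begin_src shell
## Hypothetical pre-flight check: list the bucket with the AWS CLI.
## The AWS CLI reads the credentials from the AWS_ACCESS_KEY_ID and
## AWS_SECRET_ACCESS_KEY environment variables.
check_s3_bucket() {
  endpoint="$1"   # S3 endpoint domain, e.g., s3.example.com
  bucket="$2"     # S3 bucket name, e.g., test
  aws s3 ls "s3://${bucket}" --endpoint-url "https://${endpoint}"
}
#+end_src

With the access key id and secret key exported in the environment,
=check_s3_bucket s3.example.com test= should list the bucket contents
(empty for a new bucket) instead of printing an authentication error.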

**** Configure Backup-Volume

#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
#+begin_run
## Configures the default backup-volume instance:
pi make backup-volume config
#+end_run

Select multiple existing volumes to back up together as one archive:

#+begin_stdout
? Select all the volumes to backup
> [x] test1_data
[ ] forgejo_data
[x] icecast_config
[ ] icecast_logs
[ ] mosquitto_mosquitto
[ ] traefik_geoip_database
v [ ] traefik_traefik
#+end_stdout

Choose the backup schedule in [[https://github.com/EnigmaCurry/d.rymcg.tech/blob/73648904e5a954e17077368c299a23a19947ab16/backup-volume/.env-dist#L23-L59][cron format]]:

#+begin_stdout
BACKUP_CRON_EXPRESSION: Enter the cron expression (eg. @daily)
: @every 24h
#+end_stdout

#+attr_shortcode: :style tip :title Tip
#+begin_notice
Other example schedules:

* =@every 1h15m=
* =@daily=
* =@weekly=
* [[https://github.com/EnigmaCurry/d.rymcg.tech/blob/19ee7e0e5e39350b86ec6317c4e1d3765c806378/backup-volume/.env-dist#L23-L59][See more in the .env-dist file]]
#+end_notice

Choose the retention length (number of days) to keep backup archives
before automatic pruning happens:

#+begin_stdout
BACKUP_RETENTION_DAYS: Rotate backups older than how many days? (eg. 30)
: 30
#+end_stdout

You can choose any of the supported storage mechanisms. For demo
purposes, choose S3:

#+begin_stdout
> Which remote storage do you want to use? s3

BACKUP_AWS_ENDPOINT: Enter the S3 endpoint (e.g., s3.example.com)
: s3.d.example.com
BACKUP_AWS_S3_BUCKET_NAME: Enter the S3 bucket name (e.g., my-bucket)
: backup-test-1
BACKUP_AWS_ACCESS_KEY_ID: Enter the S3 access key id (e.g., my-access-key)
: backup-test-1
BACKUP_AWS_SECRET_ACCESS_KEY: Enter the S3 secret access key
: OEuL3lMSdvdoFyVjEQTM4Trj/7VhHq7Q7cOFEpQPuxMHxsTVK3Hxne7st6Ty
BACKUP_AWS_S3_PATH: Choose a directory inside the bucket (blank for root)
:

#+end_stdout

You may optionally preserve an additional copy of the archive in a
local volume:

#+begin_stdout
> Do you want to keep a local backup in addition to the remote one? No
#+end_stdout

**** Install

#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
#+begin_run
## Installs the default backup instance:
pi make backup-volume install
#+end_run

**** Instances

All selected volumes are backed up to the same archive on the same
schedule. To back up different volumes on different schedules, create
more than one instance of Backup-Volume, each with its own separate
config:

#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
#+begin_run
## Creates a new backup instance named test:
pi make backup-volume instance instance=test
pi make backup-volume install instance=test
#+end_run

**** Verify backup schedule

#+attr_shortcode: :style secondary :title Run this on your Raspberry Pi
#+begin_run
pi make backup-volume logs
#+end_run

#+begin_stdout
backup-1 | 2024-10-16T02:37:00.263838944Z time=2024-10-16T02:37:00.262Z level=INFO msg="Successfully scheduled backup from environment with expression @daily"
backup-1 | 2024-10-16T02:37:00.266773318Z time=2024-10-16T02:37:00.266Z level=INFO msg="The backup will start at 12:00 AM"
#+end_stdout

#+attr_shortcode: :style tip :title Tip
#+begin_notice
You should see a plain text log message describing when the backup
will occur (=The backup will start at 12:00 AM=). This message is
omitted if you use the =@every= syntax.
#+end_notice

**** Restore

To restore a volume from a backup, simply untar the archive into the
appropriate directory under =/var/lib/docker/volumes=.
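
A cautious way to do this is to extract the archive into a scratch
directory first, check its contents, and only then copy the files into
the volume. The following is a minimal sketch of the first step; the
function name =extract_backup= and the staging directory are
hypothetical, and the directory layout inside the archive is not
assumed here, so inspect it before copying anything.

#+begin_src shell
## Hypothetical helper: unpack a backup archive into a staging directory.
extract_backup() {
  archive="$1"   # a downloaded archive, e.g., backup-XXXX.tar.gz
  staging="$2"   # scratch directory used to inspect the contents
  mkdir -p "$staging"
  tar -xzf "$archive" -C "$staging"
}
#+end_src

After checking the extracted files, stop any containers that use the
volume, then copy the data into the volume's =_data= directory (e.g.,
=/var/lib/docker/volumes/test1_data/_data=) as root, and start the
containers again.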

**** Notifications

TODO

** Upgrade