Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration in a docker-compose.yaml not fully working as expected #347

Open
Srungweber opened this issue Sep 17, 2024 · 10 comments
Open

Comments

@Srungweber
Copy link

Srungweber commented Sep 17, 2024

I have this setup using docker compose:

services:
db:
image: mysql:8.4.2
restart: always
environment:
MYSQL_DATABASE: 'xxxxxxxx'
# Password for root access
MYSQL_ROOT_PASSWORD: ${DB_PASSWORD}
MYSQL_ROOT_HOST: "%"
ports:
# :
- '3306:3306'
expose:
# Opens port 3306 on the container
- '3306'
# Where our data will be persisted
volumes:
- my-db:/var/lib/mysql

backup:
image: databack/mysql-backup:master
restart: always
command: dump
volumes:
- ~/database/backups:/db_backup
environment:
DB_DUMP_TARGET: /db_backup
DB_USER: root
DB_PASS: ${DB_PASSWORD}
DB_DUMP_CRON: 0 * * * *
DB_SERVER: db
DB_DUMP_BY_SCHEMA: false
DB_NAMES:
volumes:
my-db:

Every full hour a backup is being successfully created.

  1. "DB_DUMP_BY_SCHEMA: false"
  • Expected: A single .sql file containing all the data.
  • Actual: The backup contains a new sql file for each schema.
  1. "DB_NAMES:" docs
  • Expected: all schemas incl. system schemas are in the backup
  • Actual: Only the user created schemas are being backed up

I also tried with this environment definition without any luck:

environment:
- DB_DUMP_TARGET=/db_backup
- DB_USER=root
- DB_PASS=${DB_PASSWORD}
- DB_DUMP_CRON=40 * * * *
- DB_SERVER=db
- DB_DUMP_BY_SCHEMA=false
- DB_NAMES=

Hardware a M1 Mac with macOS 14.1

@deitch
Copy link
Collaborator

deitch commented Sep 17, 2024

Fairly or not, the docs for DB_DUMP_BY_SCHEMA are leftovers from the old script-based version. The current version always dumps by schema into separate files, then tars them together (and gzips, configurable).

If you provide specific db names to backup it does those, else it finds all of them here, which does:

  1. show databases
  2. exclude any that match one of "information_schema", "performance_schema", "sys", "mysql"

The original version of mysql-backup always had a single dump file, then people made really good arguments for having separate backup files. When v1.x came along, we couldn't think of any good reason to keep (and support, and detect) both formats, so it all became file-per-schema.

If there is a good case to be made for needing a single .sql file, we can look at it. What would not be covered by calling restore on the 5 files extracted from a tgz file vs a single one?

@Srungweber
Copy link
Author

Thank you for providing this insight. To be honest I did not check if the docs are still valid.
For the single sql file: We don't have a case to make here. Having a separate file per schema is great for us.

For the schemas to exclude the system ones is more of an issue to us as we don't want to recreate all the user account in a recovery scenario. Is this a no-no?
It leaves us with the option of explicitly naming these system schemas in the configuration. Which in and of itself is okay.
But this also means that all user schemas need to be named in the configuration, no? So every time a schema is being added someone needs to remember to add the new schema to the backup configuration.

@deitch
Copy link
Collaborator

deitch commented Sep 17, 2024

No it's actually a decent suggestion, to have an option to back up the system tables as well. Leave this open, will try and get to to. If you've got golang skills and want to contribute, that's always welcome.

Historically, if I recall correctly, mysqldump excludes system tables, which is why this does too. I have a vague memory of seeing mysqldump source code.

@Srungweber
Copy link
Author

Maybe you don't need to. I could easily create a second job that explicitly names the system schemas that I want in a backup and have the other run without DB_NAMES so it covers all current and future user created schemas.

@deitch
Copy link
Collaborator

deitch commented Sep 17, 2024

Yeah, but that is a painful UX and requires lots of extra work. We will get to it.

@deitch
Copy link
Collaborator

deitch commented Sep 18, 2024

The easy part for this - CLI flag/config/env var, plus controls to ensure that those tables do not get filtered out - is done on a branch. The hard part, what to do with it, is creating some complexity.

Is there any point to dumping the tables in these databases?

  • performance_schema is transient, information on the current performance, and not helpful to restore to another database. Actually, I wonder if it would break something if you did. I wouldn't want to try.
  • information_schema is transient as well, just in-memory metadata
  • sys which is mostly views into those above and related things, also, my understanding, is not meant to be dumped and restored
  • mysql - this is a somewhat reasonable candidate, with its users and other information

So, what are we actually talking about?

@Srungweber
Copy link
Author

From where I stand today, I would have a use case for mysql.user
Maybe I'm missing something but that would be my initial request.
Let me play devil's advocate for a moment, though:
User data can also be stored in a sql file and made available as part of the docker-compose up procedure.
This would eliminate the need to include the user table in backup.

Downside to this: when passwords get rolled, the user sql file would need to be updated, too.

You see, I'm still a bit on the fence about my own idea.
Do you remember what the original argument was to exclude the system schemas?

@deitch
Copy link
Collaborator

deitch commented Sep 18, 2024

Do you remember what the original argument was to exclude the system schemas?

Sure, mysqldump didn't do it. 😁

The original version of this was just a wrapper around mysqldump, so it inherited a lot. As we switched to a native version, we wanted to change only those things that really made sense to do so.

Also, implementing each table type was real work in the native version. Worth it, but work nonetheless.

To be fair, mysqldump didn't back them up for good reasons, mainly those listed above. If you read their docs, they say that if you want to dump those, ask for them explicitly. IIRC, they also recommend against it (but working off of memory).

User data can also be stored in a sql file and made available as part of the docker-compose up procedure.
This would eliminate the need to include the user table in backup

You could do something like this using post-backup scripts, if you really wanted to. In any case, all database backups are snapshots in time, valid only at the moment the backup was taken.

But I wouldn't want to make any knowledge of that inherent to mysql-backup itself. It is too fraught with danger.

I see a few other paths:

  • we can have --system-database, but dump just sys. I am pasting a copy of everything in sys in another message
  • we can have --users to dump just sys.users. That gets a little bit messy, but not terribly so.

I could not comment on the risks of trying to restore mysql table to another database. I just do not know. Tech tools are like knives, cut food well, but they have edges.

@deitch
Copy link
Collaborator

deitch commented Sep 18, 2024

mysql> show full tables;
+------------------------------------------------------+------------+
| Tables_in_mysql                                      | Table_type |
+------------------------------------------------------+------------+
| columns_priv                                         | BASE TABLE |
| component                                            | BASE TABLE |
| db                                                   | BASE TABLE |
| default_roles                                        | BASE TABLE |
| engine_cost                                          | BASE TABLE |
| func                                                 | BASE TABLE |
| general_log                                          | BASE TABLE |
| global_grants                                        | BASE TABLE |
| gtid_executed                                        | BASE TABLE |
| help_category                                        | BASE TABLE |
| help_keyword                                         | BASE TABLE |
| help_relation                                        | BASE TABLE |
| help_topic                                           | BASE TABLE |
| innodb_index_stats                                   | BASE TABLE |
| innodb_table_stats                                   | BASE TABLE |
| ndb_binlog_index                                     | BASE TABLE |
| password_history                                     | BASE TABLE |
| plugin                                               | BASE TABLE |
| procs_priv                                           | BASE TABLE |
| proxies_priv                                         | BASE TABLE |
| replication_asynchronous_connection_failover         | BASE TABLE |
| replication_asynchronous_connection_failover_managed | BASE TABLE |
| replication_group_configuration_version              | BASE TABLE |
| replication_group_member_actions                     | BASE TABLE |
| role_edges                                           | BASE TABLE |
| server_cost                                          | BASE TABLE |
| servers                                              | BASE TABLE |
| slave_master_info                                    | BASE TABLE |
| slave_relay_log_info                                 | BASE TABLE |
| slave_worker_info                                    | BASE TABLE |
| slow_log                                             | BASE TABLE |
| tables_priv                                          | BASE TABLE |
| time_zone                                            | BASE TABLE |
| time_zone_leap_second                                | BASE TABLE |
| time_zone_name                                       | BASE TABLE |
| time_zone_transition                                 | BASE TABLE |
| time_zone_transition_type                            | BASE TABLE |
| user                                                 | BASE TABLE |
+------------------------------------------------------+------------+
38 rows in set (0.00 sec)

@deitch
Copy link
Collaborator

deitch commented Sep 18, 2024

At least they all are base tables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants