Performance optimizations for backups check #240
Merged
We are using the BackupsCheck for database snapshots from your laravel-db-snapshots package. We have over 200 snapshots and are storing them in an S3 bucket. Thanks for accepting our previous PR to allow checking external disks. However, this introduced some performance problems. The current BackupsCheck first checks the file size of all backup files and then checks the modified date. For each check it needs to reach out to the external file and fetch that information. This takes somewhere around 0.05 to 0.1 seconds per check per file. With 200 backup files this takes up to about 200 * 2 * 0.1 = 40 seconds. That is obviously not OK.
This PR addresses that in two ways:
- `->parseModifiedFormat('Y-m-d_H-i-s')`
- `->onlyCheckSizeOnFirstAndLast()`
Those two things reduced the execution time for the check from about 30 seconds to somewhere between 0.6 and 0.8 seconds, and it should scale well with even larger numbers of files.
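To illustrate why these two options cut down on remote calls, here is a minimal sketch in plain PHP. It assumes that `parseModifiedFormat` reads the backup's timestamp from its file name, and that `onlyCheckSizeOnFirstAndLast` limits the size lookup to the oldest and newest backup. The helpers `parseModifiedFromName` and `firstAndLast` are hypothetical names made up for this example and are not the package's implementation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: derive a backup's timestamp from its file name
 * instead of asking the remote disk for the last-modified time.
 */
function parseModifiedFromName(string $path, string $format): ?DateTimeImmutable
{
    // "backups/2024-01-31_02-00-00.sql" -> "2024-01-31_02-00-00"
    $name = pathinfo($path, PATHINFO_FILENAME);

    // The leading "!" resets fields that the format does not mention.
    $date = DateTimeImmutable::createFromFormat('!' . $format, $name, new DateTimeZone('UTC'));

    return $date === false ? null : $date;
}

/**
 * Hypothetical helper: pick the oldest and newest backups so that only
 * those two need a remote size lookup.
 *
 * @param  array<string, DateTimeImmutable> $datesByPath
 * @return string[]
 */
function firstAndLast(array $datesByPath): array
{
    if ($datesByPath === []) {
        return [];
    }

    // Sort chronologically while keeping the path keys.
    uasort($datesByPath, fn (DateTimeImmutable $a, DateTimeImmutable $b): int => $a <=> $b);
    $paths = array_keys($datesByPath);

    // array_unique covers the case of a single backup file.
    return array_values(array_unique([$paths[0], $paths[count($paths) - 1]]));
}

$files = [
    'backups/2024-01-02_02-00-00.sql',
    'backups/2024-01-01_02-00-00.sql',
    'backups/2024-01-03_02-00-00.sql',
];

$dates = [];
foreach ($files as $file) {
    $date = parseModifiedFromName($file, 'Y-m-d_H-i-s');
    if ($date !== null) {
        $dates[$file] = $date; // no remote call needed for the date
    }
}

// Only these paths would need a remote size lookup.
$toSizeCheck = firstAndLast($dates);
```

Under these assumptions the date check needs no per-file request at all, and the size check needs at most two, regardless of how many backups are stored.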
We also added some metadata to the BackupsCheck and a helper function on the `Result` class for appending metadata. This is handy for checks where you accumulate information as you perform the check and don't have all of it at the beginning.
It is used like this:
This merges the appended metadata into the existing metadata instead of overwriting it.
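The merge behaviour can be sketched with a small stand-in class. `ResultSketch` is hypothetical and only models the behaviour described above. The method name `appendMeta` is an assumption, since the helper's name does not appear in this text.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for the package's Result class, not the real implementation. */
final class ResultSketch
{
    /** @var array<string, mixed> */
    public array $meta = [];

    /** Replace all metadata with the given array. */
    public function meta(array $meta): self
    {
        $this->meta = $meta;

        return $this;
    }

    /** Merge new entries into the existing metadata (assumed method name). */
    public function appendMeta(array $meta): self
    {
        $this->meta = array_merge($this->meta, $meta);

        return $this;
    }
}

$result = (new ResultSketch())->meta(['backup_count' => 200]);

// Later in the check, as more information becomes available:
$result->appendMeta(['youngest_backup' => '2024-01-03_02-00-00']);
$result->appendMeta(['oldest_backup' => '2024-01-01_02-00-00']);
```

After these calls all three keys are present in `$result->meta`, whereas calling `meta()` a second time would have discarded the earlier entries.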