How Often Arrays Can Be Scrubbed Without Reducing HDD Life
- Most CoW filesystems either recommend monthly scrubs (ZFS, Btrfs) or perform monthly scrubs automatically (ReFS + Storage Spaces)
- Many of the inputs are estimates/informed guesses
- The calculations are conservative, meaning they err on the side of preserving HDD life
- The biggest single component of workload will be the scrub operation, which reads all the data stored on each drive (but NOT the entire drive)
- The all caps function names in the code snippets are Excel functions
- The scrub time will need to be recomputed as the source dataset size grows
- Variable names are CamelCase
- This method can be used for drives with known workload ratings
- The base unit of time used is 1 week (7 days), but a different one can be used via the method described in STEP 1 below
- This method applies to any redundant backup array targeted by an incremental backup method
- This method does not account for reads/writes resulting from snapshot pruning; hopefully the conservatism built into the calculations covers that
The size of the source dataset being backed up is SourceDatasetSize.
If the HDD's workload rating is already known, skip to STEP 2.
The Toshiba L200 is used as an example. Based on datasheets, Toshiba HDDs fall into 5 annual workload tiers: Unlimited, 550 TB, 180 TB, 72 TB, and unrated. Assuming "unrated" means a rating lower than 72 TB, estimate it by multiplying 72 TB by the average ratio of each tier to the next higher one:
AnnualWorkloadRating=AVERAGE(550/infinity, 180/550, 72/180)*72
This is a very conservative estimate; it's basically the minimum the HDD can be expected to handle. It may be a valid assumption to use the lowest published workload rating of 72 TB, but that is left to the user to decide.
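As a rough check of the arithmetic, the estimate above can be reproduced outside of Excel. The following is only a sketch in Python that mirrors the formula; the function name `estimate_unrated_workload_tb` is hypothetical, and 550/infinity is treated as 0, which is what a ratio against an unlimited tier evaluates to.

```python
import math


def estimate_unrated_workload_tb(tiers_tb):
    """Estimate an annual workload rating (TB) for an unrated drive.

    tiers_tb: published tiers in descending order, e.g. [inf, 550, 180, 72].
    Mirrors AVERAGE(550/infinity, 180/550, 72/180)*72 from the text.
    """
    # Ratio of each tier to the next higher tier; x / inf evaluates to 0.0.
    ratios = [lower / higher for higher, lower in zip(tiers_tb, tiers_tb[1:])]
    average_ratio = sum(ratios) / len(ratios)
    # Apply the average step-down ratio to the lowest published tier.
    return average_ratio * tiers_tb[-1]


# Tiers taken from the text: Unlimited, 550 TB, 180 TB, 72 TB.
toshiba_tiers_tb = [math.inf, 550, 180, 72]
annual_workload_rating = estimate_unrated_workload_tb(toshiba_tiers_tb)
print(round(annual_workload_rating, 2))
```

With the tiers listed above, this works out to roughly 17.45 TB per year, which shows how far below the lowest published rating of 72 TB the estimate lands.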
Converting the annual workload rating to the base time unit is as simple as:
WeeklyWorkloadRating=AnnualWorkloadRating/NumberOfTimeUnitsPerYear
which, for weeks, boils down to:
WeeklyWorkloadRating=AnnualWorkloadRating/52
This calculation can be adjusted to a daily value (useful for multiple snapshots per day) by dividing by 365 instead. Similarly, monthly values can be computed by dividing by 12, etc.
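The unit conversion can be sketched in Python as well. The names `UNITS_PER_YEAR` and `workload_rating_per_unit` are made up for illustration, and the 72 TB input is just the lowest published tier mentioned earlier, used here as an example value.

```python
# Number of base time units per year, matching the divisors in the text.
UNITS_PER_YEAR = {"day": 365, "week": 52, "month": 12}


def workload_rating_per_unit(annual_workload_rating_tb, unit="week"):
    """Convert an annual workload rating (TB) to a per-unit rating (TB)."""
    return annual_workload_rating_tb / UNITS_PER_YEAR[unit]


# Example input only: the lowest published tier of 72 TB per year.
for unit in ("day", "week", "month"):
    print(unit, round(workload_rating_per_unit(72, unit), 4))
```

Whichever unit is chosen, the same unit has to be used for every other quantity in the later steps (dataset change and time between scrubs).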
The variable MinimumWeeksBetweenScrubs is defined to represent the smallest number of weeks between scrubs.
If most of the dataset comes from downloaded files, use the ISP's data usage meter:
WeeklySourceDatasetChange=AverageMonthlyDataUsage/WeeksPerMonth
which collapses to:
WeeklySourceDatasetChange=AverageMonthlyDataUsage/4.33
If other users share the same connection and only a single user's data is being backed up, that number can be further reduced by dividing by the number of users. This assumes the other users are heavy users, which is reasonable since streaming uses a lot of data:
WeeklySourceDatasetChange=AverageMonthlyDataUsage/NumberOfUsers/4.33
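A minimal Python sketch of the same estimate follows. The function name `weekly_source_dataset_change` and the guard against a user count below 1 are assumptions added for illustration; the 4.33 weeks per month figure is the one used above, and the input values are hypothetical.

```python
WEEKS_PER_MONTH = 4.33  # same approximation as in the formulas above


def weekly_source_dataset_change(average_monthly_data_usage, number_of_users=1):
    """Estimate weekly dataset change from the ISP's monthly usage figure.

    The result is in the same unit as average_monthly_data_usage (e.g. TB).
    Dividing evenly by number_of_users assumes everyone on the connection
    uses about the same amount of data.
    """
    if number_of_users < 1:
        raise ValueError("number_of_users must be at least 1")
    return average_monthly_data_usage / number_of_users / WEEKS_PER_MONTH


# Hypothetical inputs: 1.2 TB of monthly usage shared by 3 users.
print(round(weekly_source_dataset_change(1.2, number_of_users=3), 4))
```

Because ISP usage also counts data that is never stored (streaming, for example), this figure tends to overestimate the real change rate, which keeps the method on the conservative side.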
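At the very least, the backup system should capture all the dataset changes in a week (or other preferred base time unit). The weekly workload rating therefore has to cover both those changes and the weekly share of a scrub, which reads SourceDatasetSize once every MinimumWeeksBetweenScrubs weeks. So: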
WeeklySourceDatasetChange=WeeklyWorkloadRating-(SourceDatasetSize/MinimumWeeksBetweenScrubs)
Solving the above for MinimumWeeksBetweenScrubs:
MinimumWeeksBetweenScrubs=SourceDatasetSize/(WeeklyWorkloadRating-WeeklySourceDatasetChange)
This result does NOT imply only 1 snapshot per week. Rather, WeeklySourceDatasetChange describes the maximum amount of changed data per week that the snapshots, however many are decided on, can cover without exceeding the drive's workload rating.
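The final formula can be sketched in Python as follows. The function name `minimum_weeks_between_scrubs` is hypothetical, and the check for non-positive headroom is an added assumption: if the weekly changes alone already meet or exceed the weekly workload rating, no scrub interval satisfies the equation. The inputs are example values only.

```python
import math


def minimum_weeks_between_scrubs(source_dataset_size,
                                 weekly_workload_rating,
                                 weekly_source_dataset_change):
    """MinimumWeeksBetweenScrubs =
    SourceDatasetSize / (WeeklyWorkloadRating - WeeklySourceDatasetChange).

    All three inputs must use the same data unit (e.g. TB).
    """
    # Workload left over each week after the dataset changes are written.
    headroom = weekly_workload_rating - weekly_source_dataset_change
    if headroom <= 0:
        raise ValueError("weekly changes use up the entire workload rating")
    return source_dataset_size / headroom


# Hypothetical inputs: 4 TB dataset, 1.5 TB/week rating, 0.5 TB/week change.
weeks = minimum_weeks_between_scrubs(4.0, 1.5, 0.5)
# Round up to whole weeks so the schedule never scrubs more often than allowed.
print(math.ceil(weeks))
```

Rounding up rather than down keeps the schedule within the rating, and the value should be recomputed whenever SourceDatasetSize grows.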