OCSDV-343: Resolve lingering part files when S3 -> EFS transfers fail
This commit changes the behavior of the `DataSyncOperations.sync_s3_to_local()` method. When it downloads a discrete S3 object to a local filesystem (usually EFS), it now downloads the object to a custom temporary location on the same file system as the destination and then uses `os.rename` to move it to the final destination path.

When Boto3 downloads an S3 object, it creates a temporary version of the file in the SAME parent directory as the destination path (e.g. "/file_system/dst/path/input_data.fastq.gz" will result in a temporary file at "/file_system/dst/path/input_data.fastq.gz.{unique_hash}").

This default Boto3 behavior is normally okay, but it becomes problematic when two specific things happen:

1. A data sync is interrupted in a way that leaves Boto3 no time to clean up the temporary file.
   - Subsequent runs of data sync also can't clean up the file, because our data sync logic only knows about syncing the single object and has no knowledge of the lingering temporary file (from a previous sync attempt).
2. A scientific executable that we need to support (e.g. cellranger) is excessively greedy when looking for putative FASTQ input files and will even grab partial files with names like `*.fastq.gz.6eF5b5da`.

By saving the partial files from the Boto3 object download in a different location, and atomically moving only the files that have completed transfer, we can avoid this situation.
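The download-then-rename approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the function name `download_then_rename`, the `staging_root` parameter, and the bucket and key in the usage comment are hypothetical, and the download itself is passed in as a callable so the sketch does not depend on how `sync_s3_to_local()` is actually structured.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def download_then_rename(
    download: Callable[[str], None],  # hypothetical: writes the object to the given path
    dst: Path,
    staging_root: Path,  # assumed to be on the same file system as dst
) -> Path:
    """Download into a per-attempt staging directory, then move the result to dst."""
    staging_root.mkdir(parents=True, exist_ok=True)
    dst.parent.mkdir(parents=True, exist_ok=True)

    # A unique directory per attempt keeps any "<name>.<hash>" part file
    # out of dst.parent, where a greedy glob could pick it up.
    tmp_dir = Path(tempfile.mkdtemp(prefix="s3_sync_", dir=staging_root))
    tmp_path = tmp_dir / dst.name
    try:
        download(str(tmp_path))
        # On POSIX, rename is atomic when source and destination
        # are on the same file system.
        os.rename(tmp_path, dst)
    finally:
        # Remove the staging directory, including leftover part files.
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return dst


# Possible usage with Boto3 (bucket and key are placeholders):
#
#   import boto3
#   s3 = boto3.client("s3")
#   download_then_rename(
#       lambda path: s3.download_file("example-bucket", "input_data.fastq.gz", path),
#       Path("/file_system/dst/path/input_data.fastq.gz"),
#       Path("/file_system/.sync_staging"),
#   )
```

If the process is killed before the `finally` block runs, the staging directory can still linger, but it sits outside the destination directory, so a tool scanning for `*.fastq.gz*` there would not see the partial file.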
Showing 2 changed files with 82 additions and 15 deletions.