Added -m flag to gsutil cp in GSURL file handler localization command #147

Open
wants to merge 1 commit into
base: master

dpmerrell

From the gsutil docs (https://cloud.google.com/storage/docs/gsutil/addlhelp/GlobalCommandLineOptions):

Using the -m option can consume a significant amount of network bandwidth and cause problems or make your
performance worse if you use a slower network. For example, if you start a large rsync operation over a network
link that's also used by a number of other important jobs, there could be degraded performance in those jobs.
Similarly, the -m option can make your performance worse, especially for cases that perform all operations
locally, because it can "thrash" your local disk.

To prevent such issues, reduce the values for parallel_thread_count and parallel_process_count, or stop using the
-m option entirely. One tool that you can use to limit how much I/O capacity gsutil consumes and prevent it from
monopolizing your local disk is ionice (built in to many Linux systems).

I would guess that network speeds between GCP storage and GCP compute are typically fast, so the bandwidth caveat above is less of a concern there. A more complete and robust solution might also configure parallel_thread_count or parallel_process_count, or invoke ionice.
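To make that more concrete, here is a rough sketch of how a localization command could be assembled with those knobs exposed. The function name `build_localize_command` and all of its parameters are hypothetical and are not taken from this PR's diff. The gsutil pieces it relies on are the top-level `-m` and `-o` options and the `GSUtil:parallel_thread_count` / `GSUtil:parallel_process_count` settings named in the docs above. The `ionice -c 2 -n 7` priority is just one reasonable choice (best-effort class, lowest priority), not a recommendation from the project.

```python
def build_localize_command(src_url, dest_path, parallel=True,
                           thread_count=None, process_count=None,
                           use_ionice=False):
    """Return an argv list for copying src_url to dest_path with gsutil.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not the handler's actual interface.
    """
    cmd = ["gsutil"]
    # Top-level -o options override boto config values for this invocation.
    if thread_count is not None:
        cmd += ["-o", f"GSUtil:parallel_thread_count={thread_count}"]
    if process_count is not None:
        cmd += ["-o", f"GSUtil:parallel_process_count={process_count}"]
    # -m is a top-level option, so it goes before the cp subcommand.
    if parallel:
        cmd.append("-m")
    cmd += ["cp", src_url, dest_path]
    # Optionally lower disk I/O priority (Linux only; assumes ionice is installed).
    if use_ionice:
        cmd = ["ionice", "-c", "2", "-n", "7"] + cmd
    return cmd


if __name__ == "__main__":
    # Placeholder bucket and path, shown only to illustrate the output shape.
    print(" ".join(build_localize_command(
        "gs://example-bucket/input.txt", "/tmp/input.txt",
        thread_count=4, use_ionice=True)))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting. Whether the defaults should enable `-m` at all is the open question for this PR.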

I thought I would open this PR anyway, just to get this started.
