Allow limiting number of plotting processes during Stage 1 #177

Open
zmeyc wants to merge 2 commits into main


zmeyc commented Mar 9, 2021

This PR is based on the unmerged #176.
Please review only the last commit; I'll rebase once #176 is merged. It can be merged separately if needed.
It will also require adding command-line switches in chia-blockchain.

Sample use case: when starting 16 plotters (each using 4 threads) on a 16-thread CPU, it is best to stagger their runs so that they do not lock up the computer during stage 1 (multithreading is supported only in stage 1).

Currently this has to be done with an external script:

  1. Measure approximate stage 1 time
  2. Run 4 plotters
  3. Sleep measured time
  4. Run next 4 plotters
  5. Sleep measured time
    etc
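
The manual procedure above can be sketched roughly as follows. This is only an illustration of the workaround, not a script from this PR; the function names and the injectable `launch` and `sleep` parameters are assumptions made for the example.

```python
import subprocess
import time
from typing import List


def start_offsets(num_plotters: int, group_size: int, stage1_seconds: float) -> List[float]:
    """Return the delay in seconds before each plotter is started."""
    return [(i // group_size) * stage1_seconds for i in range(num_plotters)]


def run_staggered(commands, group_size, stage1_seconds,
                  launch=subprocess.Popen, sleep=time.sleep):
    """Launch commands in groups, sleeping the measured stage 1 time between groups."""
    started = []
    for i, cmd in enumerate(commands):
        # Before each new group (except the first), wait one measured stage 1 duration.
        if i > 0 and i % group_size == 0:
            sleep(stage1_seconds)
        started.append(launch(cmd))
    return started
```

Because the delays are fixed in advance, nothing in this scheme reacts to a slow or crashed plotter.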

But this setup is error-prone:

  • On any slowdown, plotter groups will clash
  • It's not possible to restart crashed plotters, because all timings will be off

As a solution, this patch uses lock files to allow only up to a specified number of processes in Stage 1. Other plotters will wait for their turn. Plotters can be stopped and restarted at any time without consequences. Locks (slots) are automatically released if plotters crash or are stopped.
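
One way such slots could work is sketched below. This is not the code in this patch (which is C++ and, per the commit message, uses one temporary file per process); it assumes a fixed set of slot files locked with `flock`, and the file name `p1_slot_N.lock` and both function names are hypothetical.

```python
import fcntl
import os
from typing import Optional


def try_acquire_slot(runtime_dir: str, max_procs: int) -> Optional[int]:
    """Try to lock one of max_procs slot files; return the held fd, or None if all are busy."""
    for slot in range(max_procs):
        path = os.path.join(runtime_dir, f"p1_slot_{slot}.lock")  # hypothetical file name
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Non-blocking exclusive lock: fails immediately if the slot is taken.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            os.close(fd)
    return None


def release_slot(fd: int) -> None:
    """Close the descriptor, which drops the lock held through it."""
    os.close(fd)
```

A plotter would poll `try_acquire_slot` until it returns a descriptor, run Stage 1, then call `release_slot`. Since the kernel closes descriptors when a process exits, a crashed plotter frees its slot without any cleanup step.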

How to test:

./ProofOfSpace create -k 22 --p1maxproc 2 --runtimedir . -d dest1
./ProofOfSpace create -k 22 --p1maxproc 2 --runtimedir . -d dest2
./ProofOfSpace create -k 22 --p1maxproc 2 --runtimedir . -d dest3

The --runtimedir value must be the same for all processes (in production it could default to ~/.config/chia/run).
The third process will wait until the first two leave Stage 1.

Base automatically changed from master to main March 15, 2021 18:39

arvidn commented Mar 20, 2021

I would expect that setting the thread priority really low (or using a high nice level) would also solve this problem, or at least mitigate it.

Have you tested that?

If that works well, it would be a bit more scalable going forward. We may want to parallelize more steps, like sorting, in the future. And if the OS scheduler could sort this out itself, that would be ideal.
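
For experimenting with this suggestion, a plotter can be started at a raised nice level from a wrapper. The sketch below is POSIX-only and is not part of this PR; `run_low_priority` is a hypothetical helper name.

```python
import os
import subprocess


def run_low_priority(argv, increment=19):
    """Run argv as a child process whose niceness is raised by `increment` before exec."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # runs in the child just before exec
        capture_output=True,
        text=True,
    )
```

Threads created by the child inherit its niceness, so this covers the Stage 1 worker threads as well. Whether it actually keeps the machine responsive is exactly what would need testing.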


zmeyc commented Mar 21, 2021

@arvidn Interesting idea, I haven't tried that. It might keep the GUI from freezing when threads are overprovisioned; I'll check whether it helps. But Phase 1 will still take longer overall if all processes are in it simultaneously. It might be worth setting a low priority as an additional measure.

Another concern is adding more arguments to the Python bindings. Would it be better to pass the parameters as a struct in CreatePlotDisk? For example, Deno's linter has a rule: "A function that is part of the public API takes 0-2 required arguments, plus (if necessary) an options object (so max 3 total)." I try to follow this not only in TS; IMO it's less error-prone because parameters are passed by name and order doesn't matter. It will be easier to add or remove fields too.
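
On the Python side, the options-object idea might look something like this. It is a sketch only: `PlotOptions`, its fields, and this `create_plot_disk` stand-in are hypothetical and do not reflect the real binding signature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlotOptions:
    # Hypothetical fields that mirror the CLI flags proposed in this PR.
    p1_max_proc: int = 0               # 0 means "no limit" in this sketch
    runtime_dir: Optional[str] = None


def create_plot_disk(tmp_dir: str, final_dir: str,
                     options: Optional[PlotOptions] = None) -> dict:
    """Stand-in for a binding call: two required arguments plus one options object."""
    opts = options or PlotOptions()
    return {
        "tmp_dir": tmp_dir,
        "final_dir": final_dir,
        "p1_max_proc": opts.p1_max_proc,
        "runtime_dir": opts.runtime_dir,
    }
```

Adding a new setting then means adding one field with a default, and existing callers keep working unchanged.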


zmeyc commented Mar 21, 2021

Required changes in chia-blockchain:
zmeyc/chia-blockchain@46bb5f3


zmeyc commented Mar 22, 2021

Lowering priorities, but setting them to a different value for each plotter, could also work. In theory this would let one plotter be prioritized and finish faster instead of all of them stalling.
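
As a rough illustration of per-plotter priorities, the nice values could be spread across the available range and applied with the standard `nice` utility. Both helper names below are hypothetical, and the even spread is just one possible choice.

```python
from typing import List


def staggered_niceness(num_plotters: int, lowest: int = 0, highest: int = 19) -> List[int]:
    """Spread nice values across plotters so that earlier plotters get more CPU time."""
    if num_plotters <= 1:
        return [lowest] * num_plotters
    span = highest - lowest
    return [lowest + (i * span) // (num_plotters - 1) for i in range(num_plotters)]


def nice_command(argv: List[str], niceness: int) -> List[str]:
    """Prefix a command line with nice(1) so it starts at the given niceness."""
    return ["nice", "-n", str(niceness)] + argv
```

With the default range there are only 20 distinct nice levels, so more than 20 plotters would necessarily share values.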

When multiple plotter instances are running, use flock to prevent
them from copying files to the destination disk simultaneously
on rotational media.

Assume media is non-rotational by default.

Linux only; on Windows and macOS the behavior is unchanged.
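
The copy serialization described in this commit message can be illustrated as below. This is a sketch and not the commit's C++ code: the lock file name `.copy.lock` and both function names are assumptions, and the rotational-media check is left out.

```python
import contextlib
import fcntl
import os
import shutil


@contextlib.contextmanager
def destination_lock(dest_dir: str):
    """Hold an exclusive flock on a per-destination lock file for the duration of the block."""
    lock_path = os.path.join(dest_dir, ".copy.lock")  # hypothetical file name
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other copiers are done
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def copy_plot(src: str, dest_dir: str) -> str:
    """Copy one finished plot while holding the destination lock; return the new path."""
    with destination_lock(dest_dir):
        return shutil.copy2(src, dest_dir)
```

Only one plotter at a time writes to a given destination directory, which avoids competing sequential writes on a spinning disk.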

Allow specifying the maximum number of plot processes that may
enter phase 1 simultaneously, for staggered plotting.
Temporary files are used as flags: one file per process.
Sample CLI usage:
mkdir -p /home/user/.config/chia/run
./ProofOfSpace create -k 24 --p1maxproc 2 --runtimedir /home/user/.config/chia/run

@github-actions

'This PR has been flagged as stale due to no activity for over 60
days. It will not be automatically closed, but it has been given
a stale-pr label and should be manually reviewed.'
