Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

thousands of %s (wrong) in default mode for heavily multi-threaded application #737

Open
yarikoptic opened this issue Dec 18, 2024 · 0 comments

Comments

@yarikoptic
Copy link

We have a tool downloading and displaying progress (https://github.com/dandi/dandi-cli/, dandi on pypi). Upon "heavily threaded" (py-spy reports 111 threads) execution, after awhile for a call like py-spy top -p 893263 --delay 5 we would get

20.47s behind in sampling, results may be inaccurate. Try reducing the sampling rate.
Collecting samples from '.... dandi download -J 10:10 --path-type glob https://dandiarchive.org/dandiset/000026/draft/files?location=sub-I48%2Fses-SPIM%2Fmicr%2F*.zarr' (python v3.11.2)
Total Samples 1502
GIL: 59.48%, Active: 1396.41%, Threads: 111

  %Own   %Total  OwnTime  TotalTime  Function (filename)                                                                                                                                                                                                          
1048.90% 1048.90%   149.1s    149.1s   acquire (logging/__init__.py)
109.38% 109.38%   17.86s    17.86s   read (ssl.py)
 67.86%  67.86%   10.03s    10.03s   <genexpr> (dandi/download.py)
 22.95%  22.95%    3.56s     3.56s   get (interleave/__init__.py)
  7.19%   7.19%    1.33s     1.33s   flush (logging/__init__.py)
  6.59%   6.59%    1.13s     1.13s   _rmtree_safe_fd (shutil.py)
  5.99% 1249.30%   0.990s    178.5s   _download_file (dandi/download.py)
  5.79%   5.79%   0.920s    0.920s   __hash__ (enum.py)
  4.99%   8.78%   0.900s     1.74s   __init__ (logging/__init__.py)

with total times and %s seems to be multiple of some number of threads. If we add --nonblocking -- it becomes more reflective of actual time without multiple of threads, e.g.

*$> py-spy top -p 893263 --nonblocking --delay 5
18.06s behind in sampling, results may be inaccurate. Try reducing the sampling rate.
Collecting samples from '... dandi download -J 10:10 --path-type glob https://dandiarchive.org/dandiset/000026/draft/files?location=sub-I48%2Fses-SPIM%2Fmicr%2F*.zarr' (python v3.11.2)
Total Samples 1502, Error Rate 5.33% (Failed to copy PyCodeObject)
GIL: 51.50%, Active: 205.79%, Threads: 111

  %Own   %Total  OwnTime  TotalTime  Function (filename)                                                                                                                                                                                                          
 61.28%  61.28%    9.17s     9.17s   read (ssl.py)
 38.52%  38.52%    5.61s     5.61s   <genexpr> (dandi/download.py)
 29.34%  29.34%    4.52s     4.52s   get (interleave/__init__.py)
 22.36%  22.36%    2.94s     2.94s   acquire (logging/__init__.py)
  4.79% 122.95%   0.740s    17.78s   _download_file (dandi/download.py)
  3.19%   3.19%   0.520s    0.520s   __hash__ (enum.py)
  3.79%   3.79%   0.500s    0.500s   _rmtree_safe_fd (shutil.py)
  1.80%   1.80%   0.470s    0.470s   append (dandi/download.py)

and overall proportions making more sense to me.

NB that last execution of py-spy segfaulted and exited with exit code 139. dmesg said

[16533900.194708] traps: py-spy[1839777] general protection fault ip:7f6737d7aadf sp:7ffd42e1b240 error:0 in py-spy[7f6737c00000+53f000]

our dandi process it monitored seems continued normally.

$> py-spy --version
py-spy 0.4.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant