
@pstats frequently reports 0 cpu cycles #50

Open
topolarity opened this issue Dec 11, 2024 · 4 comments

Comments

@topolarity
Member

julia> @pstats rand(1000,1000)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌ cpu-cycles               0.00e+00   47.4%  #  0.0 cycles per ns
│ stalled-cycles-frontend  0.00e+00   47.4%  #  NaN% of cycles
└ stalled-cycles-backend   0.00e+00   47.4%  #  NaN% of cycles
┌ instructions             1.50e+07   52.7%  #  Inf insns per cycle
│ branch-instructions      2.38e+05   52.7%  #  1.6% of insns
└ branch-misses            8.35e+01   52.7%  #  0.0% of branch insns
┌ task-clock               3.81e+06  100.0%  #  3.8 ms
│ context-switches         0.00e+00  100.0%
│ cpu-migrations           0.00e+00  100.0%
└ page-faults              0.00e+00  100.0%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

This happens for me roughly 20-50% of the time.

I'm not sure why it thinks it measured 47% of the running time but measured no cycles at all.

@Zentrik
Collaborator

Zentrik commented Dec 11, 2024

Maybe all your PMU counters are getting used up?

@topolarity
Member Author

That seems likely, but I'd still expect the effective running-time calculation to give us an indication (i.e. the 47.4% should have been 0.0%, and there should have been a warning).
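
For context, here's my understanding of where these numbers come from (a sketch of perf_event's time_enabled/time_running scaling convention; the function and variable names below are illustrative, not LinuxPerf.jl internals, and the inputs are made up to mimic the output above). The percentage column is time_running/time_enabled and the estimated count is raw * time_enabled/time_running, so the percentage is derived purely from the timers and can read 47.4% even when the raw count is 0:

# Sketch of perf_event multiplexing scaling; names and numbers are illustrative.
function scale_count(raw, time_enabled, time_running)
    # A group that was never scheduled should report 0 count and 0% coverage.
    time_running == 0 && return (0.0, 0.0)
    estimate = raw * time_enabled / time_running
    coverage = time_running / time_enabled
    return (estimate, coverage)
end

scale_count(0, 3_810_000, 1_806_000)   # => (0.0, 0.474...): zero cycles, yet 47.4% "running"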

@topolarity
Member Author

@Zentrik I'm worried this might be related to the usage of PR_TASK_PERF_EVENTS_DISABLE.

If I apply this diff:

diff --git a/src/LinuxPerf.jl b/src/LinuxPerf.jl
index 7326cb3..a5b1742 100644
--- a/src/LinuxPerf.jl
+++ b/src/LinuxPerf.jl
@@ -1138,9 +1138,9 @@ macro pstats(args...)
             @debug dump_groups(groups)
             bench = make_bench_threaded(groups, threads = $(opts.threads))
             try
-                enable_all!()
+                enable!(bench)
                 val = $(esc(expr))
-                disable_all!()
+                disable!(bench)
                 # trick the compiler not to eliminate the code
                 @static if isdefined(Base, :donotdelete)
                     Base.donotdelete(val)

then the problem appears to go away.

Maybe the global disable doesn't account for being in the middle of a measurement, so the timers become inaccurate?
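
For reference, a rough sketch of the two mechanisms being swapped in the diff above, assuming enable_all!/disable_all! go through prctl (as the PR_TASK_PERF_EVENTS_DISABLE mention suggests) and enable!/disable! go through ioctl on each group leader. None of the names below are LinuxPerf.jl API; the constants are the standard ones from <linux/prctl.h> and <linux/perf_event.h>:

# Constants from the Linux uapi headers (not LinuxPerf.jl API).
const PR_TASK_PERF_EVENTS_DISABLE = 31
const PR_TASK_PERF_EVENTS_ENABLE  = 32
const PERF_EVENT_IOC_ENABLE  = 0x2400   # _IO('$', 0)
const PERF_EVENT_IOC_DISABLE = 0x2401   # _IO('$', 1)
const PERF_IOC_FLAG_GROUP    = 1

# Global toggle: flips every perf counter attached to the calling task at once.
prctl_toggle(opt) = ccall(:prctl, Cint, (Cint, Culong, Culong, Culong, Culong),
                          opt, 0, 0, 0, 0)

# Per-group toggle: only touches the event group led by `leader_fd`.
ioctl_toggle(leader_fd, req) = ccall(:ioctl, Cint, (Cint, Culong, Culong),
                                     leader_fd, req, PERF_IOC_FLAG_GROUP)

# prctl_toggle(PR_TASK_PERF_EVENTS_DISABLE)        # what a global disable would do
# ioctl_toggle(leader_fd, PERF_EVENT_IOC_DISABLE)  # what a per-group disable would do

The global prctl toggle doesn't know which groups are mid-measurement, whereas the per-group ioctl only touches the counters the bench actually owns, which would be consistent with the diff above making the problem go away.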

@topolarity
Member Author

It's possible this is a kernel bug, so it'd be good to check more hardware and kernel versions to see whether newer kernels or different hardware are affected.
