utziacrekernel-op8-rt-20240705
memlat: Optimize perf event reads when possible
We can skip the locking and other overhead of perf_event_read_value()
when we know in advance that the perf event in question can be read from
the current CPU. This occurs when either the perf event permits reads
from CPUs other than the one it's on, or when the CPU doing the reads is
the same CPU that owns the perf event.
Our PMU drivers only set two possible values for `readable_on_cpus`:
CPU_MASK_ALL or nothing. As such, we can simply check for CPU_MASK_ALL
beforehand in order to determine if the perf event allows non-local
reads.
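
The decision described above can be modelled in user space as a rough
sketch. Everything here is a hypothetical stand-in for illustration (the
`model_` names, the 8-CPU bitmask, the struct layout); it is not the
driver's actual code and only mirrors the two conditions stated in this
message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: 8 CPUs, one bit per CPU. Not the kernel's cpumask. */
#define MODEL_NR_CPUS 8
#define MODEL_CPU_MASK_ALL ((uint32_t)((1u << MODEL_NR_CPUS) - 1))

/* Hypothetical stand-in for the parts of a perf event that matter here. */
struct model_event {
	uint32_t readable_on_cpus; /* assumed: MODEL_CPU_MASK_ALL or 0 */
	int owner_cpu;             /* CPU the event lives on */
};

enum model_read_path {
	MODEL_READ_LOCAL_FAST,     /* readable from the current CPU */
	MODEL_READ_CROSS_CPU_SLOW, /* needs the locked, cross-CPU read */
};

/* Only two mask values are expected, so one comparison is enough. */
static bool model_any_cpu_readable(const struct model_event *ev)
{
	return ev->readable_on_cpus == MODEL_CPU_MASK_ALL;
}

/* Take the fast path if the event allows non-local reads or is local. */
static enum model_read_path
model_pick_read_path(const struct model_event *ev, int this_cpu)
{
	if (model_any_cpu_readable(ev) || ev->owner_cpu == this_cpu)
		return MODEL_READ_LOCAL_FAST;

	return MODEL_READ_CROSS_CPU_SLOW;
}
```

In this model the mask comparison can be done once per event ahead of
time, leaving only the owner-CPU comparison for the read path.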
We can also reduce the scope of under_scm_call() since we now know which
CPU we're reading a perf event from, thus reducing the false positive
rate of under_scm_call() as it is now per-CPU.
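
To illustrate why a per-CPU check has fewer false positives than a global
one, here is a small user-space sketch. The `scm_model_` names and the
flag array are assumptions made for this example and do not reflect how
under_scm_call() is implemented; what the caller does with the answer is
outside the sketch.

```c
#include <stdbool.h>

/* Hypothetical model: one "in SCM call" flag per CPU. */
#define SCM_MODEL_NR_CPUS 8

static bool scm_model_in_call[SCM_MODEL_NR_CPUS];

/* Global-style check: true if any CPU at all is in an SCM call. */
static bool scm_model_any_cpu_in_call(void)
{
	for (int cpu = 0; cpu < SCM_MODEL_NR_CPUS; cpu++) {
		if (scm_model_in_call[cpu])
			return true;
	}

	return false;
}

/* Per-CPU check: only the CPU that owns the event matters. */
static bool scm_model_cpu_in_call(int cpu)
{
	return scm_model_in_call[cpu];
}
```

With CPU 1 marked as being in a call, the global check reports true even
for an event on CPU 4, while the per-CPU check for CPU 4 reports false.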
Signed-off-by: Sultan Alsawaf <[email protected]>