Question

I'm doing event-based sampling with the perf userland tool. The objective is to find out where certain performance-impacting events, such as branch misses and cache misses, occur in a larger system I'm working on.

Now, something like

perf record -a -e branch-misses:pp -- sleep 5

works perfectly: the PEBS counting mode triggered by the 'pp' modifier is really accurate when collecting the IP in the samples.

Unfortunately, when I try to do the same for cache-misses, i.e.

perf record -a -e cache-misses:pp -- sleep 5 # [1]

I get

Error: sys_perf_event_open() syscall returned with 22 (Invalid argument). /bin/dmesg may provide additional information.

Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?

dmesg | grep "perf\|pmu" shows nothing useful AFAICT. I'm also pretty sure that the kernel was compiled with CONFIG_PERF_EVENTS=y, because both the branch-misses:pp command above and

perf record -a -e cache-misses -- sleep 5 # [2]

work: the problem with [2] being that the collected samples are not very accurate, which hurts my profiles.

Any hints on what could be going on here?


Solution

It turns out that the specific hardware event the generic cache-misses alias maps to does not support PEBS, which is why requesting it with the pp modifier is rejected with "Invalid argument". An alternative is to use one of the events that PEBS does support (see the list of PEBS-capable events for the Nehalem architecture), with an appropriate unit mask to narrow it down. Specifically, one could use MEM_LOAD_RETIRED:LLC_MISS, although that event does not seem to be accurate in all cases.
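As a rough sketch of what that could look like on the command line: perf accepts raw hardware events in the form r<umask><event> (both in hex), and the pp modifier can be appended as before. The helper name raw_event is made up for illustration, and the encoding used here (event select 0xCB, unit mask 0x10 for MEM_LOAD_RETIRED:LLC_MISS on Nehalem) is an assumption that should be checked against Intel's event tables for the CPU in question.

```shell
# Hypothetical helper: build a perf raw event descriptor r<umask><event>.
#   $1 = event select code, $2 = unit mask (both accepted as 0x.. hex)
raw_event() {
    printf 'r%02x%02x' "$2" "$1"
}

# Assumed Nehalem encoding for MEM_LOAD_RETIRED:LLC_MISS (verify for your CPU).
# The :pp suffix requests precise (PEBS) sampling, as in the question.
EVENT="$(raw_event 0xcb 0x10):pp"

# Print the command instead of running it, so it can be inspected first.
echo "perf record -a -e $EVENT -- sleep 5"
```

If the printed command still fails with "Invalid argument", the event is probably not PEBS-capable on that machine, or the encoding differs for that microarchitecture.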

Licensed under: CC-BY-SA with attribution