The Overhead of Profiling using PMU Hardware Counters

Date published: Tuesday, 8 July 2014
Document type: CERN openlab report
Authors: A. Nowak, G. Bitzes
Run-time profiling of executable binaries can offer valuable insight into the performance characteristics and behaviour of a program. Some methods, such as instrumentation, are invasive and require modifying the profiled binary. This can significantly impact performance, to the point that an instrumented binary runs many times slower than the original. The Performance Monitoring Unit (PMU) found in many modern processors offers the possibility of low-overhead profiling through a plethora of performance events. In this report, we investigate and quantify this overhead for a variety of tests and configurations, using the “perf” tool of the Linux kernel. Results are included for four main usage modes of the PMU: counting, sampling, Precise Event-Based Sampling (PEBS) events, and Last Branch Record (LBR).
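As a rough illustration of what quantifying this overhead could look like, the sketch below times a workload with and without a "perf" prefix and reports the relative slowdown. It is not the report's methodology: the helper names (`PERF_MODES`, `wall_time`, `overhead_percent`, `measure`), the choice of the `cycles` event, and the sample period are assumptions made for illustration, and the `cycles:pp` and `-b` variants only map to PEBS and LBR on processors and kernels that support them.

```python
import subprocess
import sys
import time

# Illustrative perf prefixes, one per PMU usage mode named in the abstract.
# The event (cycles) and the sample period (-c 1000000) are assumptions,
# not values taken from the report.
PERF_MODES = {
    "counting": ["perf", "stat", "-e", "cycles", "--"],
    "sampling": ["perf", "record", "-e", "cycles", "-c", "1000000", "--"],
    "pebs": ["perf", "record", "-e", "cycles:pp", "-c", "1000000", "--"],
    "lbr": ["perf", "record", "-b", "-e", "cycles", "-c", "1000000", "--"],
}


def wall_time(cmd):
    """Run cmd once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def overhead_percent(baseline_s, profiled_s):
    """Relative slowdown of the profiled run, as a percentage of the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline runtime must be positive")
    return 100.0 * (profiled_s - baseline_s) / baseline_s


def measure(workload, modes=PERF_MODES):
    """Time the bare workload once, then once under each perf prefix."""
    baseline = wall_time(workload)
    return {name: overhead_percent(baseline, wall_time(prefix + workload))
            for name, prefix in modes.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python overhead.py ./my_benchmark --some-arg
    for mode, pct in measure(sys.argv[1:]).items():
        print(f"{mode}: {pct:+.2f}% overhead")
```

A single run per configuration is noisy; a more careful measurement would repeat each run several times and compare averages or medians, and would keep the machine otherwise idle.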