This (long-delayed) post describes the original motivation for my PmcTools whole-system profiling toolkit, and touches on some of the possible next steps for the project.
Around the year 2000, I happened to read a paper titled “Continuous Profiling: Where Have All the Cycles Gone?”. The techniques described in the paper were inspiring, but the DEC Alpha™ systems they were implemented on were out of reach for a hobbyist living in India.
A FreeBSD™-equivalent of those tools and techniques, running on affordable hardware, seemed a good idea.
From the outset, my goal was to create a programming toolkit for using in-CPU performance counters:
- I wanted an API that would permit tools to fully use the features provided by the hardware.
- I wanted tools that had low overheads, in order not to disturb the behaviour of the system being measured.
- I wanted tools that were non-disruptive: usable without restarting running processes, rebooting the system, or recompiling programs.
- I wanted to analyse the “whole system” at once; i.e., to simultaneously analyse userspace applications, the top-half of the kernel, and the kernel’s interrupt handlers.
- I wanted the toolset to be SMP-ready, since SMP systems seemed likely to become affordable in the future.
When affordable systems using AMD Athlon™ CPUs (with publicly documented in-CPU performance counters) entered the Indian market in early 2003, I built myself a machine and started on the project.
- 2003: Initial work, which was managed using homebrew tools tuned for dialup speeds (shell scripts running RCS layered over CVS/CVSup).
- 2004: With the arrival of broadband access, development moved to FreeBSD’s Perforce™ server.
- 2005: The first check-in into the FreeBSD source tree, in April.
At the time of writing, PmcTools is being actively maintained and extended by the FreeBSD community.
Platforms, simplicity and portability are likely to be the focus of future work.
PmcTools would be useful on popular hobby platforms such as the BeagleBoard. PmcTools already supports ‘remote’ data collection on embedded systems. However, the specific PMCs on these systems would need to be supported.
Based on the experience gained so far, both the programming APIs and the implementation of PmcTools could be simplified without losing useful functionality.
PmcTools would be a useful addition to other open-source operating systems.
In addition to the above, many innovative tools can be created: in the paper “Exploiting hardware performance counters with flow and context sensitive profiling”, the authors show how to add PMC-based instrumentation to program binaries for fine-grained analyses. To be able to add such instrumentation, we need tools to parse and modify binary instruction streams—one of the motivations for the proposed libmc library, part of the Elftoolchain project.