Partitioned PMU Part 2
If you aren't already familiar with virtual machines, KVM, and the surrounding terminology, I suggest reading part 1 first.
Summary Restatement
For isolation reasons, KVM uses an entirely emulated PMU on ARM, and that emulation is slow. My feature reserves some of the counters for guests, which lets them bypass the emulation and use part of the hardware directly. That is a lot faster.
How KVM Emulates The PMU
I realized after finishing my last post that some additional explanation may be useful regarding how KVM "emulates" hardware such as the PMU. The whole reason KVM exists in the kernel in the first place is to let hardware directly run the guest software in a safe manner. That means that while the guest is running, guest code is directly executing on the CPU. Since the PMU is part of the CPU and controlled by its own set of registers, it might be logical to assume the guest could directly operate the PMU.
KVM could do it that way, but that would mean giving the whole PMU to the guest, which would prevent KVM from enforcing constraints on how guests use it. Some events, for example, are sensitive and could leak information about how the host operates into the guest.
What KVM does instead is trap PMU register accesses.
CPU Traps
Not all software running on a CPU runs with the same privileges. If it did, any userspace application could own the system by executing the right instructions. Instead there are different levels, called exception levels in ARM CPUs, that control which instructions the currently running software is allowed to execute. The most privileged software, like firmware and hypervisors, runs earliest in the boot process and sets up protections to prevent less privileged software from causing any harm.
One protection the more privileged software can configure is called trapping. When a trap is configured and the less privileged software executes the trapped instruction, the instruction does not execute. Instead an exception is generated that returns control to the more privileged level.
KVM sets up these traps for PMU register accesses. When a guest tries to access a PMU register, the CPU returns control to KVM instead, and KVM decides whether and how to serve the purpose the guest had in executing that instruction. As an example, certain PMU register accesses may trigger KVM to register an event with the host kernel's performance monitoring subsystem, which then shares PMU access between that event and events from other guests and the host.
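To make the trap flow described above concrete, here is a toy model in Python. It is only a sketch: the class names ToyCPU and Hypervisor are invented for illustration, and nothing here resembles real KVM code. PMCR_EL0 and TPIDR_EL0 are used purely as example labels for a trapped and an untrapped register.

```python
class Hypervisor:
    """Toy stand-in for KVM: configures traps and emulates trapped accesses."""

    def __init__(self, trapped_registers):
        self.trapped = set(trapped_registers)
        self.emulated_state = {}  # the guest's view of trapped registers
        self.trap_count = 0

    def handle_trap(self, register, value):
        # The hypervisor decides what the guest's write should mean.
        # Here it just records the value; the real hardware is never touched.
        self.trap_count += 1
        self.emulated_state[register] = value
        return "emulated"


class ToyCPU:
    """Toy CPU that checks the trap configuration on every register write."""

    def __init__(self, hypervisor):
        self.hypervisor = hypervisor
        self.hardware_registers = {}

    def guest_write(self, register, value):
        if register in self.hypervisor.trapped:
            # The instruction does not execute. Control goes to the
            # more privileged level instead.
            return self.hypervisor.handle_trap(register, value)
        # Untrapped: the write lands directly in hardware.
        self.hardware_registers[register] = value
        return "direct"


hyp = Hypervisor(trapped_registers={"PMCR_EL0"})
cpu = ToyCPU(hyp)
print(cpu.guest_write("PMCR_EL0", 1))   # trapped register
print(cpu.guest_write("TPIDR_EL0", 5))  # untrapped register
```

The point of the model is that the guest issues the same write in both cases. Only the trap configuration, which the guest cannot change, decides whether the hardware or the hypervisor ends up handling it.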
Effects Of Emulation
Traps turn what would be a single machine instruction on bare metal into hundreds or thousands of instructions while KVM figures out what to do about it. Emulation is secure, because KVM can choose to do whatever it wants with the guest instruction, but it is also slow.
New Features
Another reason for this KVM design was that in older versions of ARM CPUs, trapping PMU register access was an all-or-nothing proposition: you either trapped every PMU register or none of them. More recently, ARM introduced a feature called Fine Grain Traps (FGT) that allows software to trap and untrap many individual PMU registers.
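The difference between all-or-nothing trapping and fine-grained trapping can be pictured as the difference between a single switch and a bitmask. The sketch below is an illustration only: the bit names and their positions are made up and do not correspond to the real FGT register layout.

```python
# Hypothetical trap bits, one per group of PMU registers.
TRAP_CONTROL = 1 << 0        # PMU control registers
TRAP_EVENT_TYPE = 1 << 1     # event selection registers
TRAP_COUNTER_VALUE = 1 << 2  # the counter values themselves

# The old model is equivalent to every bit always being set together.
TRAP_ALL = TRAP_CONTROL | TRAP_EVENT_TYPE | TRAP_COUNTER_VALUE


def is_trapped(trap_mask, register_bit):
    """Return True if accesses to this register group trap."""
    return bool(trap_mask & register_bit)


def untrap(trap_mask, register_bit):
    """Return a new mask with one register group no longer trapped."""
    return trap_mask & ~register_bit


# Let the guest read and write counter values directly,
# but keep trapping everything else.
mask = untrap(TRAP_ALL, TRAP_COUNTER_VALUE)
print(is_trapped(mask, TRAP_COUNTER_VALUE))
print(is_trapped(mask, TRAP_CONTROL))
```

With a mask like this, the hypervisor can keep trapping the registers where it needs to enforce policy and stop paying the trap cost on the rest.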
This can be combined with another feature that can partition the counter registers into two sets. The first set is accessible only to the highly privileged host kernel, and the second set is accessible to the less privileged guest.
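As a rough picture of the partition, imagine the counters as a numbered range with a single split point. The sketch below assumes the guest gets the lower-numbered counters and the host keeps the rest. That ordering, the function names, and the counts in the example are all assumptions made for illustration.

```python
def partition_counters(total, guest_count):
    """Split counter indices 0..total-1 into a host set and a guest set."""
    if not 0 <= guest_count <= total:
        raise ValueError("guest_count must be between 0 and total")
    guest = list(range(guest_count))
    host = list(range(guest_count, total))
    return host, guest


def guest_access(counter, guest_counters):
    """Decide how a guest access to a given counter is handled."""
    # Counters in the guest's set can be used directly. Anything else
    # still goes through the hypervisor.
    return "direct" if counter in guest_counters else "trap"


host, guest = partition_counters(total=6, guest_count=4)
print(host, guest)
print(guest_access(2, guest))
print(guest_access(5, guest))
```

The two sets never overlap, so the host can keep measuring with its own counters while the guest uses the ones it was given.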
With these new features, KVM can safely allow guests some measure of direct access to the PMU and turn off the expensive traps.