Guest PMU
Latest revision as of 09:08, 31 May 2015
Guest PMU (Performance Monitoring Unit) currently exists in the form of an out-of-tree patchset.
See the KVM Forum 2011 presentation (http://www.linux-kvm.org/wiki/images/6/6d/Kvm-forum-2011-performance-monitoring.pdf) about the state as of August 2011.
Current status
Version 2 Architectural PMU on Intel and AMD hosts is implemented and works.
TODO
Guest visible features
- Implement Version 3 Architectural PMU
- PEBS - Precise Event Based Sampling - allows examining program state (problematic)
- BTS - Branch Trace Store - allows tracing program execution accurately (problematic)
Accuracy
- Do not pin perf_events. Schedule them as a group and scale the performance counter values.
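The scaling step can be illustrated with the extrapolation that perf applies to events that were scheduled for only part of the measurement window: the raw count is multiplied by the ratio of time enabled to time running. This is a minimal sketch; the function name `scale_count` and its parameters are hypothetical and not part of the patchset.

```python
def scale_count(raw_count: int, time_enabled: int, time_running: int) -> float:
    """Extrapolate a counter that was on the hardware only part of the time.

    time_enabled: how long the event was enabled.
    time_running: how long it was actually counting on the PMU.
    Both must use the same unit (for example nanoseconds).
    """
    if time_running == 0:
        # The event never reached the hardware, so there is nothing to scale.
        return 0.0
    return raw_count * time_enabled / time_running


if __name__ == "__main__":
    # The counter was scheduled for half of a 200-unit window.
    print(scale_count(1000, 200, 100))
```

The scaled value is an estimate: it assumes the event rate while the counter was descheduled matched the rate while it was running.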
Performance
- Let a guest control MSR_CORE_PERF_GLOBAL_CTRL directly if no event on the host is monitoring the guest.
- Update the perf_event subsystem to make use of the PERF_GLOBAL_ENABLE MSR to speed up context switching, on both guest and host
- Check whether perf_event does unnecessary RMW operations on MSRs, which are significantly slow in a guest
- Add a paravirt batch MSR read/write facility, update perf to use it when available
- Change perf to use an ordinary interrupt instead of NMI when profiling only user space, or only a guest (reduces work in NMI context)
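To make the read-modify-write item concrete, here is a toy model (not kernel code) that counts register accesses. It assumes each MSR access from a guest is expensive, and shows how keeping a shadow copy of the last written value removes the read from every update. The classes `CountingMsr` and `ShadowedMsr` are hypothetical names invented for this illustration.

```python
class CountingMsr:
    """Stand-in for a single MSR that records how often it is accessed."""

    def __init__(self, value: int = 0) -> None:
        self.value = value
        self.reads = 0
        self.writes = 0

    def read(self) -> int:
        self.reads += 1
        return self.value

    def write(self, value: int) -> None:
        self.writes += 1
        self.value = value


def set_bits_rmw(msr: CountingMsr, mask: int) -> None:
    """Plain read-modify-write: one read and one write per update."""
    msr.write(msr.read() | mask)


class ShadowedMsr:
    """Keeps a software copy so that updates need only a write."""

    def __init__(self, msr: CountingMsr) -> None:
        self.msr = msr
        self.shadow = msr.read()  # single read at setup time

    def set_bits(self, mask: int) -> None:
        self.shadow |= mask
        self.msr.write(self.shadow)


if __name__ == "__main__":
    plain = CountingMsr()
    for bit in range(4):
        set_bits_rmw(plain, 1 << bit)

    backing = CountingMsr()
    shadowed = ShadowedMsr(backing)
    for bit in range(4):
        shadowed.set_bits(1 << bit)

    print("rmw:   ", plain.reads, "reads,", plain.writes, "writes")
    print("shadow:", backing.reads, "reads,", backing.writes, "writes")
```

The shadow is only valid while nothing else modifies the register, which is why the item above asks for an audit rather than a blanket change.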
Caveats
- The NMI watchdog on a host reduces the number of performance counters available to a guest. The guest is not aware of this, so if it tries to use all performance counters, one of them will fail. It is better to disable the NMI watchdog on the host.
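A host-side check for this caveat might look like the sketch below. It reads the standard Linux sysctl file /proc/sys/kernel/nmi_watchdog. The helper names are hypothetical, and the "one counter" figure is an assumption taken from the caveat above, not something measured.

```python
from pathlib import Path

NMI_WATCHDOG_PATH = Path("/proc/sys/kernel/nmi_watchdog")


def nmi_watchdog_enabled(path: Path = NMI_WATCHDOG_PATH) -> bool:
    """Return True if the sysctl file reports that the NMI watchdog is on."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # File missing or unreadable: treat the watchdog as not enabled.
        return False


def counters_left_for_guest(host_counters: int, watchdog_on: bool) -> int:
    """Toy model: assume the watchdog occupies one counter while it runs."""
    return max(host_counters - (1 if watchdog_on else 0), 0)


if __name__ == "__main__":
    on = nmi_watchdog_enabled()
    print("NMI watchdog enabled:", on)
    # 4 is only an example counter count, not a value read from hardware.
    print("counters left for a guest:", counters_left_for_guest(4, on))
```

On a running host the watchdog can be switched off by writing 0 to the same file as root.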
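Git repositories
- kernel: git://github.com/avikivity/kvm.git
- QEMU: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git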
Putting it all together
Compile the kernel and QEMU from the repositories above. Reboot into the new kernel. Make sure that the NMI watchdog is disabled on the host. Run QEMU with "-cpu host". Run perf inside the guest.
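The QEMU step can be sketched as a small helper that assembles the command line. Only "-cpu host" comes from the text above. The binary name qemu-system-x86_64, the memory size, and the disk image path guest.img are assumptions for illustration and will differ per setup.

```python
import shlex
from typing import List


def qemu_command(disk_image: str, memory_mb: int = 2048) -> List[str]:
    """Build a QEMU argument list that passes the host CPU model to the guest."""
    return [
        "qemu-system-x86_64",   # assumed binary name; depends on the build
        "-enable-kvm",          # use KVM acceleration
        "-cpu", "host",         # the option named in the text above
        "-m", str(memory_mb),
        "-drive", f"file={disk_image},format=raw",  # hypothetical raw image
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(qemu_command("guest.img")))
```

Once the guest has booted, a command such as `perf stat` run inside it exercises the guest PMU.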