| vhost-net: a kernel-level virtio-net server
| | #REDIRECT [[UsingVhost]] |
| | |
| | |
| == What is vhost-net ==
| |
| | |
| vhost is a kernel-level backend for virtio.
| |
| The main motivation for vhost is to reduce virtualization
| |
| overhead for virtio by removing system calls on the data path,
| |
| without guest changes. For virtio-net, this removes up to
| |
| 4 system calls per packet: vm exit for kick, reentry for kick,
| |
| iothread wakeup for packet, interrupt injection for packet.
| |
| | |
| vhost is as minimal as possible. It relies on userspace for
| |
| all setup work.
| |
| | |
| === Status ===
| |
| vhost is fully functional, and it already shows
| |
| improvement over userspace virtio.
| |
| | |
| | |
| == How to use ==
| |
| | |
| Download, build and install the kernel and userspace from:
| |
| | |
| kernel:
| |
| git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost
| |
| userspace:
| |
| git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git vhost
| |
| | |
| === Usage instructions: ===
| |
| | |
| vhost currently requires MSI-X support in guest virtio.
| |
| This means the guest kernel version should be >= 2.6.31.
| |
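The guest kernel requirement can be checked from inside the guest before enabling vhost. The following is a minimal sketch, not part of vhost itself: it assumes a GNU userland (`sort -V`), and the helper names `version_ge` and `guest_kernel_ok` are made up for illustration.

```shell
#!/bin/sh
# Minimum guest kernel version with MSI-X support in virtio (taken from the text).
MIN_KERNEL=2.6.31

# version_ge A B: succeeds when version A >= version B.
# Assumption: GNU sort with -V (version sort) is available in the guest.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# guest_kernel_ok RELEASE: strips a distro suffix such as "-5-amd64"
# and compares the remaining version against MIN_KERNEL.
guest_kernel_ok() {
    version_ge "${1%%-*}" "$MIN_KERNEL"
}

# Report on the kernel this script is running under.
if guest_kernel_ok "$(uname -r)"; then
    echo "kernel $(uname -r): new enough for vhost"
else
    echo "kernel $(uname -r): older than $MIN_KERNEL, vhost will not work"
fi
```

Run it inside the guest, not on the host; the host kernel requirement is covered by installing the vhost kernel tree listed above.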
| | |
| To enable vhost, simply add the ",vhost" flag to the nic options.
| |
| Example with tap backend:
| |
| | |
| qemu-system-x86_64 -m 1G disk-c.qcow2 \
| |
| -net tap,ifname=msttap0,script=/home/mst/ifup,downscript=no \
| |
| -net nic,model=virtio,'''vhost'''
| |
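Before starting qemu with the vhost flag, it can help to confirm that the host is actually running a vhost-enabled kernel. This is a rough sketch under one assumption that is not stated on this page: that the vhost kernel exposes its control device as `/dev/vhost-net`, as mainline Linux does. The helper name `have_chardev` is hypothetical.

```shell
#!/bin/sh
# Assumption: the vhost kernel exposes a character device at this path,
# as mainline Linux does. Override VHOST_DEV if your tree differs.
VHOST_DEV=${VHOST_DEV:-/dev/vhost-net}

# have_chardev PATH: succeeds when PATH exists and is a character device.
have_chardev() {
    [ -c "$1" ]
}

if have_chardev "$VHOST_DEV"; then
    echo "$VHOST_DEV present: host kernel appears to have vhost support"
else
    echo "$VHOST_DEV missing: boot the vhost kernel or load its module first"
fi
```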
| | |
| Example with raw socket backend:
| |
| | |
| ifconfig eth3 promisc
| |
| qemu-system-x86_64 -m 1G disk-c.qcow2 \
| |
| -net raw,ifname=eth3 \
| |
| -net nic,model=virtio,'''vhost'''
| |
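The raw socket example depends on the interface being in promiscuous mode. As a hedged sketch, the state can be verified by reading the interface flags word that Linux exposes in sysfs and testing the `IFF_PROMISC` bit (0x100 in `<linux/if.h>`). The function names `flags_promisc` and `iface_promisc` are illustrative only.

```shell
#!/bin/sh
# IFF_PROMISC bit in the interface flags word (value from <linux/if.h>).
IFF_PROMISC=0x100

# flags_promisc FLAGS: succeeds when the promiscuous bit is set in FLAGS.
flags_promisc() {
    [ $(( $1 & $IFF_PROMISC )) -ne 0 ]
}

# iface_promisc IFACE: reads the flags word for IFACE from sysfs
# and tests the promiscuous bit.
iface_promisc() {
    flags_promisc "$(cat "/sys/class/net/$1/flags")"
}
```

For the example above, `iface_promisc eth3 && echo "eth3 is promiscuous"` should print its message once `ifconfig eth3 promisc` has been run.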
| | |
| Note: in raw socket mode, when binding to a physical
| |
| ethernet device, host to guest communication
| |
| will only work if your device is connected to a bridge
| |
| configured to mirror outgoing packets back at the originating link.
| |
| If you do not know whether this is the case, this most likely
| |
| means it isn't. Use another box to access the guest, or use tap.
| |
| | |
| | |
| == Limitations ==
| |
| * vhost currently requires MSI-X support in guest virtio. This means the guest kernel version should be >= 2.6.31.
| |
| * with raw sockets, host to guest and guest to guest communication on the same host does not always work. Use bridge+tap if you need that.
| |
| | |
| | |
| == Performance ==
| |
| | |
| Performance is still being tuned, especially guest to host.
| |
| Here are some numbers courtesy of Shirley Ma:
| |
| | |
| * netperf TCP_STREAM, default setup, 60 secs run
| |
| guest->host drops from 3XXXMb/s to 1XXXMb/s (regression)
| |
| host->guest increases from 3XXXMb/s to 4XXXXMb/s
| |
| | |
| * TCP_RR, 60 secs run
| |
| guest->host trans/s increases from 2XXX/s to 13XXX/s
| |
| host->guest trans/s increases from 2XXX/s to 13XXX/s
| |
| | |
| | |
| == TODOs ==
| |
| * Fix guest to host performance regression - working on it now
| |
| | |
| === vhost-net driver projects ===
| |
| * profiling would be very helpful; I have not done any yet.
| |
| * vm exit mitigation for TX - working on it now.
| |
| * logging support with dirty page tracking in kernel - working on it now
| |
| * merged buffers.
| |
| | |
| === qemu projects ===
| |
| * migration support
| |
| * level triggered interrupts
| |
| * general cleanup and upstreaming
| |
| * upstream support for injecting interrupts from kernel,
| |
| from qemu-kvm.git to qemu.git
| |
| (this is a vhost dependency, without it vhost
| |
| can't be upstreamed, or it can, but without real benefit)
| |
| | |
| === projects involving other kernel components and/or networking stack ===
| |
| * extend raw sockets to support GSO/checksum offloading,
| |
| and teach vhost to use that capability
| |
| [one way to do this: virtio net header support]
| will allow working with e.g. macvlan
| |
| * improve locking: e.g. RX/TX poll should not need a lock
| |
| * kvm eventfd support for injecting level interrupts
| |
| | |
| === long term projects ===
| |
| * multiqueue (involves all of vhost, qemu, virtio,
| |
| networking stack)
| |
| | |
| | |
| === Other ===
| |
| * More testing is always good
| |