VhostNet
vhost-net: a kernel-level virtio-net server
What is vhost-net
vhost is a kernel-level backend for virtio. The main motivation for vhost is to reduce virtualization overhead for virtio by removing system calls from the data path, without guest changes. For virtio-net, this removes up to 4 system calls per packet: the VM exit for the kick, the re-entry after the kick, the iothread wakeup for the packet, and the interrupt injection for the packet.
vhost is as minimal as possible. It relies on userspace for all setup work.
Status
vhost is fully functional, and it already shows improvement over userspace virtio.
How to use
Download, build and install the kernel and userspace from:
kernel:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost
userspace:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git vhost
Usage instructions:
vhost currently requires MSI-X support in guest virtio. This means the guest kernel version should be >= 2.6.31.
To enable vhost, simply add ",vhost" flag to nic options. Example with tap backend:
qemu-system-x86_64 -m 1G disk-c.qcow2 \
    -net tap,ifname=msttap0,script=/home/mst/ifup,downscript=no \
    -net nic,model=virtio,vhost
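The script= option points at a user-supplied ifup script, which qemu invokes with the tap interface name as its first argument once the device exists. A minimal sketch of such a script, assuming a pre-existing bridge named br0 (the path /home/mst/ifup above and the name br0 here are placeholders for your own setup):

```shell
#!/bin/sh
# Hypothetical ifup script: $1 is the tap interface name passed by qemu.
# Bring the tap device up with no address and attach it to bridge br0.
/sbin/ifconfig "$1" 0.0.0.0 up
/usr/sbin/brctl addif br0 "$1"
```

With downscript=no as in the example above, no matching teardown script is run when the guest exits.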
Example with raw socket backend:
ifconfig eth3 promisc
qemu-system-x86_64 -m 1G disk-c.qcow2 \
    -net raw,ifname=eth3 \
    -net nic,model=virtio,vhost
Note: in raw socket mode, when binding to a physical ethernet device, host-to-guest communication will only work if your device is connected to a bridge configured to mirror outgoing packets back to the originating link. If you do not know whether this is the case, it most likely isn't. Use another box to access the guest, or use tap.
Limitations
- vhost currently requires MSI-X support in guest virtio. This means the guest kernel version should be >= 2.6.31.
- with raw sockets, host-to-guest and guest-to-guest communication on the same host do not always work. Use bridge+tap if you need that.
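For reference, a host-side bridge+tap setup might look like the following. This is a hedged sketch: it must be run as root, and br0, eth0, tap0 and the user name are placeholders; tunctl comes from uml-utilities, and newer systems can use the ip(8) equivalents noted in the comments.

```shell
# Create a bridge and enslave the physical NIC to it
brctl addbr br0
brctl addif br0 eth0

# Create a persistent tap device owned by the user running qemu
# (alternative: ip tuntap add dev tap0 mode tap user youruser)
tunctl -u youruser -t tap0
brctl addif br0 tap0

# Bring everything up
ifconfig br0 up
ifconfig tap0 up
```

The guest is then started with -net tap,ifname=tap0,script=no, and host-to-guest and guest-to-guest traffic flows through the bridge rather than the physical link.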
Performance
Performance is still being tuned, especially guest to host. Here are some numbers courtesy of Shirley Ma:
- netperf TCP_STREAM, default setup, 60 secs run
guest->host increases from 3XXX Mb/s to 5XXX Mb/s
host->guest increases from 3XXX Mb/s to 4XXXX Mb/s
- TCP_RR, 60 secs run
guest->host trans/s increases from 2XXX/s to 13XXX/s
host->guest trans/s increases from 2XXX/s to 13XXX/s
TODOs
vhost-net driver projects
- profiling would be very helpful; I have not done any yet.
- merged buffers.
- scalability tuning: figure out the best threading model to use.
qemu projects
- migration support
- level triggered interrupts
- driver unloading/hotplug
- general cleanup and upstreaming
- upstream support for injecting interrupts from the kernel, from qemu-kvm.git into qemu.git (this is a vhost dependency: without it vhost can't be upstreamed, or it can, but with no real benefit)
projects involving other kernel components and/or the networking stack
- extend raw sockets to support GSO/checksum offloading, and teach vhost to use that capability [one way to do this: virtio net header support]; this would allow vhost to work with e.g. macvlan
- improve locking: e.g. RX/TX poll should not need a lock
- kvm eventfd support for injecting level interrupts
long term projects
- multiqueue (involves all of vhost, qemu, virtio, networking stack)
Other
- More testing is always good