VhostNet: Difference between revisions

From KVM
vhost-net: a kernel-level virtio-net server
#REDIRECT [[UsingVhost]]
 
 
== What is vhost-net ==
 
vhost is a kernel-level backend for virtio.
The main motivation for vhost is to reduce virtualization
overhead for virtio by removing system calls from the data path,
without requiring guest changes. For virtio-net, this removes up to
4 system calls per packet: vm exit for kick, reentry for kick,
iothread wakeup for packet, interrupt injection for packet.
 
vhost is as minimal as possible. It relies on userspace for
all setup work.
 
=== Status ===
vhost is fully functional, and it already shows
a performance improvement over userspace virtio.
 
 
== How to use ==
 
Download, build and install the kernel and userspace from:
 
kernel:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost
userspace:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git vhost
 
=== Usage instructions: ===
 
vhost currently requires MSI-X support in guest virtio.
This means the guest kernel version should be 2.6.31 or later.
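
The kernel version requirement can be checked from inside the guest before enabling vhost. The following is a minimal sketch and not part of the vhost tree: the helper name version_at_least is made up for illustration, and it assumes a sort that supports the -V (version sort) option, as in GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if version $1 is at least $2.
# Anything after the leading numeric part (e.g. "-5-amd64") is ignored.
version_at_least() {
    have=$(printf '%s\n' "$1" | sed 's/[^0-9.].*$//')
    want=$2
    # sort -V orders version strings; the lowest one is printed first.
    # Assumption: sort supports -V (GNU coreutils does).
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# Run this inside the guest: uname -r reports the running kernel release.
if version_at_least "$(uname -r)" 2.6.31; then
    echo "guest kernel is 2.6.31 or later"
else
    echo "guest kernel is older than 2.6.31, vhost is not expected to work"
fi
```

This only checks the version number. A distribution kernel with backported or disabled virtio MSI-X support could behave differently.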
 
To enable vhost, add the ",vhost" flag to the nic options.
Example with tap backend:
 
qemu-system-x86_64 -m 1G disk-c.qcow2 \
-net tap,ifname=msttap0,script=/home/mst/ifup,downscript=no \
-net nic,model=virtio,'''vhost'''
 
Example with raw socket backend:
 
ifconfig eth3 promisc
qemu-system-x86_64 -m 1G disk-c.qcow2 \
-net raw,ifname=eth3 \
-net nic,model=virtio,'''vhost'''
 
Note: in raw socket mode, when binding to a physical
ethernet device, host to guest communication
will only work if your device is connected to a bridge
configured to mirror outgoing packets back to the originating link.
If you do not know whether this is the case, it most likely
is not. Use another box to access the guest, or use tap.
 
 
== Limitations ==
* vhost currently requires MSI-X support in guest virtio. This means the guest kernel version should be 2.6.31 or later.
* with raw sockets, host to guest and guest to guest communication on the same host does not always work. Use bridge+tap if you need that.
* driver unloading in the guest and device hot-unplug are broken, because the relevant code in qemu is stubbed out. They still need to be implemented.
 
== Performance ==
 
Performance is still being tuned, especially guest to host.
Here are some numbers courtesy of Shirley Ma:
 
* netperf TCP_STREAM, default setup, 60-second run
  guest->host increases from 3XXXMb/s to 5XXXMb/s
  host->guest increases from 3XXXMb/s to 4XXXMb/s
 
* netperf TCP_RR, 60-second run
  guest->host trans/s increases from 2XXX/s to 13XXX/s
  host->guest trans/s increases from 2XXX/s to 13XXX/s
 
== TODOs ==
 
=== vhost-net driver projects ===
* profiling would be very helpful; I have not done any yet.
* merged buffers.
* scalability tuning: figure out the best threading model to use.
 
=== qemu projects ===
* migration support
* level triggered interrupts
* driver unloading/hotplug
* general cleanup and upstreaming
* upstream the support for injecting interrupts from the kernel, from qemu-kvm.git to qemu.git (this is a vhost dependency: without it, vhost either cannot be upstreamed or can be upstreamed but brings no real benefit)
 
=== virtio projects ===
* improve small packet/large buffer performance: support "reposting" buffers, pool for indirect buffers
* guest kernel 2.6.31 seems to work well. Under certain workloads, virtio performance has regressed with guest kernels 2.6.32 and up (but is still better than userspace). A patch has been posted: http://www.spinics.net/lists/netdev/msg115292.html
 
=== projects involving other kernel components and/or networking stack ===
* rx mac filtering in tun
* extend raw sockets to support GSO/checksum offloading, and teach vhost to use that capability [one way to do this: virtio net header support]; will allow working with e.g. macvlan
* improve locking: e.g. RX/TX poll should not need a lock
* multicast IGMP snooping in bridge
 
 
=== long term projects ===
* kvm eventfd support for injecting level interrupts
* multiqueue (involves all of vhost, qemu, virtio, networking stack)
* zero copy tx for tun/raw sockets
 
=== Other ===
* More testing is always good
 
== Short term plans for MST ==
 
* get vhost-net merged into Linux kernel 2.6.33
* address most vhost qemu TODOs
* get vhost support merged in upstream qemu

Latest revision as of 04:21, 20 September 2010
