Failover
Failover for networking devices
Current status (as of January 2019): The guest driver model is described in the Linux kernel documentation, Documentation/networking/net_failover.rst.
The hypervisor part is still an open problem, especially how to unplug and re-plug the SR-IOV device while minimizing downtime and packet loss during a migration.
Problems/Questions
- Packet loss due to early MAC filter update
- Some NIC drivers update the MAC filter as soon as a VF is created, before the VF driver is loaded in the guest and the VF device is ready. As a result, packets go to the PF rather than to the standby (virtio) device until the guest is up and the VF driver is loaded.
- Todo: create a tool to test whether a NIC driver behaves like this. Idea: test on the host, without a VM involved. Check where packets go when a MAC address is added to a VF on a VLAN and the same MAC address is added to the PF. The PF should not be in promiscuous mode(?). Packets can be generated and sent with the 'mausezahn' tool from netsniff-ng, and running 'netsniff-ng' on the PF and VF devices shows where they end up. Status: set up tools and environment to test (2019-01-16)
- What to do with the results of this test? 1. Can we add a flag to the device to mark it as not usable for failover? Where should the flag go?
- How to support hotplug of a primary/SR-IOV device for failover? The guest is already up and the SR-IOV device is hotplugged in the hypervisor. The device shows up in the guest. How can we make it the primary device in a failover/standby/primary trio?
- How to involve the management layer in the migration process? Oracle sent patches, with the rationale of making it work with old NICs as well. The idea is to send events for bus master enable/disable.
- Can it be made race free? Probably yes, but we'd need to stop vcpu.
- Can we use switchdev to program FDBs on the NIC and redirect traffic from the PF to the VF? Can we avoid the need to stop vcpu with this? Which commands should be used to program FDB entries offloaded to the NIC (bridge?)
- Mechanisms for PCI device removal
- PCI surprise removal (as defined in the PCI(e) spec). It might be buggy in some Linux drivers, but fixes are welcome; surprise removal is expected to work in general. What about EMI (electromechanical interlock) support in Linux/QEMU?
- Ordered removal, with guest cooperation. The hardware has an 'attention' button: an interrupt is sent to the guest and the guest ejects the device. What needs to be done for that? Probably only possible with the Q35 chipset. To press the attention button in QEMU: 'pcie_abp 0'
- PCI overview (QEMU): https://wiki.qemu.org/images/f/f6/PCIvsPCIe.pdf
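
The host-only NIC driver test proposed under "Packet loss due to early MAC filter update" could start from something like the following Python sketch. It only builds the command lines and classifies the result; it does not run anything. All interface names, the `Setup` class and the helper functions are hypothetical. The `mausezahn` and `netsniff-ng` options shown are the ones believed to apply here and should be checked against the installed man pages. How the same MAC address gets added to the PF is left out, since the notes above do not specify it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Setup:
    pf: str            # PF netdev on the host (hypothetical name)
    vf_netdev: str     # VF netdev on the host (hypothetical name)
    vf_index: int      # VF number under the PF
    mac: str           # MAC assigned to the VF and also added to the PF
    vlan: int          # VLAN the VF is placed on
    sender: str        # interface the test frames are sent from (assumption)
    count: int = 100   # number of test frames


def setup_commands(s: Setup) -> List[List[str]]:
    """iproute2 commands that give the VF its MAC and VLAN."""
    vf = ["ip", "link", "set", "dev", s.pf, "vf", str(s.vf_index)]
    return [
        vf + ["mac", s.mac],
        vf + ["vlan", str(s.vlan)],
        # The PF should not be promiscuous, or it sees every frame anyway.
        ["ip", "link", "set", "dev", s.pf, "promisc", "off"],
        # Adding the same MAC to the PF is omitted: the method is still open.
    ]


def traffic_command(s: Setup) -> List[str]:
    """mausezahn call sending `count` UDP frames to the shared MAC."""
    return ["mausezahn", s.sender, "-c", str(s.count), "-b", s.mac, "-t", "udp"]


def capture_command(dev: str, pcap: str) -> List[str]:
    """netsniff-ng call capturing from `dev` into a pcap file."""
    return ["netsniff-ng", "--in", dev, "--out", pcap, "--silent"]


def where_packets_landed(pf_seen: int, vf_seen: int) -> str:
    """Summarize the capture counts from the PF and VF devices."""
    if pf_seen and vf_seen:
        return "both"
    if pf_seen:
        return "pf"
    if vf_seen:
        return "vf"
    return "none"


if __name__ == "__main__":
    s = Setup("enp3s0f0", "enp3s0f0v0", 0, "52:54:00:12:34:56", 100, "enp4s0")
    for cmd in setup_commands(s) + [traffic_command(s)]:
        print(" ".join(cmd))
```

The frame counts passed to `where_packets_landed` would come from the two captures, one on the PF and one on the VF. Repeating the run with and without a driver bound to the VF would show whether the MAC filter is already steering traffic before the VF is ready.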