7.7.5. vNIC Failover

This subsection examines vNIC failover in more detail. If a vNIC adapter has exactly two vNIC backing devices, no real selection takes place: when the active vNIC backing device fails, the second one is necessarily activated. The situation is different if there are more than two vNIC backing devices. If the active vNIC backing device fails, at least two further vNIC backing devices remain, and the failover priority (attribute failover_priority) decides between them. The hypervisor selects the vNIC backing device that has the highest priority (lowest value).
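
The selection rule can be illustrated with a small Python model. This is only a sketch of the rule described above, not the hypervisor's implementation: the class BackingDevice, the function select_backing_device and the names vios-a, vios-b and vios-c are hypothetical, and the priority values are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackingDevice:
    # Hypothetical model of a vNIC backing device, for illustration only.
    vios_name: str
    failover_priority: int  # lower value = higher priority
    operational: bool = True


def select_backing_device(devices: List[BackingDevice]) -> Optional[BackingDevice]:
    """Return the operational device with the lowest failover_priority value."""
    candidates = [d for d in devices if d.operational]
    if not candidates:
        return None  # no functioning backing device is left
    # min() returns the first match on equal values; how equal priorities
    # are really handled is not covered by the text.
    return min(candidates, key=lambda d: d.failover_priority)


devices = [
    BackingDevice("vios-a", 50, operational=False),  # the active device has failed
    BackingDevice("vios-b", 60),
    BackingDevice("vios-c", 55),
]
chosen = select_backing_device(devices)
print(chosen.vios_name)  # vios-c
```

With only two backing devices, the list of candidates contains a single entry after a failure, which is why no real selection takes place in that case.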

To take a closer look at the vNIC failover in the general case, we add a third vNIC backing device to the vNIC adapter in slot 6 of the LPAR aix22. A third virtual I/O server ms03-vio3 is used for maximum redundancy:

$ lpar addvnicbkdev aix22 6 ms03-vio3 C4-T3 failover_priority=55
$

This means the vNIC adapter of LPAR aix22 has the following vNIC backing devices:

$ lpar lsvnic -a aix22
                           FAILOVER                                         PHYS  LOGICAL   CURRENT     MAX  
LPAR_NAME  SLOT  FAILOVER  PRIORITY  ACTV  STATUS       VIOS_NAME  ADAPTER  PORT  PORT      CAPACITY  CAPACITY
aix22      6     Yes       50        1     Operational  ms03-vio1  1        0     27004005  2.0       100.0
aix22      6     Yes       60        0     Operational  ms03-vio2  2        0     27008004  2.0       100.0
aix22      6     Yes       55        0     Operational  ms03-vio3  3        2     2700c00a  2.0       100.0
$

Figure 7.22 shows the interaction between the vNIC backing devices (vNIC server) and the POWER hypervisor. Each vNIC server monitors the status of its associated logical SR-IOV port. The status is reported to the hypervisor in the form of a heartbeat message at short intervals.

Figure 7.22: Interaction of vNIC backing devices and hypervisor for vNIC failover.

If the link of the left physical port fails in this configuration, the associated vNIC server on the first virtual I/O server notices this while monitoring the logical SR-IOV port and reports it to the hypervisor in a heartbeat message. Based on the failover priorities of the remaining functioning vNIC backing devices, the hypervisor then decides which vNIC backing device, with its associated logical port, is to be activated. In the case shown, the remaining vNIC backing devices have failover priorities 60 (ms03-vio2) and 55 (ms03-vio3). The vNIC backing device with the highest failover priority (lowest value), in this case the one on ms03-vio3 with priority 55, becomes the new active vNIC backing device. The hypervisor deactivates the previously used vNIC backing device, activates the new one and reports the vNIC backing device now to be used to the vNIC client adapter of the LPAR.

The status of the vNIC backing devices is then as follows:

$ lpar lsvnic -a aix22
                           FAILOVER                                         PHYS  LOGICAL   CURRENT     MAX  
LPAR_NAME  SLOT  FAILOVER  PRIORITY  ACTV  STATUS       VIOS_NAME  ADAPTER  PORT  PORT      CAPACITY  CAPACITY
aix22      6     Yes       50        0     Link Down    ms03-vio1  1        0     27004005  2.0       100.0
aix22      6     Yes       60        0     Operational  ms03-vio2  2        0     27008004  2.0       100.0
aix22      6     Yes       55        1     Operational  ms03-vio3  3        2     2700c00a  2.0       100.0
$

The third vNIC backing device with failover priority 55 is now active (column ACTV), and the originally active vNIC backing device now has the status “Link Down”. The vNIC server continues to monitor the failed logical port. This ensures that the vNIC server recognizes when the logical port is available again and reports this to the hypervisor accordingly.
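
For scripting purposes, the active and the failed vNIC backing device could be extracted from a listing like the one above. The following Python sketch is only one possible approach. It assumes that the columns are separated by at least two spaces and appear in the order shown in the listing; the function name parse_rows and the field names are hypothetical, and the two header lines are left out of the sample text.

```python
import re

# Data rows from the listing above (the two header lines are omitted).
SAMPLE = """\
aix22  6     Yes       50        0     Link Down    ms03-vio1  1        0     27004005  2.0       100.0
aix22  6     Yes       60        0     Operational  ms03-vio2  2        0     27008004  2.0       100.0
aix22  6     Yes       55        1     Operational  ms03-vio3  3        2     2700c00a  2.0       100.0
"""

# Hypothetical field names, following the column order of the listing.
FIELDS = ["lpar_name", "slot", "failover", "failover_priority", "actv",
          "status", "vios_name", "adapter", "phys_port", "logical_port",
          "current_capacity", "max_capacity"]


def parse_rows(text):
    """Split each data row on runs of two or more spaces."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # "Link Down" contains a single space and therefore stays one field.
        values = re.split(r"\s{2,}", line.strip())
        if len(values) != len(FIELDS):
            continue  # skip lines with an unexpected number of fields
        rows.append(dict(zip(FIELDS, values)))
    return rows


rows = parse_rows(SAMPLE)
active = [r for r in rows if r["actv"] == "1"]
failed = [r for r in rows if r["status"] != "Operational"]
print(active[0]["vios_name"], active[0]["failover_priority"])  # ms03-vio3 55
print(failed[0]["vios_name"], failed[0]["status"])  # ms03-vio1 Link Down
```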

When the link of the physical port comes up again, the status of the associated vNIC backing device changes back to “Operational”. The question then arises as to whether or not to switch back to the vNIC backing device that is now available again. This can be configured via the attribute auto_priority_failover. The attribute has two possible values:

    • 0: automatic priority failover is disabled.
    • 1: automatic priority failover is enabled.

If the auto_priority_failover attribute is set to 1, the vNIC backing device with the highest priority is always used. Since in our case the original vNIC backing device with a value of 50 has the highest failover priority, it will be reactivated immediately when it becomes available again:

$ lpar lsvnic -a aix22
                           FAILOVER                                         PHYS  LOGICAL   CURRENT     MAX  
LPAR_NAME  SLOT  FAILOVER  PRIORITY  ACTV  STATUS       VIOS_NAME  ADAPTER  PORT  PORT      CAPACITY  CAPACITY
aix22      6     Yes       50        1     Operational  ms03-vio1  1        0     27004005  2.0       100.0
aix22      6     Yes       60        0     Operational  ms03-vio2  2        0     27008004  2.0       100.0
aix22      6     Yes       55        0     Operational  ms03-vio3  3        2     2700c00a  2.0       100.0
$

Automatic priority failover does not apply only in this “failback” situation, but in general. Whenever a functioning vNIC backing device has a higher failover priority than the currently active one, a failover to this vNIC backing device is carried out immediately. This can occur in the following situations:

    • A further vNIC backing device with a higher failover priority than the active vNIC backing device is added to the vNIC adapter.
    • The failover priority of an inactive vNIC backing device is changed, giving it a higher priority than the currently active vNIC backing device.
    • The failover priority of the active vNIC backing device is changed to a lower priority, which means that it no longer has the highest priority.
    • The active vNIC backing device (with the highest failover priority) fails.
    • A failed vNIC backing device with a higher failover priority than the active vNIC backing device becomes available again (link is up again).

Automatic priority failover means that the vNIC backing device with the highest priority is always active.
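
The effect of the attribute auto_priority_failover can be sketched in Python as follows. This is a simplified model, not the hypervisor's actual logic. The function active_after_event and its parameters are hypothetical, and the behaviour for the value 0 (staying on the current vNIC backing device as long as it is functioning) is an assumption derived from the description above.

```python
from typing import Dict, Optional


def active_after_event(priorities: Dict[str, int],
                       operational: Dict[str, bool],
                       current: Optional[str],
                       auto_priority_failover: int) -> Optional[str]:
    """Return the name of the backing device that is active after an event.

    priorities:  failover_priority per device (lower value = higher priority)
    operational: whether each device is currently functioning
    current:     the device that was active before the event
    """
    usable = [name for name in priorities if operational[name]]
    if not usable:
        return None  # no functioning backing device is left
    best = min(usable, key=lambda name: priorities[name])
    if current in usable and auto_priority_failover == 0:
        return current  # assumption: no switch while the active device works
    return best  # failover to the functioning device with the highest priority


prio = {"ms03-vio1": 50, "ms03-vio2": 60, "ms03-vio3": 55}
all_up = {"ms03-vio1": True, "ms03-vio2": True, "ms03-vio3": True}

# ms03-vio1 has become available again while ms03-vio3 is active.
print(active_after_event(prio, all_up, "ms03-vio3", 1))  # ms03-vio1
print(active_after_event(prio, all_up, "ms03-vio3", 0))  # ms03-vio3

# The active device ms03-vio1 fails: a failover takes place in both settings.
vio1_down = {"ms03-vio1": False, "ms03-vio2": True, "ms03-vio3": True}
print(active_after_event(prio, vio1_down, "ms03-vio1", 0))  # ms03-vio3
```

In this model, the five situations listed above correspond to changes in the priorities or in the operational state, followed by a re-evaluation with auto_priority_failover set to 1.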