8.4. Error Report

The error report is the central collection point for error messages on AIX and therefore also on virtual I/O servers. All errors that the operating system detects are logged by the error daemon (errdemon) and can be inspected by the administrator at any time. The “vios errlog” command displays messages from the error report. If only a virtual I/O server is specified, a summary of all messages on that virtual I/O server is shown:

$ vios errlog ms13-vio1
IDENTIFIER  TIMESTAMP   TYPE  CLASS  RESOURCE_NAME  DESCRIPTION
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
DC73C03A    0531051421  T     S      fscsi0         SOFTWARE PROGRAM ERROR
8C577CB6    0521111321  I     S      vnicserver0    VNIC Transport Event
60D73419    0521101121  I     S      vnicserver0    VNIC Client Login
E48A73A4    0521092321  I     H      ent45          BECOME PRIMARY
E15C5EAD    0520131421  T     H      ent37          Physical link up
F596EFAC    0520083421  T     H      ent37          Physical link down
E87EF1BE    0517150021  P     O      dumpcheck      The largest dump device is too small.
8D424E06    0509095621  I     H      ent31          ADAPTER FAILURE
AA8AB241    0507075921  T     O      OPERATOR       OPERATOR NOTIFICATION
F31FFAC3    0321142821  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321142321  P     H      hdisk3         PATH HAS FAILED
D5676F6F    0321142221  T     H      fscsi4         ATTACHED SCSI TARGET DEVICE ERROR
B8C78C08    0319122621  I     H      ent7           SEA HA PARTNER LOST
A6D1BD62    0319122221  I     H      unspecified    Firmware Event
C62E1EB7    0314103021  P     H      hdisk4         DISK OPERATION ERROR
37F3CC40    0219145721  P     U      RMCdaemon      RSCT has detected that system time has m
06DE59EC    1117194020  I     U      vhost0         Logging an informational error for VIO s

$

(Note: Many messages have been omitted from the output in order to show as many different types of error message as possible in a few lines.)

One error message is displayed per line, together with the most important information: the time stamp (TIMESTAMP), the type and the class. The affected resource and a brief description are also displayed. The most recent message is always at the top. The number of messages shown can be restricted using the ‘-n’ (number) option:

$ vios errlog -n 5 ms13-vio1
IDENTIFIER  TIMESTAMP   TYPE  CLASS  RESOURCE_NAME  DESCRIPTION
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
$
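
The TIMESTAMP column uses the compact AIX form MMDDhhmmYY, so 0531052421 stands for 31 May 2021, 05:24, which matches the Date/Time field in the detailed output of the same message. For scripted evaluation, a summary can be split into its fields roughly as in the following Python sketch. It assumes that the output of “vios errlog” has already been captured as text; the names ErrlogEntry and parse_summary are hypothetical and not part of the LPAR tool.

```python
import re
from datetime import datetime
from typing import NamedTuple

# One summary line: 8 hex digits, 10-digit time stamp, type, class,
# resource name, and the description as the rest of the line.
SUMMARY_LINE = re.compile(
    r"^([0-9A-F]{8})\s+(\d{10})\s+(\S)\s+(\S)\s+(\S+)\s+(.*)$"
)


class ErrlogEntry(NamedTuple):  # hypothetical record type
    identifier: str
    timestamp: datetime
    err_type: str
    err_class: str
    resource_name: str
    description: str


def parse_summary(text: str) -> list[ErrlogEntry]:
    """Turn captured summary output into one record per message."""
    entries = []
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if not match:
            continue  # header, prompt or blank line
        ident, stamp, typ, cls, resource, description = match.groups()
        # TIMESTAMP is MMDDhhmmYY, e.g. 0531052421 = 31 May 2021, 05:24
        when = datetime.strptime(stamp, "%m%d%H%M%y")
        entries.append(ErrlogEntry(ident, when, typ, cls, resource, description))
    return entries


sample = """\
IDENTIFIER  TIMESTAMP   TYPE  CLASS  RESOURCE_NAME  DESCRIPTION
4B436A3D    0531052421  T     H      fscsi0         LINK ERROR
DC73C03A    0531051421  T     S      fscsi0         SOFTWARE PROGRAM ERROR
"""

for entry in parse_summary(sample):
    print(entry.timestamp.isoformat(), entry.resource_name, entry.description)
```

Lines that do not look like a message (the header, a shell prompt, blank lines) are skipped, so the captured text does not have to be cleaned up first.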

Details can be displayed with the option ‘-a’ (all information), in which case it is best to limit the number of messages displayed with the option ‘-n’. Otherwise the output can be extremely long:

$ vios errlog -n 1 -a ms13-vio1
---------------------------------------------------------------------------
LABEL:           FCP_ERR4
IDENTIFIER:      4B436A3D
 
Date/Time:       Mon May 31 05:24:00 2021
Sequence Number: 7342
Machine Id:      00CA09503A00
Node Id:         ms13-vio1
Class:           H
Type:            TEMP
WPAR:            Global
Resource Name:   fscsi0
Resource Class:  driver
Resource Type:   emfscsi
Location:        U78D3.001.VYR0AL4-P1-C2-T1
 
 
Description
LINK ERROR
 
            Recommended Actions
            PERFORM PROBLEM DETERMINATION PROCEDURES
 
Detail Data
SENSE DATA
0000 0020 0000 0327 0000 0000 0203 0101 1000 0010 9BB9 32E1 0000 0000 008C 8240
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
$

Each message has a unique sequence number, which is shown in the detailed output (7342 in the example above). This sequence number can be passed to “vios errlog” as an additional argument in order to select exactly that one message. Unfortunately, this is not very practical, because the sequence number is not included in the summary. (This limitation stems from the underlying command on the virtual I/O server.)
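
Since the sequence number appears only in the detailed output, one possible workaround is to capture the output of “vios errlog” with the option ‘-a’ and build an overview with sequence numbers from it. The following Python sketch assumes the “Key: value” layout and the dashed separator line shown in the example above; the function name sequence_overview is hypothetical and not part of the LPAR tool.

```python
def sequence_overview(detail_text: str) -> list[dict[str, str]]:
    """Collect selected "Key: value" header fields of each detailed message."""
    wanted = ("Sequence Number", "IDENTIFIER", "Date/Time", "Resource Name")
    records = []
    current = None
    for line in detail_text.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) == {"-"}:
            # A line consisting only of dashes starts a new message.
            current = {}
            records.append(current)
            continue
        if current is None:
            continue  # text before the first separator, e.g. a prompt
        # Split at the first colon only, so "05:24:00" stays intact.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted and key not in current:
            current[key] = value.strip()
    return records


sample_detail = """\
---------------------------------------------------------------------------
LABEL:           FCP_ERR4
IDENTIFIER:      4B436A3D

Date/Time:       Mon May 31 05:24:00 2021
Sequence Number: 7342
Node Id:         ms13-vio1
Resource Name:   fscsi0

Description
LINK ERROR
"""

for record in sequence_overview(sample_detail):
    print(record.get("Sequence Number"), record.get("Date/Time"),
          record.get("Resource Name"))
```

A sequence number found this way can then be given to “vios errlog” as the additional argument described above.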

If you want to select specific messages, the selection mechanism of the LPAR tool with the option ‘-s’ is recommended. Any criteria can be used here to select what is ultimately displayed. For example, it is relatively easy to list all messages about a certain resource, in this case hdisk3:

$ vios errlog -s resource_name=hdisk3 ms13-vio1
IDENTIFIER  TIMESTAMP   TYPE  CLASS  RESOURCE_NAME  DESCRIPTION
F31FFAC3    0321142821  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321142321  P     H      hdisk3         PATH HAS FAILED
F31FFAC3    0321142221  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321141621  P     H      hdisk3         PATH HAS FAILED
F31FFAC3    0321123121  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321122521  P     H      hdisk3         PATH HAS FAILED
F31FFAC3    0321122421  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321121521  P     H      hdisk3         PATH HAS FAILED
F31FFAC3    0321110221  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321104921  P     H      hdisk3         PATH HAS FAILED
F31FFAC3    0321092721  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321091321  P     H      hdisk3         PATH HAS FAILED
$
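
A listing like this alternates between “PATH HAS FAILED” and “PATH HAS RECOVERED”, so it can be used to estimate how long each path outage lasted. The following Python sketch works on captured summary text. It assumes that every recovery belongs to the most recent preceding failure of the same resource, which the summary alone cannot confirm for a disk with several paths; the function name path_outages is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Time stamp, resource name and description of one summary line.
LINE = re.compile(r"^[0-9A-F]{8}\s+(\d{10})\s+\S\s+\S\s+(\S+)\s+(.*)$")


def path_outages(summary: str) -> list[tuple[str, datetime, timedelta]]:
    """Return (resource, failure time, duration) per failed/recovered pair."""
    failed_at = {}
    outages = []
    # The summary lists the newest message first, so walk it in reverse.
    for line in reversed(summary.splitlines()):
        match = LINE.match(line.strip())
        if not match:
            continue  # header, prompt or blank line
        stamp, resource, description = match.groups()
        when = datetime.strptime(stamp, "%m%d%H%M%y")  # MMDDhhmmYY
        if description == "PATH HAS FAILED":
            failed_at[resource] = when
        elif description == "PATH HAS RECOVERED" and resource in failed_at:
            start = failed_at.pop(resource)
            outages.append((resource, start, when - start))
    return outages


sample_paths = """\
IDENTIFIER  TIMESTAMP   TYPE  CLASS  RESOURCE_NAME  DESCRIPTION
F31FFAC3    0321142821  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321142321  P     H      hdisk3         PATH HAS FAILED
F31FFAC3    0321142221  I     H      hdisk3         PATH HAS RECOVERED
DE3B8540    0321141621  P     H      hdisk3         PATH HAS FAILED
"""

for resource, start, duration in path_outages(sample_paths):
    print(resource, start.isoformat(), duration)
```

Because the time stamps in the summary have a resolution of one minute, the durations are only approximate.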

The error report is a circular log of limited size, so the oldest entries are eventually overwritten automatically.

Every administrator of a PowerVM environment should always keep an eye on the error reports of all virtual I/O servers.