8.4. Error Report
The error report is the central collection point for error messages on AIX, and therefore also on virtual I/O servers. All errors that the operating system detects are logged by the errdemon and can be inspected by the administrator at any time. The “vios errlog” command displays messages from the error report. If only a virtual I/O server is specified, a summary of all messages on that virtual I/O server is shown:
$ vios errlog ms13-vio1
IDENTIFIER TIMESTAMP TYPE CLASS RESOURCE_NAME DESCRIPTION
4B436A3D 0531052421 T H fscsi0 LINK ERROR
DC73C03A 0531051421 T S fscsi0 SOFTWARE PROGRAM ERROR
8C577CB6 0521111321 I S vnicserver0 VNIC Transport Event
60D73419 0521101121 I S vnicserver0 VNIC Client Login
E48A73A4 0521092321 I H ent45 BECOME PRIMARY
E15C5EAD 0520131421 T H ent37 Physical link up
F596EFAC 0520083421 T H ent37 Physical link down
E87EF1BE 0517150021 P O dumpcheck The largest dump device is too small.
8D424E06 0509095621 I H ent31 ADAPTER FAILURE
AA8AB241 0507075921 T O OPERATOR OPERATOR NOTIFICATION
F31FFAC3 0321142821 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321142321 P H hdisk3 PATH HAS FAILED
D5676F6F 0321142221 T H fscsi4 ATTACHED SCSI TARGET DEVICE ERROR
B8C78C08 0319122621 I H ent7 SEA HA PARTNER LOST
A6D1BD62 0319122221 I H unspecified Firmware Event
C62E1EB7 0314103021 P H hdisk4 DISK OPERATION ERROR
37F3CC40 0219145721 P U RMCdaemon RSCT has detected that system time has m
06DE59EC 1117194020 I U vhost0 Logging an informational error for VIO s
…
$
(Note: Many messages have been omitted from the output in order to show as many types of error messages as possible in a few lines.)
One error message is displayed per line, showing the most important information: the identifier, the time stamp (TIMESTAMP, in the format MMDDhhmmYY), the type and the class, followed by the affected resource and a brief description. The most recent message is always at the top. The number of messages shown can be limited with the ‘-n‘ (number) option:
$ vios errlog -n 5 ms13-vio1
IDENTIFIER TIMESTAMP TYPE CLASS RESOURCE_NAME DESCRIPTION
4B436A3D 0531052421 T H fscsi0 LINK ERROR
4B436A3D 0531052421 T H fscsi0 LINK ERROR
4B436A3D 0531052421 T H fscsi0 LINK ERROR
4B436A3D 0531052421 T H fscsi0 LINK ERROR
4B436A3D 0531052421 T H fscsi0 LINK ERROR
$
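The summary format described above lends itself to simple post-processing. The following is a minimal Python sketch, not part of the LPAR tool, that assumes the summary output has been captured as text and that fields are separated by whitespace. The names ErrlogEntry, parse_summary_line and parse_summary are hypothetical and chosen only for illustration:

```python
from datetime import datetime
from typing import List, NamedTuple


class ErrlogEntry(NamedTuple):
    """One line of the summary output (hypothetical helper type)."""
    identifier: str
    timestamp: datetime
    err_type: str
    err_class: str
    resource_name: str
    description: str


def parse_summary_line(line: str) -> ErrlogEntry:
    # Split into at most six fields so the description keeps its spaces.
    ident, stamp, err_type, err_class, resource, description = line.split(None, 5)
    # Assumption: TIMESTAMP is MMDDhhmmYY, e.g. 0531052421 is May 31, 05:24, 2021.
    when = datetime.strptime(stamp, "%m%d%H%M%y")
    return ErrlogEntry(ident, when, err_type, err_class, resource, description.strip())


def parse_summary(text: str) -> List[ErrlogEntry]:
    entries = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        # Skip the header, prompts, and anything without a 10-digit time stamp.
        if len(fields) < 6 or len(fields[1]) != 10 or not fields[1].isdigit():
            continue
        entries.append(parse_summary_line(line))
    return entries
```

Calling parse_summary on captured output yields one ErrlogEntry per message, in the same newest-first order as the summary, so the entries can be filtered or sorted with ordinary Python code.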
Details can be displayed with the ‘-a‘ (all information) option. In that case it is best to also limit the number of messages with the ‘-n‘ option, since the output can otherwise be extremely long:
$ vios errlog -n 1 -a ms13-vio1
---------------------------------------------------------------------------
LABEL: FCP_ERR4
IDENTIFIER: 4B436A3D
Date/Time: Mon May 31 05:24:00 2021
Sequence Number: 7342
Machine Id: 00CA09503A00
Node Id: ms13-vio1
Class: H
Type: TEMP
WPAR: Global
Resource Name: fscsi0
Resource Class: driver
Resource Type: emfscsi
Location: U78D3.001.VYR0AL4-P1-C2-T1
Description
LINK ERROR
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
SENSE DATA
0000 0020 0000 0327 0000 0000 0203 0101 1000 0010 9BB9 32E1 0000 0000 008C 8240
…
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
$
Each message has a unique sequence number that is shown along with the detailed information (7342 in the example). This sequence number can be specified as an additional argument for “vios errlog” in order to select exactly that one message. Unfortunately, this is not very practical because the sequence number is not included in the summary. (This is due to the underlying command on the virtual I/O server.)
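Since the summary does not contain sequence numbers, one possible workaround is to extract them from the detailed output. The following Python sketch assumes that the output of ‘-a‘ has been captured as text, that each record starts with a line of dashes, and that the header fields have the form "Name: value" as in the example above. The function names are hypothetical:

```python
import re
from typing import Dict, List, Tuple

# Assumption: each detailed record begins with a line consisting of dashes.
RECORD_SEPARATOR = re.compile(r"^-{10,}[ \t]*$", re.MULTILINE)


def parse_detail_records(text: str) -> List[Dict[str, str]]:
    """Collect the 'Name: value' header fields of each detailed record."""
    records = []
    for chunk in RECORD_SEPARATOR.split(text):
        fields: Dict[str, str] = {}
        for line in chunk.splitlines():
            # The header fields end where the free-text sections begin.
            if line.strip() == "Description":
                break
            key, sep, value = line.partition(":")
            if sep and value.strip():
                fields[key.strip()] = value.strip()
        # Keep only chunks that look like a real record.
        if "Sequence Number" in fields:
            records.append(fields)
    return records


def sequence_numbers(records: List[Dict[str, str]]) -> List[Tuple[int, str, str]]:
    """Return (sequence number, identifier, resource name) per record."""
    return [
        (int(r["Sequence Number"]), r.get("IDENTIFIER", ""), r.get("Resource Name", ""))
        for r in records
    ]
```

The resulting list can serve as a lookup table from which a sequence number is picked and then passed to “vios errlog” as the additional argument.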
To select specific messages, the selection mechanism of the LPAR tool, invoked with the ‘-s‘ option, is recommended. Any criteria can be used to determine which messages are ultimately displayed. For example, it is relatively easy to list all messages about a certain resource, here hdisk3:
$ vios errlog -s resource_name=hdisk3 ms13-vio1
IDENTIFIER TIMESTAMP TYPE CLASS RESOURCE_NAME DESCRIPTION
F31FFAC3 0321142821 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321142321 P H hdisk3 PATH HAS FAILED
F31FFAC3 0321142221 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321141621 P H hdisk3 PATH HAS FAILED
F31FFAC3 0321123121 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321122521 P H hdisk3 PATH HAS FAILED
F31FFAC3 0321122421 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321121521 P H hdisk3 PATH HAS FAILED
F31FFAC3 0321110221 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321104921 P H hdisk3 PATH HAS FAILED
F31FFAC3 0321092721 I H hdisk3 PATH HAS RECOVERED
DE3B8540 0321091321 P H hdisk3 PATH HAS FAILED
$
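A selection like the one above can be evaluated further, for example to estimate how long each path failure lasted. The following Python sketch is only an illustration under several assumptions: the output has been captured as text, the time stamp has the format MMDDhhmmYY, and each "PATH HAS FAILED" message is followed by the matching "PATH HAS RECOVERED" message for the same path. With several paths failing at the same time, this simple pairing would be misleading. The function name path_outages is hypothetical:

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def path_outages(summary: str) -> List[Tuple[datetime, timedelta]]:
    """Pair FAILED/RECOVERED messages and return (start, duration) tuples."""
    outages = []
    failed_at: Optional[datetime] = None
    # The summary lists the newest message first, so walk it in reverse.
    for line in reversed(summary.splitlines()):
        fields = line.split(None, 5)
        # Skip the header, prompts, and anything without a 10-digit time stamp.
        if len(fields) < 6 or len(fields[1]) != 10 or not fields[1].isdigit():
            continue
        when = datetime.strptime(fields[1], "%m%d%H%M%y")
        description = fields[5].strip()
        if description == "PATH HAS FAILED":
            failed_at = when
        elif description == "PATH HAS RECOVERED" and failed_at is not None:
            outages.append((failed_at, when - failed_at))
            failed_at = None
    return outages
```

Because the summary time stamps have a resolution of one minute, the computed durations are rough estimates only.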
The error report is a circular log of limited size, so the oldest entries are eventually overwritten automatically.
Every administrator of a PowerVM environment should always keep an eye on the error reports of all virtual I/O servers.
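Checking several virtual I/O servers regularly can be scripted. The following Python sketch merely wraps the “vios errlog -n” call shown earlier in this section. It assumes that the vios command is available in the search path of the user running the script. The function names and the runner parameter are hypothetical and exist only to keep the example testable:

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def errlog_command(vios_name: str, count: int = 5) -> List[str]:
    """Build the argument list for 'vios errlog -n <count> <vios_name>'."""
    return ["vios", "errlog", "-n", str(count), vios_name]


def run_command(argv: List[str]) -> str:
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def collect_errlogs(
    vios_names: Iterable[str],
    count: int = 5,
    runner: Callable[[List[str]], str] = run_command,
) -> Dict[str, str]:
    """Fetch the newest 'count' messages from each virtual I/O server."""
    return {name: runner(errlog_command(name, count)) for name in vios_names}
```

For example, collect_errlogs(["ms13-vio1"]) would return a dictionary that maps the server name to the text of its five most recent messages. The list of server names has to be supplied by the caller.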