Managing and administering service events on HMCs is often neglected. In this article we use a concrete example, an error with reference code #25B810, to show how to handle such events. As usual, we use our LPAR tool for this.
First, let’s find all open service events:
$ hmc lssvcevents
TIME PROBLEM PMH HMC REFCODE STATE STATUS CALLHOME FAILING_MTMS TEXT
02/13/2019 23:02:31 7 - hmc01 #25B810 approved Open false 8231-E2B/06A084P File System alert event occurred...
02/16/2019 16:14:28 8 - hmc01 B3030001 approved Open false 8231-E2B/06A084P ACT04284I A Management Console connect failed
02/11/2019 16:12:43 37 - hmc02 B3030001 approved Open false 8231-E2B/06A084P ACT04284I A Management Console connect failed
02/11/2019 17:43:19 38 - hmc02 B3030001 approved Open false 8231-E2B/06A084P ACT04283I A connection to a FSP,BPA...
$
This article is about problem number 7. The problem occurred on 02/13/2019 at 23:02:31 and was analyzed by the HMC named hmc01. The reference code is #25B810. The problem has the status “Open”, and no call home has been triggered. The problem concerns the managed system with serial number 06A084P, a Power 710 (8231-E2B). The last column shows the beginning of the error message.
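When the list gets longer, it can help to filter it by reference code. The following is a minimal sketch, not part of the LPAR tool: the function name events_by_refcode is hypothetical, and it assumes the whitespace-separated column layout of the listing above, with date and time as the first two fields of each data row.

```shell
# Sketch: print "PROBLEM HMC" pairs for all events with a given reference code.
# Assumption: data rows look like the listing above, i.e.
#   $1 = date, $2 = time, $3 = problem number, $4 = PMH, $5 = HMC, $6 = refcode
# events_by_refcode is a hypothetical helper name, not an LPAR tool command.
events_by_refcode() {
    # Skip the header row (it starts with "TIME") and compare the refcode column.
    awk -v rc="$1" '$1 != "TIME" && $6 == rc { print $3, $5 }'
}
```

Used as `hmc lssvcevents | events_by_refcode B3030001`, this would print the pairs `8 hmc01`, `37 hmc02` and `38 hmc02` for the listing shown above.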
Next, let’s look at the complete record of the problem by specifying the problem number and the HMC:
$ hmc lssvcevents -p 7 hmc01
analyzing_hmc: hmc01
analyzing_mtms: 7042-CR8/21009CD
approval_state: approved
callhome_intended: false
created_time: 02/14/2019 04:11:31
duplicate_count: 0
eed_transmitted: false
enclosure_mtms: 8231-E2B/06A084P
event_severity: 0
event_time: 02/13/2019 23:02:31
failing_mtms: 8231-E2B/06A084P
files: iqyymrge.log/Consolidated system platform log,
iqyvpd.dat/Configuration information associated with the HMC,
actzuict.dat/Tasks performed,
iqyvpdc.dat/Configuration information associated with the HMC,
problems.xml/XML version of the problems opened on the HMC for the HMC and the server,
refcode.dat/list of reference codes associated with the hmc,
iqyylog.log/HMC firmware log information,
PMap.eed/Partition map, obtained from 'lshsc -w -c machine',
hmc.eed/HMC code level obtained from 'lshmc -V' and connection information obtained from 'lssysconn -r all',
sys.eed/Output of various system configuration commands,
8231-E2B_06A084P.VPD.xml/Configuration information associated with the managed system
first_time: 02/14/2019 04:11:31
last_time: 02/14/2019 04:11:31
problem_num: 7
refcode: #25B810
reporting_mtms: 8231-E2B/06A084P
reporting_name: p710
status: Open
sys_mtms: 8231-E2B/06A084P
sys_name: p710
sys_refcode: #25B810
text: File System alert event occurred on /home/ios/CM/DB. Free space is less than 10%, or there was an error querying the filesystem.
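For scripting, a single field can be pulled out of such a record. This is only a sketch under the assumption that every field starts at the beginning of a line in the form "key: value", as in the output above; the helper name record_field is hypothetical.

```shell
# Sketch: print the value of one field from a "key: value" record on stdin.
# Assumption: the key starts in the first column and is followed by ": ".
# record_field is a hypothetical helper name, not an LPAR tool command.
record_field() {
    awk -v key="$1" '
        # index(...) == 1 matches only lines that begin with "key: ",
        # so "refcode" does not also match "sys_refcode".
        index($0, key ": ") == 1 { print substr($0, length(key) + 3) }
    '
}
```

For example, `hmc lssvcevents -p 7 hmc01 | record_field text` would print only the full error message.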
At the end of the record we find the full error message. It concerns a file system with less than 10% free space. The path “/home/ios/CM/DB” indicates a virtual I/O server. The virtual I/O servers in question are located on the managed system with serial number 06A084P:
$ ms show 06A084P
NAME SERIAL_NUM TYPE_MODEL HMCS
p710 06A084P 8231-E2B hmc01,hmc02
$
It is the managed system named p710. The following virtual I/O servers run on this managed system:
$ vios -m p710 show
LPAR ID SERIAL LPAR_ENV MS HMCs
aixvio1 1 06A084P1 vioserver p710 hmc01,hmc02
$
A check of the error report on the Virtual I/O Server aixvio1 shows the following entry:
LABEL: VIO_ALERT_EVENT
IDENTIFIER: 0FD4CF1A
Date/Time: Wed Feb 13 22:02:31 CST 2019
Sequence Number: 98
Machine Id: 00F6A0844C00
Node Id: aixvio1
Class: O
Type: INFO
WPAR: Global
Resource Name: /home/ios/CM/DB
Description
Informational Message
Probable Causes
Asynchronous Event Occurred
Failure Causes
PROCESSOR
Recommended Actions
Check Detail Data
Detail Data
Alert Event Message
25b810
A File System alert event occurred on /home/ios/CM/DB. Free space is less than 10%, or there was an error querying the filesystem.
Diagnostic Analysis
Diagnostic Log sequence number: 19
Resource tested: sysplanar0
Menu Number: 25B810
Description:
File System alert event occurred on /home/ios/CM/DB. Free space is less than 10%, or there was an error querying the filesystem.
A quick check of the file system shows that the problem has already been resolved, and there is enough space:
$ df -g
Filesystem GB blocks Free %Used Iused %Iused Mounted on
...
/dev/hd1 0.25 0.16 35% 111 1% /home
...
$
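The same check can be scripted against the 10% threshold from the alert text. The sketch below assumes the AIX `df -g` column layout shown above (mount point in the seventh field, %Used in the fourth); the function name fs_free_ok is hypothetical.

```shell
# Sketch: succeed if a mount point has at least a minimum percentage of free space.
# Reads `df -g` output on stdin. Assumption: AIX layout as shown above, i.e.
#   $4 = %Used, $7 = mount point
# fs_free_ok is a hypothetical helper name.
fs_free_ok() {
    # $1 = mount point, $2 = required free space in percent
    awk -v mp="$1" -v min="$2" '
        $7 == mp {
            sub(/%/, "", $4)            # strip the percent sign from %Used
            found = 1
            ok = (100 - $4 >= min)      # free percentage = 100 - used percentage
        }
        END { exit (found && ok) ? 0 : 1 }   # fail if not found or too full
    '
}
```

On the virtual I/O server this could be called as `df -g | fs_free_ok /home 10 && echo "enough space"`.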
So the problem no longer exists. The service event on the HMC should therefore be closed as well, which we do now:
$ hmc chsvcevent -o close -p 7 hmc01
$
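If several events have to be closed, the close command can be generated per event. The following sketch only prints the commands, in the form used above, so they can be reviewed before anything is closed. The helper name close_commands is hypothetical, and it expects one "PROBLEM HMC" pair per line on stdin.

```shell
# Sketch: turn "PROBLEM HMC" pairs into close commands (dry run, prints only).
# close_commands is a hypothetical helper name; the printed command has the
# same form as the chsvcevent call shown above.
close_commands() {
    while read -r problem hmc_name; do
        # Skip empty or incomplete lines.
        [ -n "$problem" ] && [ -n "$hmc_name" ] || continue
        printf 'hmc chsvcevent -o close -p %s %s\n' "$problem" "$hmc_name"
    done
}
```

For example, `printf '37 hmc02\n38 hmc02\n' | close_commands` prints one `hmc chsvcevent -o close -p ...` line for each of the two events. Only after checking the list would one pipe it to a shell.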
To verify, we list the open service events again:
$ hmc lssvcevents
TIME PROBLEM PMH HMC REFCODE STATE STATUS CALLHOME FAILING_MTMS TEXT
02/16/2019 16:14:28 8 - hmc01 B3030001 approved Open false 8231-E2B/06A084P ACT04284I A Management Console connect failed
02/11/2019 16:12:43 37 - hmc02 B3030001 approved Open false 8231-E2B/06A084P ACT04284I A Management Console connect failed
02/11/2019 17:43:19 38 - hmc02 B3030001 approved Open false 8231-E2B/06A084P ACT04283I A connection to a FSP,BPA...
$
Event number 7 no longer appears in the list, so it was closed successfully.
Service events are easy to manage with the LPAR tool!