AIX Archive - Page 4 of 6 - PowerCampus 01

June 22, 2020

Automating Inventory Scout

You haven’t used Inventory Scout yet or only occasionally ,manually? You want to run Inventory Scout automatically on all systems with as little effort as possible?

Then you should take a look at our article Automating Inventory Scout in the Article / AIX section. The article clearly describes which steps are necessary to run Inventory Scout automatically on number of systems. In the download area we have provided the script run_invscout, which is described in the article. The script allows you to download the latest catalog.mic file from IBM from a NIM server, copy it to any number of NIM clients, start an Inventory Scout run on the NIM clients and collect the microcode survey upload files from the NIM clients. Finally, the upload files are uploaded to IBM MDS for analysis.

April 29, 2020April 29, 2020

SUMA Proxy Configuration

There are a number of documents on the web for configuring SUMA with a proxy. Some of the older documents still describe the configuration of proxies using SUMA attributes:

# suma -c -a HTTP_PROXY=http://10.0.0.1:49000
0500-019 The -a flag entry HTTP_PROXY=http://10.0.0.1:49000 is not valid. (main, /usr/sbin/suma [767])
0500-009 An error occurred attempting to save configuration settings. (main, /usr/sbin/suma [768])
# suma -c -a HTTPS_PROXY=https://10.0.0.1:49000
0500-019 The -a flag entry HTTPS_PROXY=https://10.0.0.1:49000 is not valid. (main, /usr/sbin/suma [767])
0500-009 An error occurred attempting to save configuration settings. (main, /usr/sbin/suma [768])
# suma -c -a HELLO=world
0500-019 The -a flag entry HELLO=world is not valid. (main, /usr/sbin/suma [767])
0500-009 An error occurred attempting to save configuration settings. (main, /usr/sbin/suma [768])
#

Unfortunately, the attributes HTTP_PROXY and HTTPS_PROXY are no longer supported by SUMA. This is also documented in the SUMA manual page!

# man suma
...
HTTP_PROXY and HTTPS_PROXY
     Proxy server and port to use for the HTTP or HTTPS transfers. The SUMA command shares
     the proxy connectivity settings with the Electronic Service Agent\u2122. The HTTP or HTTPS
     proxy service configuration can be set up through the SMIT Create/Change Service
     Configuration menus (use fastpath smitty srv_conn) that allow the server
     specifications such as IP address, port number, and an optional user ID and password.
     SUMA no longer supports the settings of the HTTP_PROXY and HTTPS_PROXY parameters.

The proxy configuration is taken over from ESA (Electronic Service Agent). This means that proxies have to be configured via ESA. Therefore the fileset bos.ecc_client.rte must be installed. However, this should be already installed, as it is a prerequisite for bos.suma.

The proxies can be configured either using SMIT or via the command line. We first show the variant via the command line:

# /usr/ecc/bin/config_conn_path -c PRIMARY -y HTTP_PROXY -t YES -a 10.0.0.1 -x 49000
###########################################################

Testing HTTP Proxy Service Configuration

Performing HTTP Proxy Connectivity Test ... SUCCESS
#

The IP address of the proxy is specified with the option ‘–a‘, the port number with the option ‘–x‘. If a user is required for authentication, it can be specified with the option ‘–u‘ (the password is requested interactively). The option ‘–t YES‘ ensures that a connection test is made immediately, which was successful here.

In addition to the primary connection (PRIMARY), a secondary (SECONDARY) and tertiary (TERTIARY) connection can also be configured. The current configuration, e.g. for the primary connection, can be shown as follows:

# /usr/ecc/bin/config_conn_path -d PRIMARY
#type:ttyport:modem_type:primary_location:secondary_location:prefix:host:port:userid
HTTP_PROXY::::::10.0.0.1:49000:
#

The keyword ‘PRIMARY‘ can be abbreviated to ‘p‘.

A connection test can be started at any time using the option ‘–t YES‘ or ‘–t y‘:

# /usr/ecc/bin/config_conn_path -c p -t y
###########################################################

Testing HTTP Proxy Service Configuration

Performing HTTP Proxy Connectivity Test ... SUCCESS
#

Alternatively, you can also use SMIT:

# smitty configure_primary
…
                       Create/Change Primary Service Configuration

Type or select values in entry fields.
Press Enter AFTER making all desired changes.


                                                        [Entry Fields]
  Connection type                                    [HTTP_Proxy]                     +
  Test service configuration                         [No]                             +

  If type is DIRECT_INTERNET, no entry required.

  If type is HTTP_PROXY,
          IP address                                 [10.0.0.1]
          Port number                                [49000]                            #
          Authentication user ID                     []
          Authentication password requested interact
  ively.


F1=Help               F2=Refresh            F3=Cancel             F4=List
F5=Reset              F6=Command            F7=Edit               F8=Image
F9=Shell              F10=Exit              Enter=Do

A short test shows that SUMA works with the configured proxies:

# suma -x -a DisplayName=Test -a Action=Preview -a RqType=Latest
**************************************** (main, /usr/sbin/suma [990])
Performing preview download. (main, /usr/sbin/suma [991])
**************************************** (main, /usr/sbin/suma [992])
Partition id was unassigned; will attempt to assign it.
Partition id assigned value 6
Download SUCCEEDED: /export/nim/suma/installp/ppc/U861910.bff (main, /usr/sbin/suma [1048])
Download SUCCEEDED: /export/nim/suma/installp/ppc/U861907.bff (main, /usr/sbin/suma [1048])
Download SUCCEEDED: /export/nim/suma/installp/ppc/U861904.bff (main, /usr/sbin/suma [1048])
…
Total bytes of updates downloaded: 899671552 (main, /usr/sbin/suma [1048])
Summary: (main, /usr/sbin/suma [1048])
        367 downloaded (main, /usr/sbin/suma [1048])
        0 failed (main, /usr/sbin/suma [1048])
        34 skipped (main, /usr/sbin/suma [1048])
DEBUG: Closing file handles (SUMA::Messenger, /usr/suma/lib/SUMA/Messenger.pm [401])
#

April 28, 2020

AIX: Problems with Run-Time Linking

Probably everyone has had at least once problems with AIX and run-time linking. For example if a program aborts with the following error when called:

$ progXY
exec(): 0509-036 Cannot load program progXY because of the following errors:
        0509-150   Dependent module libXXX.a(shr.o) could not be loaded.
        0509-022 Cannot load module libXXX.a(shr.o).
        0509-026 System error: A file or directory in the path name does not exist.
$

In our article AIX and Run-Time Linking, we explain the processes involved in run-time linking and possible errors. We show how to examine executable programs and libraries with the dump command. The meaning of the variables LIBPATH and LD_LIBRARY_PATH for dynamic linking is also explained.

January 29, 2020March 14, 2020

WWPN of FC ports in Open Firmware

The following article deals with WWPN of FC ports in Open Firmware.

Port and node WWNs of FC ports can be found very easily in the Open Firmware, even when the ioinfo command is no longer available, as is the case with new POWER9 firmware. The hardware structure of a POWER system is available in the Open Firmware in the form of a device tree. Hardware components such as PCI bridges, processors and PCI cards are represented as device nodes in this tree.

With the command “dev /” you can access the device nodes, starting with the root node (“/” or slash):

0 > dev /  ok
0 >

In the device tree you can navigate with the commands dev, ls and pwd similar to the Unix file system. An ls on the root node shows all available device nodes (as well as some “package nodes” which are not discussed here).

The hierarchy is visualized in the device tree by indenting the device nodes:

0 > ls 
0000020939c0: /ibm,serial
000002094ae8: /chosen
000002094d60: /packages
000002094e58:   /disassembler
...0000020af578: /cpus
0000020b5200:   /PowerPC,POWER7@0
...
0000020ba640: /memory@0
...
00000226cad0: /pci@800000020000120
00000229d750:   /pci@0
0000022a0018:     /pci@2
0000022a28e0:       /ethernet@0
0000022b4a28:       /ethernet@0,1
0000022c6b70:     /pci@4
0000022c9438:       /ethernet@0
0000022db580:       /ethernet@0,1
000002277fd8: /pci@800000020000121
0000022ed7d0:   /fibre-channel@0
0000023026e0:     /fp
000002303240:     /disk
000002304de0:     /tape
000002306270:   /fibre-channel@0,1
00000231b180:     /fp
00000231bce0:     /disk
00000231d880:     /tape
...
ok
0 >

The example output shows 2 FC ports. Both FC ports are children of the device node pci@800000020000121, which can be found directly under the root node /.

With the command “dev / pci@800000020000121” we first navigate to this node and then display the child or child nodes using “ls“:

0 > dev /pci@800000020000121  ok
0 > ls
0000022ed7d0: /fibre-channel@0
0000023026e0:   /fp
000002303240:   /disk
000002304de0:   /tape
000002306270: /fibre-channel@0,1
00000231b180:   /fp
00000231bce0:   /disk
00000231d880:   /tape
ok
0 >

We next move into the device node of the first FC port fiber-channel@0.

With the command “pwd” we check briefly the position in the device tree and then use “ls” to look at the available subnodes:

0 > dev fibre-channel@0  ok
0 > pwd /pci@800000020000121/fibre-channel@0 ok
0 > ls
0000023026e0: /fp
000002303240: /disk
000002304de0: /tape
ok
0 >

Each device node has a number of properties, which depend on the type of the underlying hardware component.

The properties of a device node can be displayed with the command “.properties” (the command name begins with a “.“):

0 > .properties
ibm,loc-code            U5802.001.008C110-P1-C2-T1
vendor-id               000010df
device-id               0000f100
...
name                    fibre-channel
...
manufacturer            456d756c 657800
copyright               436f7079 72696768 74202863 29203230 30302d32 30313220 456d756c 657800
device_type             fcp
model                   10N9824
...
port-wwn                10000000 c9b12345
node-wwn                20000000 c9b12345
...
ok
0 >

In addition to the location code, the port WWN (port-wwn) and the node WWN (node-wwn) are displayed.

If you would like to know more about the structure of WWNs, please refer to the article: Numbers: FC World Wide Names (WWNs)

Of course, you can also find out the MAC address of an ethernet port in the same way. With “dev ..” you can move up one level in the device tree, just like in a Unix file system. But you can also abbreviate and go straight to the top, which we do here in the following. Then we display all available device nodes again:

0 > dev /  ok
0 > ls 
...
00000226cad0: /pci@800000020000120
00000229d750:   /pci@0
0000022a0018:     /pci@2
0000022a28e0:       /ethernet@0
0000022b4a28:       /ethernet@0,1
0000022c6b70:     /pci@4
0000022c9438:       /ethernet@0
0000022db580:       /ethernet@0,1
...
ok
0 >

As an example, we select the device node /pci@800000020000120/pci@0/pci@2/ethernet@0.1 and again let us display the properties:

0 > dev /pci@800000020000124/pci@0/pci@2/ethernet@0,1  ok
0 > pwd /pci@800000020000124/pci@0/pci@2/ethernet@0,1 ok
0 > .properties
ibm,loc-code            U5802.001.008C110-P1-C4-T2
vendor-id               00008086
device-id               000010bc
...
name                    ethernet
...
device_type             network
...
max-frame-size          00000800
address-bits            00000030
local-mac-address       00145eea 1234
mac-address             00145eea 1234
...
0 >

The MAC address is available here by the property mac-address.

If you want to leave the device tree, you can do this with the command “device-end“:

0 > device-end  ok
0 >

We hope this article about WWPN of FC ports in Open Firmware was both helpful and informative.

December 2, 2019March 12, 2020

LLDP on the virtual-I/O-Server

There are 2 new articles concerning LLDP on the virtual-I/O-server. Both configuration(Virtual I/O Server: Configuring LLDP) and troubleshooting(Virtual I/O Server: Troubleshooting LLDP) are covered here.

Shown are the daemon lldpd, the commands lldpctl and lldpsync, as well as the recording of LLDP packets using tcpdump.

October 28, 2019

Error: 0509-022 Cannot load module

Recently, we have installed yum from the AIX Toolbox (www.ibm.com/support/pages/aix-toolbox-linux-applications-overview) on some of our systems. So far, we have installed RPM packages from Perzl (www.perzl.org). When upgrading to yum, some RPMs were updated with versions from the AIX Toolbox. For some of the RPMs, there were problems with dynamic libraries afterwards.

Here we show one of the issues with the curl program:

$ curl
exec(): 0509-036 Cannot load program curl because of the following errors:
        0509-022 Cannot load module /opt/freeware/lib/libcurl.a(libcurl.so.4).
        0509-150   Dependent module /opt/freeware/lib/libcrypto.a(libcrypto.so) could not be loaded.
        0509-152   Member libcrypto.so is not found in archive 
        0509-022 Cannot load module curl.
        0509-150   Dependent module /opt/freeware/lib/libcurl.a(libcurl.so.4) could not be loaded.
        0509-022 Cannot load module .
$

Curl was working before installing the RPM packages. The command ldd for listing dynamic dependencies provides a similar error message:

$ ldd /usr/bin/curl
/usr/bin/curl needs:
         /usr/lib/libc.a(shr.o)
         /opt/freeware/lib/libcurl.a(libcurl.so.4)
         /opt/freeware/lib/libz.a(libz.so.1)
         /unix
         /usr/lib/libcrypt.a(shr.o)
         /opt/freeware/lib/libcrypto.a(libcrypto.so)
ar: 0707-109 Member name libcrypto.so does not exist.
dump: /tmp/tmpdir31457524/extract/libcrypto.so: 0654-106 Cannot open the specified file.
         /opt/freeware/lib/libssl.a(libssl.so)
ar: 0707-109 Member name libssl.so does not exist.
dump: /tmp/tmpdir31457524/extract/libssl.so: 0654-106 Cannot open the specified file.
$

The file /opt/freeware/lib/libcrypto.a is an archive and the error message says that the shared object libcrypto.so does not exist in this archive. This can easily be checked using the command ar:

$ ar -X any tv /opt/freeware/lib/libcrypto.a
rwxr-xr-x     0/0     3036810 Apr 08 18:46 2014 libcrypto.so.1.0.1
rwxr-xr-x     0/0     3510308 Apr 08 18:41 2014 libcrypto.so.1.0.1
rw-r--r--     0/0     2012251 Apr 08 18:46 2014 libcrypto.so.0.9.7
rw-r--r--     0/0     2491620 Apr 08 18:46 2014 libcrypto.so.0.9.8
rwxr-xr-x     0/0     2921492 Apr 08 18:46 2014 libcrypto.so.1.0.0
rw-r--r--     0/0     2382757 Apr 08 18:46 2014 libcrypto.so.0.9.7
rw-r--r--     0/0     2923920 Apr 08 18:46 2014 libcrypto.so.0.9.8
rwxr-xr-x     0/0     3381316 Apr 08 18:46 2014 libcrypto.so.1.0.0
$

(The option “-X any” ensures that both 32– and 64-bit objects are displayed)

The shared library libcrypto.so is obviously not in the archive! The archive is part of Perzl’s OpenSSL package:

$ rpm -qf /opt/freeware/lib/libcrypto.a
openssl-1.0.1g-1.ppc
$

The OpenSSL package is the cause of the problems when calling curl and possibly other programs. Perzl’s packages have an OpenSSL package, which provides OpenSSL at /opt/freeware. The IBM packages from the AIX Toolbox, however, use the operating system’s OpenSSL under /usr/lib. There is no OpenSSL RPM package in the AIX Toolbox. Perzl’s OpenSSL package should be removed.

# rpm –e openssl
#

Of course, if you have RPM packages (Perzl) that depend on OpenSSL, removing the OpenSSL RPM will fail:

# rpm -e openssl
error: Failed dependencies:
            openssl >= 0.9.8 is needed by (installed) openldap-2.4.23-0.3.ppc
#

In this case, there are 2 options: either uninstall the dependent package as well, or update the package with a version from the AIX toolbox. Here we briefly show the second case:

# yum update openldap
AIX_Toolbox                                                                                   | 2.9 kB  00:00:00    
AIX_Toolbox_71                                                                                | 2.9 kB  00:00:00    
AIX_Toolbox_noarch                                                                            | 2.9 kB  00:00:00    
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package openldap.ppc 0:2.4.23-0.3 will be updated
---> Package openldap.ppc 0:2.4.46-1 will be an update
...

Is this ok [y/N]: y

...

Updated:

  openldap.ppc 0:2.4.46-1                                                                                           

Complete!
#

Afterwards, the OpenSSL RPM can be uninstalled:

# rpm -e openssl
#

Now curl can be started again without a problem:

$ curl
curl: try 'curl --help' or 'curl --manual' for more information
$

Conclusion: When switching to the AIX Toolbox, any existing OpenSSL RPM should be uninstalled!

October 11, 2019

Virtual Processor Folding

A simple and direct way to see which processors are active and which processors are not available due to “virtual processor folding” is the kernel debugger kdb.

The following command displays the desired information:

# echo vpm | kdb
...
VSD Thread State.
CPU  CPPR VP_STATE FLAGS SLEEP_STATE  PROD_TIME: SECS   NSECS     CEDE_LAT

   0     0  ACTIVE      0 AWAKE        0000000000000000  00000000  00  
   1     0  ACTIVE      0 AWAKE        0000000000000000  00000000  00  
   2     0  ACTIVE      0 AWAKE        0000000000000000  00000000  00  
   3     0  ACTIVE      0 AWAKE        0000000000000000  00000000  00  
   4     0  DISABLED    0 AWAKE        0000000000000000  00000000  00  
   5    11  DISABLED    0 SLEEPING     000000005D9F15BE  0F77D2C6  02  
   6    11  DISABLED    0 SLEEPING     000000005D9F15BE  0F77D0C8  02  
   7    11  DISABLED    0 SLEEPING     000000005D9F15BE  217D3F61  02  

#

The output was created on a system with 2 processor cores and SMT4. CPUs 4-7 are currrently DISABLED, which is the second processor core.

September 25, 2019March 14, 2020

Building your own AIX installp packages (Part 1)

This blog post will show you how to build installp packages on AIX. The use of the command mkinstallp is demonstrated by a simple example. More complex packages will be covered in a later article.

It is relatively easy to build installp packages under AIX. This requires the command /usr/sbin/mkinstallp. If the command is not installed, the package bos.adt.insttools must be installed. Building a package requires root privileges.

First of all, we create a directory somewhere, in which the package is to be built:

# mkdir pwrcmps.installp.example
#

The package should contain a small shell script:

# cat <<EOF >hello
#! /bin/ksh
print "hello world"
EOF
# chmod a+x hello
#

The script will later be installed under /usr/local/bin. Files to be installed must be in the same place relative to the build directory (pwrcmps.installp.example in our case), as later relative to the root directory. We therefore create the necessary directory structure and copy the hello script to the appropriate location:

# mkdir –p pwrcmps.installp.example/usr/local/bin
# cp hello pwrcmps.installp.example/usr/local/bin
#

We change to the build directory and start the command mkinstallp to build the package:

# cd pwrcmps.installp.example
# mkinstallp
Using /export/src/installp/pwrcmps.installp.example as the base package directory.
Cannot find /export/src/installp/pwrcmps.installp.example/.info. Attempting to create.
Using /export/src/installp/pwrcmps.installp.example/.info to store package control files.
Cleaning intermediate files from /export/src/installp/pwrcmps.installp.example/.info.

************************************************************
|            Beginning interactive package input           |
|   * - required; () - example value; [] - default value   |
************************************************************

* Package Name (xyz.net) []: pwrcmps.installp
* Package VRMF (1.0.0.0) []: 1.0.0.0
Update (Y/N) [N]:
Number of filesets in pwrcmps.installp (1) [1]:

Gathering info for new fileset (1 remaining)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Fileset Name (pwrcmps.installp.rte) []: pwrcmps.installp.example
* Fileset VRMF (1.0.0.0) []: 1.0.0.0
* Fileset Description (some text) []: Example Fileset
Do you want to include a copyright file for this fileset? (Y/N) [N]:
Entering information for the USER part liblpp files
Do you want to include an installp pre_i/u script for this fileset? (Y/N) [N]:
Do you want to include an installp post_i/u script for this fileset? (Y/N) [N]:
Do you want to include a pre-deinstall Script for this fileset? (Y/N) [N]:
Do you want to include an installp pre_rm script for this fileset? (Y/N) [N]:
Do you want to include an installp config script for this fileset? (Y/N) [N]:
Would you like to create ROOT part? (Y/N) [N]:
Bosboot required (Y/N) [N]:
License agreement acceptance required (Y/N) [N]:
Include license files for pwrcmps.installp.example in this package (Y/N) [N]:
Do you want to specify Requisites using a file for this fileset? (Y/N) [N]:
Number of Requisites for pwrcmps.installp.example (1) [0]:
Number of filesystems requiring additional space for pwrcmps.installp.example [0]:

You should include any directories that you are creating in the file count.
(ie: For /usr/proj/myFile, enter 2; 1 for /usr/proj and 1 for /usr/proj/myFile)
Number of USR part files in pwrcmps.installp.example (1) [0]: 1
* 1 of 1. Directory or File Path (/usr/proj/myFile) []: /usr/local/bin/hello
Do you want to make this fileset relocatable? (Y/N) [N]:
Do you want to include an override inventory file for this fileset? (Y/N) [N]:
Do you want to add WPAR attributes to your fileset? (Y/N) [N]:


Using /export/src/installp/pwrcmps.installp.example/.info/pwrcmps.installp.template as the template file.
pwrcmps.installp 1.0.0.0 I
processing pwrcmps.installp.example
creating ./.info/liblpp.a
creating ./tmp/pwrcmps.installp.1.0.0.0.bff
#

The built end product can be found in the subdirectory tmp:

# ls –l tmp
total 8
-rw-r--r--    1 root     system         2560 Sep 25 09:49 pwrcmps.installp.1.0.0.0.bff
#

When creating the package, we have always confirmed the default values for simplicity. The product name is pwrcmps.installp and the file set name is pwrcmps.installp.example, each with version 1.0.0.0. All files and directories of the package must be listed explicitly with absolute path! This can be a bit impractical for a few hundred paths, but it can be simplified and automated by using scripts.

We install the newly generated package once, and check if our shell script is also installed:

# installp -ad tmp/pwrcmps.installp.1.0.0.0.bff all
+-----------------------------------------------------------------------------+
                    Pre-installation Verification...
+-----------------------------------------------------------------------------+
Verifying selections...done
Verifying requisites...done
Results...

SUCCESSES
---------
  Filesets listed in this section passed pre-installation verification
  and will be installed.

  Selected Filesets
  -----------------
  pwrcmps.installp.example 1.0.0.0            # Example Fileset

  << End of Success Section >>

+-----------------------------------------------------------------------------+
                   BUILDDATE Verification ...
+-----------------------------------------------------------------------------+
Verifying build dates...done
FILESET STATISTICS
------------------
    1  Selected to be installed, of which:
        1  Passed pre-installation verification
 ----
    1  Total to be installed

+-----------------------------------------------------------------------------+
                         Installing Software...
+-----------------------------------------------------------------------------+

installp:  APPLYING software for:
        pwrcmps.installp.example 1.0.0.0

Finished processing all filesets.  (Total time:  1 secs).

+-----------------------------------------------------------------------------+
                                Summaries:
+-----------------------------------------------------------------------------+

Installation Summary
--------------------
Name                        Level           Part        Event       Result
-------------------------------------------------------------------------------
pwrcmps.installp.example    1.0.0.0         USR         APPLY       SUCCESS   
# which hello
/usr/local/bin/hello
# hello
hello world
#

We are now extending our script by printing the message “hello world, how are you“.

# vi usr/local/bin/hello
…
print "hello world, how are you"
#

The package should now be rebuilt, before we want to increase the version to 1.1.0.0. Of course, we could start the command mkinstallp interactively again and enter all the necessary information again, but this is very time consuming and not necessary. The command mkinstallp supports the specification of a template file with all necessary information. When building the first version of our package, such a template file was generated under .info/pwrcmps.installp.template. We copy this template file directly into the build root directory and rename it to template:

# cp .info/pwrcmps.installp.template template
# cat template
Package Name: pwrcmps.installp
Package VRMF: 1.0.0.0
Update: N
Fileset
  Fileset Name: pwrcmps.installp.example
  Fileset VRMF: 1.0.0.0
  Fileset Description: Example Fileset
  USRLIBLPPFiles
  EOUSRLIBLPPFiles
  Bosboot required: N
  License agreement acceptance required: N
  Include license files in this package: N
  Requisites:
  USRFiles
    /usr/local/bin/hello
  EOUSRFiles
  ROOT Part: N
  ROOTFiles
  EOROOTFiles
  Relocatable: N
EOFileset
#

The template file contains the information that we gave interactively. In this template file we change the version number from 1.0.0.0 to 1.1.0.0:

# vi template
…
Package VRMF: 1.1.0.0
…
  Fileset VRMF: 1.1.0.0
…
#

We now try to build the version 1.1.0.0 by starting mkinstallp with the option -T (for template) and the template file template:

# mkinstallp -T template
Using /export/src/installp/pwrcmps.installp.example as the base package directory.
Using /export/src/installp/pwrcmps.installp.example/.info to store package control files.
Cleaning intermediate files from /export/src/installp/pwrcmps.installp.example/.info.
0503-844 mkinstallp: Cannot overwrite existing
        /export/src/installp/pwrcmps.installp.example/.info/pwrcmps.installp.template file.
#

An error message appears, the template file under .info can not be overwritten. The .info directory is created again and again during the build process and can therefore simply be deleted before the next build process is started:

# rm -rf .info
# mkinstallp -T template
Using /export/src/installp/pwrcmps.installp.example as the base package directory.
Cannot find /export/src/installp/pwrcmps.installp.example/.info. Attempting to create.
Using /export/src/installp/pwrcmps.installp.example/.info to store package control files.
Cleaning intermediate files from /export/src/installp/pwrcmps.installp.example/.info.

Using template as the template file.
0503-880 mkinstallp: This file /usr/local/bin/hello
        already exists as a system file.
pwrcmps.installp 1.1.0.0 I
processing pwrcmps.installp.example
creating ./.info/liblpp.a
creating ./tmp/pwrcmps.installp.1.1.0.0.bff
#

The new version of the package can be found under tmp as before:

# ls -l tmp
total 16
-rw-r--r--    1 root     system         2560 Sep 25 10:08 pwrcmps.installp.1.0.0.0.bff
-rw-r--r--    1 root     system         2560 Sep 25 10:35 pwrcmps.installp.1.1.0.0.bff
#

This makes it easy to make changes and then package them.

August 1, 2019

TCP connection aborts due to “max assembly queue depth”

Recently we had frequent program crashes from some Java client programs. The following was found in the Java stack trace:

[STACK] Caused by: java.io.IOException: Premature EOF
[STACK]           at sun.net.www.http.ChunkedInputStream.readAheadBlocking(Unknown Source)
[STACK]           at sun.net.www.http.ChunkedInputStream.readAhead(Unknown Source)
[STACK]           at sun.net.www.http.ChunkedInputStream.read(Unknown Source)
[STACK]           at java.io.FilterInputStream.read(Unknown Source)

The problem occurs in the ChunkedInputStream class in the readAheadBlocking method. In the source code of the method you will find:

558 /**
559 * If we hit EOF it means there's a problem as we should never
560 * attempt to read once the last chunk and trailers have been
561 * received.
562 */
563 if (nread < 0) {
564 error = true;
565 throw new IOException("Premature EOF");
566 }

The value nread becomes less than 0 when the end of the data stream is reached. This can happen if the other party closes the connection unexpectedly.

The server side in this case was an AIX system (AIX 7.1 TL5 SP3). A review of the TCP connections for dropdowns using netstat resulted in:

$ netstat -p tcp | grep drop
        361936 connections closed (including 41720 drops)
        74718 embryonic connections dropped
                0 connections dropped by rexmit timeout
                0 connections dropped due to persist timeout
                0 connections dropped by keepalive
        0 packets dropped due to memory allocation failure
        0 Connections dropped due to bad ACKs
        0 Connections dropped due to duplicate SYN packets
        1438 connections dropped due to max assembly queue depth
$

Thus, there were 1438 connection drops because of the maximum TCP assembly queue depth. The queue depth is configured via the new kernel parameter tcp_maxqueuelen, which was introduced as a fix for CVE-2018-6922 (see: The CVE-2018-6922 fix (FreeBSD vulnerability) and scp). The default value is 1000. With larger packet runtimes, the queue may overflow.

After increasing the kernel parameter tcp_maxqueuelen, no connection drops because of max assembly queue have occurred anymore.

July 13, 2019April 15, 2021

ProbeVue in Action: Monitoring the Queue Depth of Disks

Disk and storage systems support Tagged Command Queuing, i.e. connected servers can send multiple I/O jobs to the disk or storage system without waiting for older I/O jobs to finish. The number of I/O requests you can send to a disk before you have to wait for older I/O requests to complete can be configured using the hdisk queue_depth attribute on AIX. For many hdisk types, the value 20 for the queue_depth is the default value. In general, most storage systems allow even greater values for the queue depth.

With the help of ProbeVue, the utilization of the disk queue can be monitored very easily.

Starting with AIX 7.1 TL4 or AIX 7.2 TL0, AIX supports the I/O Probe Manager. This makes it easy to trace events in AIX’s I/O stack. If an I/O is started by the disk driver, this is done via the iostart function in the kernel, the request is forwarded to the adapter driver and then passed to the storage system via the host bus adapter. Handling the response is done by the iodone function in the kernel. The I/O Probe Manager supports (among others) probe events at these locations:

@@io:disk:iostart:read:<filter>
@@io::disk:iostart:write:<filter>
@@io:disk:iodone:read:<filter>
@@io::disk:iodone:write:<filter>

As a filter, e.g. a hdisk name like hdisk2 can be specified. The probe points then only trigger events for the disk hdisk2. This allows to perform an action whenever an I/O for a hdisk begins or ends. This would allow to measure how long an I/O operation takes or just to count how many I/Os are executed. In our example, we were interested in the utilization of the disk queue, i.e. the number of I/Os sent to the disk which are not yet completed. The I/O Probe Manager has a built-in variable __diskinfo for the iostart and iodone I/O probe events with the following fields (https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix .genprogc / probevue_man_io.htm):

name          char*     Name of the disk.
…
queue_depth   int       The queue depth of the disk (value from ODM)
cmds_out      int       Number of outstanding I/Os
…

The cmds_out field indicates how many I/Os have already been sent to the disk for which the I/O has not yet been completed (response has not yet arrived at the server).

The following section of code determines the minimum, maximum, and average number of entries in the disk queue:

@@io:disk:iostart:*:hdisk0     // Only I/Os for hdisk0 are considered
{
   queue = __iopath->cmds_out; // Store number of outstanding I/Os in variable queue
   ++numIO;                    // Number of I/Os (used for calculating the average)
   avg += queue;               // Add number of outstanding I/Os to variable avg
   if ( queue < min )
      min = queue;             // Check if minimum
   if ( queue > max )
      max = queue;             // Check if maximum
}

The calculated values are then printed once per second using the interval probe manager:

@@interval:*:clock:1000
{
   if ( numIO == 0 )
      numIO = 1;    // Prevent division by 0 when calculating the average
   if ( min > max )
      min = max;
   printf( "%5d  %5d  %5d\n" , min , avg/numIO , max );
   min = 100000;   // Reset variables for the next interval
   avg = 0;
   max = 0;
   numIO = 0;
}

The full script is available for download on our website: ioqueue.e.

Here is a sample run of the script for the disk hdisk13:

# ./ioqueue.e hdisk13
  min    avg    max
    1      1      2
    1      1      9
    1      1      2
    1      1      8
    1      1      2
    1      1      2
    1      1      8
    1      1     10
    1      1      2
    1      1      1
    1      1     10
    1      1      2
    1      1     11
...

The script expects an hdisk as an argument, and then outputs once per second the values determined for the specified hdisk.

In the example output you can see that the maximum number of entries in the disk queue is 11. An increase of the attribute queue_depth therefore makes no sense from a performance perspective.

Here’s another example:

# ./ioqueue.e hdisk21
  min    avg    max
    9     15     20
   11     17     20
   15     19     20
   13     19     20
   14     19     20
   17     18     20
   18     18     19
   16     19     20
   13     18     20
   18     19     19
   17     19     20
   18     19     20
   17     19     19
...

In this case, the maximum value of 20 (the hdisk21 has a queue_depth of 20) is reached on a regular basis. Increasing the queue_depth can improve throughput in this case.

Of course, the sample script can be expanded in various ways; to determine the throughput in addition, or the waiting time of I/Os in the wait queue, or even the position and size of each I/O on the disk. This example just shows how easy it is to get information about I/Os using ProbeVue.

Category: AIX