ProbeVue in Action: Monitoring the Queue Depth of Disks

Disk and storage systems support Tagged Command Queuing, i.e. connected servers can send multiple I/O requests to the disk or storage system without waiting for earlier requests to finish. The number of I/O requests that can be outstanding on a disk before further requests have to wait is configured on AIX with the hdisk attribute queue_depth. For many hdisk types, the default queue_depth is 20. Most storage systems allow larger values.

With the help of ProbeVue, the utilization of the disk queue can be monitored very easily.

Starting with AIX 7.1 TL4 and AIX 7.2 TL0, AIX supports the I/O Probe Manager, which makes it easy to trace events in the AIX I/O stack. When the disk driver starts an I/O, it does so via the iostart function in the kernel; the request is then forwarded to the adapter driver and passed to the storage system via the host bus adapter. The response is handled by the iodone function in the kernel. The I/O Probe Manager supports (among others) probe events at these locations:

@@io:disk:iostart:read:<filter>
@@io:disk:iostart:write:<filter>
@@io:disk:iodone:read:<filter>
@@io:disk:iodone:write:<filter>

As a filter you can specify, for example, an hdisk name such as hdisk2. The probe points then trigger only for I/Os to hdisk2. This makes it possible to perform an action whenever an I/O for an hdisk begins or ends, for example to measure how long an I/O operation takes or simply to count how many I/Os are executed. In our example, we are interested in the utilization of the disk queue, i.e. the number of I/Os sent to the disk that have not yet completed. For the iostart and iodone I/O probe events, the I/O Probe Manager provides a built-in variable __diskinfo with the following fields (https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.genprogc/probevue_man_io.htm):

name          char*     Name of the disk.
…
queue_depth   int       The queue depth of the disk (value from ODM)
cmds_out      int       Number of outstanding I/Os
…

The cmds_out field indicates how many I/Os have been sent to the disk but have not yet completed (the response has not yet arrived at the server).
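
As a minimal sketch of how these probe points and the filter can be used, the following clauses simply count the completed read and write I/Os for hdisk2 and print both counters once per second. The variable names reads and writes and the output format are chosen here for illustration only:

@@BEGIN
{
   reads = 0;                   // Completed read I/Os in the current interval
   writes = 0;                  // Completed write I/Os in the current interval
}

@@io:disk:iodone:read:hdisk2    // Fires when a read I/O for hdisk2 completes
{
   ++reads;
}

@@io:disk:iodone:write:hdisk2   // Fires when a write I/O for hdisk2 completes
{
   ++writes;
}

@@interval:*:clock:1000         // Once per second
{
   printf( "%5d  %5d\n" , reads , writes );
   reads = 0;                   // Reset counters for the next interval
   writes = 0;
}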

The following section of code determines the minimum, maximum, and average number of entries in the disk queue:

@@io:disk:iostart:*:hdisk0     // Only I/Os for hdisk0 are considered
{
   queue = __diskinfo->cmds_out; // Store number of outstanding I/Os in variable queue
   ++numIO;                    // Number of I/Os (used for calculating the average)
   avg += queue;               // Add number of outstanding I/Os to variable avg
   if ( queue < min )
      min = queue;             // Check if minimum
   if ( queue > max )
      max = queue;             // Check if maximum
}

The calculated values are then printed once per second using the interval probe manager:

@@interval:*:clock:1000
{
   if ( numIO == 0 )
      numIO = 1;    // Prevent division by 0 when calculating the average
   if ( min > max )
      min = max;
   printf( "%5d  %5d  %5d\n" , min , avg/numIO , max );
   min = 100000;   // Reset variables for the next interval
   avg = 0;
   max = 0;
   numIO = 0;
}

The full script is available for download on our website: ioqueue.e.

Here is a sample run of the script for the disk hdisk13:

# ./ioqueue.e hdisk13
  min    avg    max
    1      1      2
    1      1      9
    1      1      2
    1      1      8
    1      1      2
    1      1      2
    1      1      8
    1      1     10
    1      1      2
    1      1      1
    1      1     10
    1      1      2
    1      1     11
...

The script expects an hdisk as an argument and then prints, once per second, the values determined for that hdisk.
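
The excerpts above do not show how the hdisk argument reaches the probe specification or where the column header comes from. The following is only a sketch of how this could be done, not necessarily what the downloadable script does. It assumes that the positional parameter $1 may be used in the filter field of the probe specification and that the interpreter is installed as /usr/bin/probevue:

#!/usr/bin/probevue

@@BEGIN
{
   printf( "  min    avg    max\n" );   // Column header, printed once at startup
   min = 100000;                        // Same start values as the per-interval reset
   avg = 0;
   max = 0;
   numIO = 0;
}

@@io:disk:iostart:*:$1                  // Assumption: $1 is replaced by the first argument, e.g. hdisk13
{
   queue = __diskinfo->cmds_out;        // Number of outstanding I/Os for this disk
   ++numIO;
   avg += queue;
   if ( queue < min )
      min = queue;
   if ( queue > max )
      max = queue;
}

@@interval:*:clock:1000
{
   if ( numIO == 0 )
      numIO = 1;                        // Prevent division by 0
   if ( min > max )
      min = max;                        // No I/O in this interval
   printf( "%5d  %5d  %5d\n" , min , avg/numIO , max );
   min = 100000;
   avg = 0;
   max = 0;
   numIO = 0;
}

Once made executable with chmod +x, such a script is called as in the sample run above, for example ./ioqueue.e hdisk13.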

In the example output, the number of outstanding I/Os never exceeds 11. As long as the queue_depth of the disk is larger than that (for example the common default of 20), the queue never fills up, so increasing the queue_depth attribute would not improve performance.

Here’s another example:

# ./ioqueue.e hdisk21
  min    avg    max
    9     15     20
   11     17     20
   15     19     20
   13     19     20
   14     19     20
   17     18     20
   18     18     19
   16     19     20
   13     18     20
   18     19     19
   17     19     20
   18     19     20
   17     19     19
...

In this case, the maximum value of 20 (hdisk21 has a queue_depth of 20) is reached regularly, i.e. the queue is frequently full and further I/Os have to wait. Increasing the queue_depth can improve throughput in this case.
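
One way to put a number on "regularly" is to count how many I/Os per second are started while the queue is already full. The sketch below compares cmds_out with the queue_depth field of __diskinfo listed above. It assumes that cmds_out at iostart already includes the I/O being started (the minimum of 1 in the sample runs suggests this, but it is not confirmed here). The variable names started and full are chosen for illustration:

@@BEGIN
{
   started = 0;                 // I/Os started in the current interval
   full = 0;                    // I/Os started while the queue was full
}

@@io:disk:iostart:*:hdisk21     // Only I/Os for hdisk21 are considered
{
   ++started;
   if ( __diskinfo->cmds_out >= __diskinfo->queue_depth )
      ++full;                   // Queue had reached queue_depth at this point
}

@@interval:*:clock:1000         // Once per second
{
   printf( "%5d  %5d\n" , started , full );
   started = 0;                 // Reset counters for the next interval
   full = 0;
}

If the second column stays at or near 0, the queue is rarely the limiting factor. If it is a large share of the first column, a higher queue_depth is worth testing.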

Of course, the sample script can be extended in various ways, for example to also determine the throughput, the time I/Os spend in the wait queue, or even the position and size of each I/O on the disk. This example simply shows how easy it is to get information about I/Os using ProbeVue.