Maximum Throughput for FC - PowerCampus 01

This article first walks through the individual phases of an I/O in an FC SAN and then simplifies the model to only 3 phases. The actual duration of each phase is shown in a diagram for several FC technologies. The round-trip time turns out to be decisive for the duration of an I/O. Based on this, the maximum possible throughput of a single I/O is determined and shown graphically for various I/O sizes as a function of the round-trip time.

The temporal phases of a write operation, somewhat simplified, are shown in the following picture:

1. Processing delay on the server: queuing, copying of data.
2. Transmission Delay: The time it takes for the data to be brought onto the medium.
3. Propagation Delay from the server port to the switch port.
4. Forwarding delay from the incoming port to the outgoing port on the FC switch.
5. Propagation delay from the switch port to the storage port.
6. Processing delay on the storage: Write operations usually go to the cache and are therefore fast; read operations may have to go to hard disks and can therefore be slower (especially with mechanical hard disks).
7. Transmission Delay: Time required to bring the response, or confirmation in the case of write operations, onto the medium.
8. Propagation delay from the storage port to the switch port.
9. Forwarding delay from the incoming port to the outgoing port on the FC switch.
10. Propagation Delay from the switch port to the server port.
11. Processing delay on the server: interrupt processing, copying of data.

Data is placed on the medium only in the second phase. In all other phases, no further data is sent toward the storage, so the medium is not used for data transmission. This means that no throughput is achieved in 10 out of 11 phases!

In the case of write operations, the data to be written is placed on the medium in phase 2, and the storage sends an acknowledgment (a small data frame) in phase 7. In the case of read operations, a small data frame with the request is sent in phase 2, and in phase 7 the storage sends the requested data (possibly a much larger amount of data).

The transmission delay TD can be calculated relatively easily; it results from the I/O size IOsize and the bandwidth BW of the medium:

TD = IOsize / BW

The following table lists the values for the transmission delay for some I/O sizes and FC technologies.
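As a rough illustration of the formula TD = IOsize / BW, the following Python sketch computes transmission delays for a few I/O sizes. The bandwidth values are assumptions (nominal per-direction data rates of 100 MB/s per 1G of FC generation, with 1 MB = 10^6 bytes and 1 KB = 1024 bytes), and the function names are hypothetical; the output is not meant to reproduce the article's table exactly.

```python
# Assumed nominal per-direction data rates in MB/s (1 MB = 10**6 bytes).
NOMINAL_MB_PER_S = {"1G": 100, "2G": 200, "4G": 400, "8G": 800, "16G": 1600, "32G": 3200}


def transmission_delay_ms(io_size_bytes, bandwidth_mb_per_s):
    """TD = IOsize / BW, returned in milliseconds."""
    return io_size_bytes / (bandwidth_mb_per_s * 1_000_000) * 1000


def td_table(io_sizes_kb, techs=NOMINAL_MB_PER_S):
    """Return {io_size_kb: {technology: TD in ms}}; 1 KB is taken as 1024 bytes."""
    return {
        kb: {tech: transmission_delay_ms(kb * 1024, bw) for tech, bw in techs.items()}
        for kb in io_sizes_kb
    }


if __name__ == "__main__":
    table = td_table([4, 32, 256, 1024])
    # Header row with one column per FC technology.
    print("I/O size" + "".join(f"{tech:>10}" for tech in NOMINAL_MB_PER_S))
    for kb, row in table.items():
        print(f"{kb:>5} KB" + "".join(f"{td:>10.4f}" for td in row.values()))
```

All values are printed in milliseconds. Changing the dictionary of data rates is enough to evaluate other link speeds.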

In order to determine the maximum theoretical throughput for an I/O of a certain size, we make a few simplifying assumptions:

1. The delay in processing on the server (phase 1 and phase 11) is so short that it does not have to be taken into account.
2. We combine phases 3, 4 and 5 into the propagation delay from server port to storage port; which phase contributes which portion is irrelevant from the server’s point of view.
3. We assume that the delay in processing on the storage (phase 6) is so short that it does not have to be taken into account. When writing, the data is only written to the cache anyway, which is very fast, and when reading, the data could already be in the cache. (As a rule, however, the delay caused by the processing on the storage is not negligible and also depends on the I/O size.)
4. The response (acknowledgment) is very small for a write operation, so we assume that the time to bring the acknowledgment onto the medium is negligible.
5. For the way back from storage to server, we also combine phases 8, 9 and 10 as a propagation delay from storage port to server port.

A single I/O then only consists of the following 3 phases:

1. Transmission Delay (TD): The time it takes to transfer the data to the medium depends on the size of the I/O (see table above).
2. Propagation Delay: from the server to the storage.
3. Propagation Delay: from storage to server.

For illustration purposes, the time sequence for a 32 KB write I/O for 1G, 8G and 32G FC is shown below. The propagation delay for the figures was assumed to be 0.1 ms, which is an extremely good value. In many cases the propagation delay will be longer.

The figures show very clearly that the characteristics of an I/O have changed significantly with newer FC standards (higher bandwidths). With 1G FC the transmission delay dominates, accounting for more than 50% of the duration of an I/O. With 8G FC its share is only approx. 20%, and with 32G FC only approx. 5% (taking the 0.1 ms as the delay per direction)! The time it takes to bring the data onto the medium is almost irrelevant with 32G FC. The duration of an individual I/O is thus mainly determined by the propagation delay.
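The shares of the three phases can be estimated with a short Python sketch. It rests on several assumptions that are not stated in the article: the 0.1 ms propagation delay applies per direction, the data rates are the nominal 100, 800 and 3200 MB/s, and 32 KB means 32 × 1024 bytes. The function names are hypothetical.

```python
def io_phases_ms(io_size_bytes, bandwidth_mb_per_s, one_way_delay_ms):
    """Durations of the 3 phases of the simplified model, in milliseconds."""
    td_ms = io_size_bytes / (bandwidth_mb_per_s * 1_000_000) * 1000
    return {
        "transmission": td_ms,             # phase 1: data onto the medium
        "to_storage": one_way_delay_ms,    # phase 2: server -> storage
        "to_server": one_way_delay_ms,     # phase 3: storage -> server
    }


def transmission_share(phases):
    """Fraction of the total I/O duration spent in the transmission phase."""
    return phases["transmission"] / sum(phases.values())


if __name__ == "__main__":
    # Assumed nominal data rates in MB/s for the three technologies in the text.
    for tech, bw in {"1G": 100, "8G": 800, "32G": 3200}.items():
        phases = io_phases_ms(32 * 1024, bw, one_way_delay_ms=0.1)
        total = sum(phases.values())
        print(f"{tech:>3} FC: total {total:.3f} ms, "
              f"transmission share {transmission_share(phases):.1%}")
```

With a different reading of the 0.1 ms value (for example as the delay for both directions together), the shares shift, but the trend stays the same.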

In order to calculate the maximum possible throughput, we combine phases 2 and 3 (propagation delay server-storage and storage-server) to form what is known as the round-trip time (RTT). The duration of a single complete I/O (completion time, CT) is then:

CT = TD + RTT + additional delays

(“Additional delays” covers all the delays that we have assumed to be zero for the sake of simplicity.)

The throughput for a single I/O results from the I/O size (IOsize) divided by the time required for the I/O (CT):

Throughput = IOsize / CT = IOsize / ( TD + RTT + additional delays ) <= IOsize / ( TD + RTT )

The actual throughput can therefore at best be IOsize / (TD + RTT), if there is no delay in processing or forwarding at any point. With a given I/O size, the maximum achievable throughput depends on the transmission delay (TD) and the round-trip time (RTT).
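Dividing this upper bound by the bandwidth gives the fraction of the link that a single I/O can use at best. The following rearrangement is a sketch that uses only the definitions above (in particular TD = IOsize / BW); the symbol E for this fraction is introduced here for illustration and is not part of the original notation.

```latex
\[
E \;=\; \frac{\mathrm{Throughput}_{\max}}{BW}
  \;=\; \frac{IOsize}{BW \,(TD + RTT)}
  \;=\; \frac{TD}{TD + RTT}
  \;=\; \frac{1}{1 + \dfrac{RTT \cdot BW}{IOsize}}
\]
```

For a fixed I/O size and round-trip time, E becomes smaller as BW grows. The product RTT · BW is the amount of data that the link could have carried during one round trip.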

This is shown graphically below for round-trip times up to 2 ms and various FC technologies.

The graphs show clearly: The higher the bandwidth, the faster the possible throughput decreases as the round-trip time increases! While an I/O of size 256 KB with a round-trip time of 0.5 ms can still achieve a throughput of up to about 80% of the available bandwidth with 1G FC, it can achieve, at best, about 12.5% of the bandwidth with 32G FC!
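The comparison can be checked approximately with the following Python sketch of the bound IOsize / (TD + RTT). The data rates are again assumed nominal values (100 MB/s per 1G, 1 MB = 10^6 bytes), 256 KB is taken as 256 × 1024 bytes, and the function names are hypothetical. Under these assumptions the results land in the same range as the percentages quoted above, not on exactly the same values.

```python
def max_throughput_mb_per_s(io_size_bytes, bandwidth_mb_per_s, rtt_ms):
    """Upper bound IOsize / (TD + RTT) for one I/O at a time, in MB/s."""
    td_s = io_size_bytes / (bandwidth_mb_per_s * 1_000_000)
    ct_s = td_s + rtt_ms / 1000  # completion time without additional delays
    return io_size_bytes / ct_s / 1_000_000


def bandwidth_fraction(io_size_bytes, bandwidth_mb_per_s, rtt_ms):
    """Share of the link bandwidth that a single I/O can use at best."""
    return max_throughput_mb_per_s(io_size_bytes, bandwidth_mb_per_s, rtt_ms) / bandwidth_mb_per_s


if __name__ == "__main__":
    io_size = 256 * 1024
    # Assumed nominal data rates in MB/s.
    for tech, bw in {"1G": 100, "8G": 800, "32G": 3200}.items():
        for rtt_ms in (0.1, 0.5, 1.0, 2.0):
            frac = bandwidth_fraction(io_size, bw, rtt_ms)
            print(f"{tech:>3} FC, RTT {rtt_ms:.1f} ms: "
                  f"{max_throughput_mb_per_s(io_size, bw, rtt_ms):8.1f} MB/s ({frac:.1%})")
```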

Techniques such as read-ahead or asynchronous I/O are therefore extremely important for high throughput with new FC technologies. A high throughput can only be achieved if further I/Os are applied to the medium during the propagation delay. Applications that do single-threaded direct I/O (bypassing the file system cache), for example, cannot achieve high throughput.
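How many I/Os have to be in flight to keep the link busy can be estimated with a small Python sketch. It uses a deliberately simple pipelining model that is an assumption of this example and not taken from the article: all I/Os have the same size, N of them are always outstanding, and the only limits are the link bandwidth and the completion time TD + RTT. The function names and the nominal data rates are hypothetical as well.

```python
import math


def ios_to_saturate(io_size_bytes, bandwidth_mb_per_s, rtt_ms):
    """Smallest number of parallel I/Os whose transmissions fill one completion time."""
    td_ms = io_size_bytes / (bandwidth_mb_per_s * 1_000_000) * 1000
    return math.ceil((td_ms + rtt_ms) / td_ms)


def throughput_with_parallel_ios(n, io_size_bytes, bandwidth_mb_per_s, rtt_ms):
    """Throughput in MB/s with n outstanding I/Os, capped at the link bandwidth."""
    td_s = io_size_bytes / (bandwidth_mb_per_s * 1_000_000)
    single_mb_per_s = io_size_bytes / (td_s + rtt_ms / 1000) / 1_000_000
    return min(bandwidth_mb_per_s, n * single_mb_per_s)


if __name__ == "__main__":
    io_size = 256 * 1024
    # Assumed nominal data rates in MB/s.
    for tech, bw in {"1G": 100, "32G": 3200}.items():
        needed = ios_to_saturate(io_size, bw, rtt_ms=0.5)
        print(f"{tech:>3} FC: about {needed} parallel I/Os needed to fill the link")
        for n in (1, 2, 4, 8):
            tp = throughput_with_parallel_ios(n, io_size, bw, rtt_ms=0.5)
            print(f"      {n} outstanding: {tp:7.1f} MB/s")
```

Real systems add queueing and processing delays on server and storage, so the numbers from this model are best read as lower bounds on the required parallelism.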

Note: The conclusion of this article is not that the throughput of new FC adapters with higher bandwidth is generally very low, but that a high throughput can only be achieved by sending as many I/Os as possible in parallel, so that the transmission medium (fiber optic) is kept continuously busy. Applications that worked well with earlier generations of FC adapters will not necessarily achieve higher throughput with newer FC adapters.
