One way to gauge the performance of a TCP/IP stack or a TCP/IP-based application is to measure its throughput, that is, how many bits per second the device can process from the physical layer up to the application layer. One of the most popular tools for throughput tests is IPerf. It has several implementations, but in its purest form it is a cross-platform command-line application that acts as a client or a server to transmit a data payload to, or receive one from, a remote host. The tool lets the user specify how much data to send, over which transport, at what time interval and for how long, and it yields a detailed report at the end of the test (Figure 1).
- Default interval is 1 second.
- The throughput during this interval is the amount under the “Transfer” column, divided by the interval (in seconds).
- The “Bandwidth” column reports the rate measured over that interval, that is, the transferred amount expressed in bits per second.
Figure 1: Typical IPerf results screen.
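The per-interval arithmetic described in the bullets above can be sketched in a few lines of C. The function name interval_throughput_bps is hypothetical, and the sketch assumes the “Transfer” value has already been converted to bytes.

```c
#include <assert.h>

/* Throughput for one report interval, in bits per second.
 * transfer_bytes: value from the "Transfer" column, converted to bytes
 *                 (assumption: the caller does the unit conversion).
 * interval_sec:   length of the report interval (1 second by default). */
double interval_throughput_bps(double transfer_bytes, double interval_sec)
{
    assert(interval_sec > 0.0);               /* guard against divide-by-zero */
    return (transfer_bytes * 8.0) / interval_sec;
}
```

For example, 1,250,000 bytes transferred in a 1-second interval works out to 10,000,000 bits per second.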
The problem with calculating throughput this way is that embedded implementations of IPerf might use other protocols, such as TELNET or serial, to output the intermediate results at each interval. This can introduce overhead that, depending on the microprocessor’s performance, may lower the measured throughput figures.
For this reason, a case could be made to isolate these externalities from the device under test and conduct throughput tests manually and independently, using a combination of Wireshark, WinSock2 or BSD socket programming, and the embedded stack. Wireshark provides a capture summary (Statistics -> Capture File Properties on the menu bar) that quickly lists the throughput of a TCP stream and the transferred UDP datagrams. However, unlike TCP, UDP has no way to acknowledge received data back to the sender. If the PC acts as the client and our embedded device as the server, the Wireshark method alone can never tell us how many of the captured frames made it through the TCP/IP stack and reached the application layer.
To solve this, we can conceive a µC/OS-III application task (like the AppUDP_ServerTask() shown in Figure 3) that instantiates a UDP server (App_UDP_Server() in Figure 4) whose only role is to listen for incoming datagrams on a specified port (UDP_SERVER_PORT/20002) and consume whatever data has been received in chunks determined by the receive buffer size.
Figure 2: Definitions and declarations for UDP server instance.
Figure 3: UDP server task definition.
Figure 4: UDP server instance.
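As a rough illustration of the same idea on a desktop host, here is a minimal sketch that uses POSIX/BSD sockets instead of the µC/TCP-IP NetSock API used on the embedded target. The function names udp_server_open and udp_server_drain are hypothetical, the RX_BUF_SIZE value is an assumption, and the idle timeout is added only so the sketch terminates; an embedded server task would normally loop forever.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define UDP_SERVER_PORT 20002u  /* port named in the text */
#define RX_BUF_SIZE     1472u   /* assumed value: max UDP payload for a 1500-byte MTU */

unsigned long Hit_Rate_Ctr = 0; /* datagrams that reached the application layer */

/* Open a UDP socket bound to `port` on all interfaces. Returns -1 on failure. */
int udp_server_open(uint16_t port)
{
    struct sockaddr_in addr;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0) {
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Receive and discard datagrams until none arrives for idle_ms milliseconds.
 * Every successful receive increments Hit_Rate_Ctr.
 * Returns the number of datagrams counted by this call. */
unsigned long udp_server_drain(int sock, long idle_ms)
{
    static char    rx_buf[RX_BUF_SIZE];
    struct timeval tv;
    unsigned long  counted = 0;

    assert(sock >= 0 && idle_ms > 0);
    tv.tv_sec  = idle_ms / 1000;
    tv.tv_usec = (idle_ms % 1000) * 1000;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0) {
        return 0;
    }
    for (;;) {
        ssize_t len = recvfrom(sock, rx_buf, sizeof rx_buf, 0, NULL, NULL);
        if (len < 0) {
            break;              /* receive timeout or error ends the run */
        }
        Hit_Rate_Ctr++;         /* data is consumed and dropped, only counted */
        counted++;
    }
    return counted;
}
```

The payload is deliberately ignored: the only thing the test needs from the server is the count of datagrams that survived the trip up the stack.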
On the client side, one could simply have a Winsock or BSD UDP client that fires datagrams at intervals down to the microsecond range (or as fast as the hardware will allow) in order to stress-test the device.
Figure 5: Winsock-based UDP client.
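For readers on a POSIX host, the BSD-socket variant mentioned above might look like the following sketch. It is not the Winsock client from Figure 5: the function name udp_client_blast and the fill pattern are hypothetical, and pacing relies on nanosleep(), whose real granularity depends on the operating system and hardware.

```c
#define _POSIX_C_SOURCE 200809L /* exposes nanosleep() under -std=c99 */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define TIMEOUT_uS 100L         /* gap between datagrams, as in the text */

/* Send `count` datagrams of `payload_bytes` each to ip:port, pausing
 * gap_us microseconds after every send.
 * Returns the number of datagrams handed to the stack, or -1 on setup failure. */
long udp_client_blast(const char *ip, uint16_t port, long count,
                      size_t payload_bytes, long gap_us)
{
    static char        payload[1472];   /* assumed upper bound on payload size */
    struct sockaddr_in to;
    struct timespec    gap;
    long               sent = 0;
    int                sock;

    assert(payload_bytes <= sizeof payload && gap_us >= 0);

    memset(&to, 0, sizeof to);
    to.sin_family = AF_INET;
    to.sin_port   = htons(port);
    if (inet_pton(AF_INET, ip, &to.sin_addr) != 1) {
        return -1;                      /* not a valid dotted-quad address */
    }
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) {
        return -1;
    }
    memset(payload, 0xA5, sizeof payload);  /* arbitrary fill pattern */
    gap.tv_sec  = gap_us / 1000000L;
    gap.tv_nsec = (gap_us % 1000000L) * 1000L;

    for (long i = 0; i < count; i++) {
        ssize_t n = sendto(sock, payload, payload_bytes, 0,
                           (struct sockaddr *)&to, sizeof to);
        if (n == (ssize_t)payload_bytes) {
            sent++;
        }
        nanosleep(&gap, NULL);          /* actual spacing may be longer */
    }
    close(sock);
    return sent;
}
```

A call such as udp_client_blast("192.168.2.20", 20002, 10000, 1472, TIMEOUT_uS) would send 10,000 full-size datagrams to the example target address used in this post. The return value only says how many datagrams left the client, not how many the target received.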
Since the scope of this blog only covers Winsock, we need to install Cygwin or MinGW so that we can run the gcc compiler and build the executable using the following command in the Windows terminal or Cygwin:
gcc udp_c.c -o udp_client.exe -std=c99 -lwsock32
and then run udp_client.exe to start the program. Line 12 of Figure 5 contains a preprocessor #define (TIMEOUT_uS) that controls how far apart the datagrams are sent (set to 100 µs). As stated before, the actual spacing can deviate from this value due to hardware constraints.
Before udp_client.exe is running, we first need to run our UDP server application on the embedded target and then start a capture in Wireshark to calculate the throughput. Filter the capture with ip.addr==192.168.2.20 && !icmp, replacing 192.168.2.20 with the IP address of the embedded target, and click the Start Capture button.
Figure 6: Wireshark capture of UDP datagrams sent out by udp_client.exe.
Once udp_client.exe has finished sending datagrams and Wireshark no longer shows any of them arriving, stop the capture by clicking the Stop button and pause the debug session on the embedded target. As Figure 4 shows, whenever the call to NetSock_RxDataFrom() returns without error we increment a global variable named Hit_Rate_Ctr, a simple counter that tallies how many of the sent frames actually made it to the application layer and were not dropped. We can finally calculate the throughput of this capture in Mbps by plugging all the information we’ve obtained up to this point into the following formula:
UDP Rx Throughput (Mbps) = (Hit_Rate_Ctr / nbr_of_frames) * [nbr_of_frames * payload_bytes * 8 bits] / [dur_in_sec * 2^20]
where nbr_of_frames is how many frames were captured with the applied filter and transmitted by our udp_client.exe program (see Figure 6), payload_bytes is the length of the payload carried by each datagram (or RX_BUF_SIZE in bytes), and dur_in_sec is how much time has elapsed since the first frame was captured, not since the beginning of the capture. To make this value easier to calculate, a time reference can be added by right-clicking on the first captured frame and clicking on “Set/Unset Time Reference” in the context menu.
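The calculation can be sketched as a small C helper. This is one reading of the formula, not a definitive implementation: it assumes Hit_Rate_Ctr is a plain count of delivered datagrams, so the delivered fraction is Hit_Rate_Ctr divided by nbr_of_frames, and it keeps the 2^20 divisor from the text. The function name udp_rx_throughput_mbps is hypothetical.

```c
#include <assert.h>

/* UDP receive throughput in Mbps, using the 2^20 divisor from the text.
 * hit_rate_ctr:  datagrams counted by the server application
 * nbr_of_frames: datagrams seen in the filtered Wireshark capture
 * payload_bytes: payload length of each datagram, in bytes
 * dur_in_sec:    time elapsed since the first captured frame */
double udp_rx_throughput_mbps(unsigned long hit_rate_ctr,
                              unsigned long nbr_of_frames,
                              unsigned long payload_bytes,
                              double        dur_in_sec)
{
    double hit_rate;
    double captured_bits;

    assert(nbr_of_frames > 0 && dur_in_sec > 0.0);

    /* Fraction of captured datagrams that reached the application layer. */
    hit_rate = (double)hit_rate_ctr / (double)nbr_of_frames;

    /* Total payload bits that appeared on the wire during the capture. */
    captured_bits = (double)nbr_of_frames * (double)payload_bytes * 8.0;

    return hit_rate * captured_bits / (dur_in_sec * 1048576.0);
}
```

As a sanity check with made-up numbers: 1,024 captured frames of 1,024 bytes each, all of them counted by the server, over 8 seconds gives exactly 1.0 Mbps. If only 512 of them had been counted, the result would drop to 0.5 Mbps.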