Exploring µC/OS-III’s Built-In Performance Measurements – Part II

Micrium's µC/OS-III kernel has a rich set of built-in instrumentation that collects real-time performance data. This data can be used to provide invaluable insight into your kernel based application, allowing you to have a better understanding of the run-time behavior of your system. Having this information readily available can, in some cases, uncover potential real-time programming errors and allow you to optimize your application.

In Part I of this post we examined, via µC/Probe, a number of the statistics built into µC/OS-III, including those for stack usage, CPU usage (total and per-task), context-switch counts, and signaling times for task semaphores and queues.

In this post, we'll examine the kernel’s built-in ability to measure interrupt-disable and scheduler lock time on a per task basis. Once again, we used µC/Probe to display, at run time, these values.

Measuring Interrupt Disable Time

Kernels and application code often need to disable interrupts to manage critical sections of code. In fact, it’s often useful, if not necessary to have an idea of just how much time interrupts are disabled in an application. Micrium added this capability in one of its utility module called µC/CPU which is provided with µC/OS-III. Disabling and enabling interrupts is the fastest and most efficient way to ensure access that code is accessed atomically. However, this must be done with great care as to not overly impact the responsiveness of your application.

Throughout Micrium’s code, you will notice the sequence as shown in Listing 1.

void  MyFunction (void) {
{
    CPU_SR_ALLOC();

    :
    :
    CPU_CRITICAL_ENTER();
    :
    /* Code executed under a critical section */
    :
    CPU_CRITICAL_EXIT();
    :
    :
}

Listing 1, Protecting a Critical Section of Code

The ‘:’ indicates a section of code that executes. The actual sequence is unimportant for this discussion. CPU_SR_ALLOC() is a macro that creates a local variable called cpu_sr. cpu_sr is used to save the current state of the CPU’s interrupt disable flag which is typically found in the CPU’s status register thus the ‘sr’ in the name.

CPU_CRITICAL_ENTER() is also a macro and it’s used to save the state of the CPU interrupt disable flag by placing the current value of the status register in cpu_sr before disabling further interrupts. CPU_CRITICAL_EXIT() simply restores the state of the status register (from the saved cpu_sr).

You enable interrupt disable time measurement by setting a configuration #define called CPU_CFG_INT_DIS_MEAS_EN to DEF_ENABLED. In this case, the macros CPU_CRITICAL_ENTER() and CPU_CRITICAL_EXIT() are each automatically altered to include a call to a function. CPU_CRITICAL_ENTER() will call CPU_IntDisMeasStart() immediately upon disabling interrupts and CPU_CRITICAL_EXIT() will call CPU_IntDisMeasStop() just before re-enabling interrupts. Both of these functions are presented in Listing 2 and 3, respectively.

#ifdef  CPU_CFG_INT_DIS_MEAS_EN
void  CPU_IntDisMeasStart (void)
{
    CPU_IntDisMeasCtr++;                                           //  (1)
    if (CPU_IntDisNestCtr == 0u) {                                 //  (2)
        CPU_IntDisMeasStart_cnts = CPU_TS_TmrRd();                 //  (3)       
    }
    CPU_IntDisNestCtr++;                                           //  (4)
}
#endif

Listing 2, Interrupt Disable Time Measurement – Start Measurement Function

  1. Here we keep track of how often interrupts are disabled. This value is not actually involved in the interrupt disable time measurement.
  2. A counter is used to track nesting of CPU_CRITICAL_ENTER() calls. In practice, however, it’s rare, if ever, that we actually nest this macro. However, the code is included in case your application needs to do this. That being said, it’s not recommended that you do.

Here we read the value of a free-running counter by calling CPU_TS_TmrRd(). If you enable interrupt disable time measurements, you (or the implementer of the port for the CPU you are using) will need to setup a free-running counter that would ideally be a 32 bit up counter that is clocked at the same rate as the CPU. The reason a 32-bit counter is preferable is because we use Time Stamping (thus the ‘TS’ acronym) elsewhere for delta time measurements and a 32-bit counter avoids having to account for overflows when measuring relatively long times. Of course, you can also use a 16-bit counter but most likely it would be clocked at a slower rate to avoid overflows. The value read from the timer is stored in a global variable (global to the µC/CPU module) called CPU_IntDisMeasStart_cnts.

  1. We then increment the nesting counter.
#ifdef  CPU_CFG_INT_DIS_MEAS_EN
void  CPU_IntDisMeasStop (void)
{
    CPU_TS_TMR  time_ints_disd_cnts;                               //  (5)

    CPU_IntDisNestCtr--;                                           //  (6)
    if (CPU_IntDisNestCtr == 0u) {                                 
        CPU_IntDisMeasStop_cnts = CPU_TS_TmrRd();                  
        time_ints_disd_cnts     = CPU_IntDisMeasStop_cnts -        //  (7)
                                  CPU_IntDisMeasStart_cnts;
        if (CPU_IntDisMeasMaxCur_cnts < time_ints_disd_cnts) {     //  (8)
            CPU_IntDisMeasMaxCur_cnts = time_ints_disd_cnts;
        }
        if (CPU_IntDisMeasMax_cnts    < time_ints_disd_cnts) {     
            CPU_IntDisMeasMax_cnts    = time_ints_disd_cnts;
        }
    }
}
#endif

Listing 3, Interrupt Disable Time Measurement – End Measurement Function

  1. A local variable of data type CPU_TS_TMR is allocated and is used in the delta-time measurement. This variable typically matches the word width of the free-running counter and is always an unsigned integer.
  2. We decrement the nesting counter and, if this is the last nested level, we proceed with reading the free-running counter in order to obtain a time-stamp of the end of interrupt disable time.
  3. The time difference is computed by subtracting the start time from the end time (i.e., stop time). The calculation always yields the proper delta time because the free-running counter is an up counter and, we used unsigned math.
  4. We keep track of two separate interrupt disable time peak detectors. One peak detector is used to determine the maximum interrupt disable time of each task (i.e., CPU_IntDisMeasMaxCur_cnts) and is reset during each context switch. code>CPU_IntDisMeasMax_cnts is the global maximum interrupt disable time and is never reset.

Interrupt disable time is measured in free-running counter counts. To get the execution time in µs, you would need to know the clocking rate of the free-running counter. For example, if you get 1000 counts and the counter is clocked at 100 MHz then the interrupt disable time would be 10 µs.

Figure 1 – Per-Task CPU Interrupt Disable Times

Figure 1 – Per-Task CPU Interrupt Disable Times

As you’d expect, the code to measure interrupt disable time adds measurement artifacts and should be accounted for. This overhead is found in the variable CPU_IntDisMeasOvrhd_cnts and is determined during initialization. However, the overhead is NOT accounted for in the code shown above. CPU_IntDisMeasOvrhd_cnts also represents free running counter counts. Depending on the CPU architecture, the measurement overhead (i.e., calling CPU_IntDisMeasStart() and CPU_IntDisMeasStop()) consists of typically between 50 and 75 CPU instructions.

µC/Probe is able to display the interrupt disable time on a per-task basis for µC/OS III when you select to display the kernel awareness capabilities. Also, µC/Probe takes care of the conversion of counts to µs and thus, all values are displayed in µs as shown in Figure 1.

Each row represents the value of a task. Not shown in the figure is that there is a name associated with each task and thus you can of course know which time is associated with what task. The highlighted task clearly shows that one task disables interrupts longer than the average. This could be a problem or, it might be within expected limits. Something the developer might need to investigate.

The per-task interrupt disable time came in handy recently as it helped us discover that a driver was disabling interrupts for over 2,500 microseconds! This value popped up like a sore thumb so we were able to quickly identify and correct the issue. Without this measurement, I’m not sure we would have been able to identify and correct this problem as quickly as we did.

Per-task Scheduler Lock Time

Figure 2 – Per-Task Scheduler Lock Time

Figure 2 – Per-Task Scheduler Lock Time

In order to avoid long interrupt disable times, µC/OS III locks, and allows you to lock the scheduler. This has the same effect as temporarily making the current task the highest priority task. However, while the scheduler is locked, interrupts are still accepted and processed.

As shown in Figure 2, µC/OS III measures the maximum scheduler lock time on a per task basis. Again, not shown in the figure is that there is a name associated with each task allowing you to know which time is associated with what task.

The values are displayed in microseconds and for most tasks shown in this example, the scheduler never gets locked. However, µC/OS III’s timer task locks the scheduler for close to 15 microseconds. In my example, I created 20 soft-timers and locking the scheduler is necessary to ensure that timers are treated atomically when they are updated.

As a general rule, you should avoid locking the scheduler in your application but, if you do so, try to lock it for shorter periods than what µC/OS III does.

Conclusion

In this post we examined, on a per task basis, statistics built into µC/OS-III for measuring interrupt disable time as well as scheduler lock time. These metrics can be invaluable in determining whether your embedded application satisfies some of its requirements. With µC/Probe, you get to see performance metrics and kernel status information live.

Tags: , , ,

Questions or Comments?

Have a question or a suggestion for a future article?
Don't hesitate to contact us and let us know!
All comments and ideas are welcome.