Quality RTOS & Embedded Software

 Real time embedded FreeRTOS RSS feed 
Real time embedded FreeRTOS mailing list 
Quick Start Supported MCUs PDF Books Trace Tools Ecosystem TCP & FAT Training




Loading

CortexM3 FreeRTOS Oddity

Posted by sniperears on March 7, 2013
As I had the pleasure of fighting this recently, I thought I'd post it in case it might help someone else:

The CM3 port implementation for Yield(FromISR) is as follows:

void vPortYieldFromISR( void )
{
/* Set a PendSV to request a context switch. */
*(portNVIC_INT_CTRL) = portNVIC_PENDSVSET;
}


The danger here is that let's say you have a task that is only run once (init or something) and then deletes itself. That task might look something like:

static void initTask( void * pvParameters )
{
// Do important work
...
// work is all done, so time to go away
vTaskDelete(NULL);
}


If your compiler has optimizations turned on, it would probably call vTaskDelete with the Branch instruction instead of the BranchAndLink instruction (because in that stack frame there is nothing to return to -- BTW this happens in IAR with optimizations set to medium or higher).
Since the task is being deleted, the scheduler needs to do a context switch instead of return and therefore uses the vPortYieldFromISR Macro. This is all well and good except for the fact that setting the PendSV bit doesn't actually cause the interrupt, it just pends it and the NVIC decides when it's appropriate. This means that a few more instructions could get executed before the context switch happens. Well, the next instruction after setting the PendSV bit is to return (branch to link register) -- but the link register wasn't set in this case, because the optimizer knew there was nowhere to return to. This causes a fault. If there were just one more instruction after the vTaskDelete, the BL instruction is used, and the fault is avoided even though the code is never executed. Also, if optimizations are set to low or none (at least in IAR) the BL instruction is used regardless.

RE: CortexM3 FreeRTOS Oddity

Posted by Richard on March 8, 2013
Thanks for posting this, it is very interesting, and I am surprised this could happen. I would have to check how many instructions could execute after a PendSV was implemented, I would not have thought there would be any over and above completing any atomic instructions used in the code generated by the compiler to set the PendSV bit.

In the case you highlight, is the vPortYieldFromISR() function being inlined? Otherwise there would be instructions to return from vPortYieldFromISR() between the PendSV bit being set and vTaskDelete() reaching the end of its function.

Regards.

RE: CortexM3 FreeRTOS Oddity

Posted by sniperears on March 8, 2013
Thanks for the reply Richard,
I too was surprised (and quite baffled for a while) by this. To answer your question, Yes. This is what the end of vTaskDelete looks like:

...
taskEXIT_CRITICAL();

/* Force a reschedule if we have just deleted the current task. */
if( xSchedulerRunning != pdFALSE )
{
if( ( void * ) xTaskToDelete == NULL )
{
portYIELD_WITHIN_API();
}
}
}

Here is the chain of defines (aggregated for ease of readability):

#define portYIELD_WITHIN_API portYIELD // Scheduler layer
#define portYIELD() vPortYieldFromISR() // Port layer


So in the end it vTaskDelete() looks like this:

...
taskEXIT_CRITICAL();

/* Force a reschedule if we have just deleted the current task. */
if( xSchedulerRunning != pdFALSE )
{
if( ( void * ) xTaskToDelete == NULL )
{
portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;
}
}
}

And so after the PendSV bit is set -- Branch to Link with link register being 0 is attempted (and faults).

Cheers

RE: CortexM3 FreeRTOS Oddity

Posted by Dave on March 8, 2013
Putting to one side for a minute whether something in FreeRTOS needs to change if the setting of pendSV is inlined, does this example not also show a compiler bug? You say the compiler changes the branch instruction because it is not going to return, but then evidently it does try and return. Either the branch without link should jump to code that is completely self contained and never try to get back to where it jumped from or the branch with link should be used. The compiler does not know about any changes in execution paths that are brought about by FreeRTOS (context switches to other tasks), it only knows about execution paths expected by single threading C code.

RE: CortexM3 FreeRTOS Oddity

Posted by sniperears on March 8, 2013
After re-reading my response I realize I was quite unclear. No, vPortYieldFromISR() is not inlined -- it is implemented in port.c, but yes, the compiler is smart enough to call it with Branch instead of BranchAndLink.
Hope that's more clear.

RE: CortexM3 FreeRTOS Oddity

Posted by sniperears on March 8, 2013
Good point Dave. I agree. Nevertheless, the optimizations really do produce this phenomenon -- maybe it just does a blind jump hoping that execution path will change and expects a fault if it does not? Isn't it good when the compilers have hope? :)

Just for the fun of it I did a screen capture of the disassembly (mixed with C) where this problem exists.

The block of code that follows the vTaskDelete is a vApplicationIdleHook which increments a count and calls SysCtlSleep(). Notice the instruction used to call SysCtlSleep(). A blind Branch. And yet the implementation of SysCtlSleep is an assembly instruction WFI followed by a } which is a BX LR.

SysCtlSleep:
0xce46: 0xf004 0xbdbd B.W CPUwfi ; 0x119c4

CPUwfi:
0x119c4: 0xbf30 WFI
}
0x119c6: 0x4770 BX LR

So apparently this happens everywhere.

RE: CortexM3 FreeRTOS Oddity

Posted by sniperears on March 8, 2013
Image is here -- how come you can't edit here?


RE: CortexM3 FreeRTOS Oddity

Posted by Richard Damon on March 8, 2013
What the compiler is doing is a form of "Tail-Recursion" optimization where you can convert a call followed by a return with a jump (so when the function called does its return, it returns to the caller of the 1st function).

The problem here is that task function do not have a proper stack frame set up for returning, so when the vTaskDelete function executes its return, it heads the processor into invalid space.

What is probably the best option would be to protect against this in vTaskDelete by putting some non-optimizable away operation after the Yield to make sure this doesn't happen. Another would be to add a nop in the Yield operation in the port layer, but this adds a cycle to all yields.

RE: CortexM3 FreeRTOS Oddity

Posted by Richard on March 8, 2013
Thank you for your insights richard_damen. In other ports, where it is known that a software interrupt takes a few instructions to execute, or where branch delay slots are required (because the CPU executes the next instruction that is already loaded into the pipeline), I have added NOPs after the yield, or more recently read back from the software interrupt register to force the microcontroller to stall until the value is first written.

I will review this further in the documentation, and if necessary contact some of the ARM guys for their opinion, before deciding what to do, as I'm surprised it is necessary in this case. I will let you know...

Regards.

RE: CortexM3 FreeRTOS Oddity

Posted by Richard Damon on March 8, 2013
If there is only 1 instruction of delay, I would think that guarding vTaskDelete would make more sense, as executing the return instruction should normally be harmless. If the processor might execute farther than that, then the Yield definitely needs to handle the delay.

The cause of the problem is that multiple levels of code have preformed tail optimization, so that the Yield is attempting to directly return to the caller of the task function. The stack frame for the task function was not set up to handle a return, so executing the return cause a fault. This is where adding some code after the Yield in vTaskDelete would help, as then VTaskDelete could not use tail optimization, and Yield would have a valid stack frame to return from.

This is NOT a compiler bug, as the compiler expects that ALL user functions have proper stack frames to return from, so tail optimization is valid. The other solution would be to initialize the stack frame for a task so that doing a return would be valid, perhaps going to a stub that calls vTaskDelete(0).

RE: CortexM3 FreeRTOS Oddity

Posted by Richard on March 9, 2013
Its seems that worst case up to two instructions past the PendSV setting can be set (depending on lots of things including wait states, etc.), and that adding a memory barrier after writing to the PendSV bit would fix this particular case. The question then is, is it better to take the hit of a memory barrier instruction on every yield to fix a corner case when deleting a task. I'm still researching this and haven't concluded anything yet.

Regards.

RE: CortexM3 FreeRTOS Oddity

Posted by mahavirpj on March 10, 2013
Interesting discussion!

I think, I had seen this issue previously but was not aware (or reached to conclusion) that it was "tail call" optimization. Interestingly this was not reproducible with all gcc (Codesourcery arm-none-eabi-gcc) versions and hence was difficult to chase down. Neverthless, as Richard already mentioned we ended up fixing like below,

------------------------------------------
void vPortYieldFromISR( void )
{
/* Set a PendSV to request a context switch. */
*(portNVIC_INT_CTRL) = portNVIC_PENDSVSET;
__asm volatile ("dsb");
}
-----------------------------------------

But I have couple of questions:

1. Is above fix the correct one, or rather optimum one?
2. What would be worst case impact (if any) on performance of system?

Thanks.


RE: CortexM3 FreeRTOS Oddity

Posted by Richard on March 11, 2013
“1. Is above fix the correct one, or rather optimum one?”


Yes - that will fix the issue reported - but I'm not sure yet whether just adding two NOPs in place of the dsb would actually prove more optimal.

Regards.

RE: CortexM3 FreeRTOS Oddity

Posted by Richard on March 12, 2013
I will probably add a DSB followed by an ISB into the yield function. This is apparently very much overkill, but necessary to be within the specification for the Cortex-M architecture for all current and future parts (as some of the characteristics that dictate these things are implementation defined).

Regards.


[ Back to the top ]    [ About FreeRTOS ]    [ Sitemap ]    [ ]




Copyright (C) 2004-2010 Richard Barry. Copyright (C) 2010-2016 Real Time Engineers Ltd.
Any and all data, files, source code, html content and documentation included in the FreeRTOSTM distribution or available on this site are the exclusive property of Real Time Engineers Ltd.. See the files license.txt (included in the distribution) and this copyright notice for more information. FreeRTOSTM and FreeRTOS.orgTM are trade marks of Real Time Engineers Ltd.

Latest News:

FreeRTOS V9.0.0 is now available for download.


Free TCP/IP and file system demos for the RTOS


Sponsored Links

⇓ Now With No Code Size Limit! ⇓
⇑ Free Download Without Registering ⇑


FreeRTOS Partners

ARM Connected RTOS partner for all ARM microcontroller cores

Renesas Electronics Gold Alliance RTOS Partner.jpg

Microchip Premier RTOS Partner

RTOS partner of NXP for all NXP ARM microcontrollers

Atmel RTOS partner supporting ARM Cortex-M3 and AVR32 microcontrollers

STMicro RTOS partner supporting ARM7, ARM Cortex-M3, ARM Cortex-M4 and ARM Cortex-M0

Xilinx Microblaze and Zynq partner

Silicon Labs low power RTOS partner

Altera RTOS partner for Nios II and Cortex-A9 SoC

Freescale Alliance RTOS Member supporting ARM and ColdFire microcontrollers

Infineon ARM Cortex-M microcontrollers

Texas Instruments MCU Developer Network RTOS partner for ARM and MSP430 microcontrollers

Cypress RTOS partner supporting ARM Cortex-M3

Fujitsu RTOS partner supporting ARM Cortex-M3 and FM3

Microsemi (previously Actel) RTOS partner supporting ARM Cortex-M3

Atollic Partner

IAR Partner

Keil ARM Partner

Embedded Artists