Cortex-A9 port cause FreeRTOS_Undefined exception

I’m using FreeRTOS 10 with Xilinx’s Zynq 7000 chip. It is running Linux on core 0 and FreeRTOS on core 1. When I apply load to core 1, FreeRTOS eventually crashes into the FreeRTOS_Undefined exception handler. This is the stack trace:

~~~
MyFreeRTOSApp
Thread #1 57005 (Suspended : Breakpoint)
FreeRTOS_Undefined() at port_asm_vectors.S:96 0x30000040
0x10101018
~~~

R14_und is 0x10101018, which looks like the value placed on the stack for R10 in port.c. How can I figure out what causes this exception?

Cortex-A9 port cause FreeRTOS_Undefined exception

Do you know which instruction generated the fault (that is, the value of the program counter at the time the fault occurred)? It should be obtainable from within the exception handler. I have an example of how to obtain the offending PC for Cortex-M code, but can’t recall how to do it for Cortex-A. Other than that, have you looked through the list of usual suspects here: https://www.freertos.org/FAQHelp.html Pay particular attention to the interrupt priority requirements.

Cortex-A9 port cause FreeRTOS_Undefined exception

Yes, the PC was 0x10101014 at the time the fault occurred. This is way outside the valid program space, which is 0x30000000 – 0x3800000. I’ve read through the general FAQ and the Cortex-A specific article. 0x10101014 is one word above 0x10101010, which is the value initialized in port.c. If I change 0x10101010 to, for instance, 0x12301010 in port.c, this is reflected in the crash: the PC is then 0x12301014.
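The two addresses reported so far are consistent with each other. On ARMv7-A, an undefined instruction exception taken in ARM state leaves R14_und pointing 4 bytes past the faulting instruction (2 bytes in Thumb state), so R14_und = 0x10101018 corresponds to a PC of 0x10101014. A minimal sketch of that arithmetic follows; both helper names are hypothetical and are not part of FreeRTOS or the Xilinx BSP.

~~~c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: recover the address of the undefined instruction
 * from the value found in R14_und (LR of undefined mode). */
static uint32_t undef_fault_address(uint32_t lr_und, bool thumb_state)
{
    /* LR_und is the faulting address + 4 in ARM state, + 2 in Thumb state. */
    return lr_und - (thumb_state ? 2u : 4u);
}

/* Hypothetical helper: how far an address lies above a register fill value
 * such as the 0x10101010 written to the initial task stack in port.c. */
static uint32_t offset_from_fill(uint32_t addr, uint32_t fill)
{
    assert(addr >= fill);
    return addr - fill;
}
~~~

With the values from this thread, `undef_fault_address(0x10101018, false)` gives 0x10101014, and `offset_from_fill()` gives 4 for both the original and the modified fill value.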

Cortex-A9 port cause FreeRTOS_Undefined exception

Right – agree that value almost certainly must have come from an initialised register value – looks like it has been used to hold a byte, hence the rest of the register is untouched. This could be a stack issue then, where returning from a function or interrupt, etc. has resulted in the wrong value being popped into the PC (by which I really mean, the address used to pop the PC was wrong as the stack pointer was wrong or stack corrupted). I think the first thing to do is check which task was running at the time, assuming it was a task, not an interrupt. You can do that by adding “(tskTCB*)pxCurrentTCB” to the expressions window in the debugger – that should then decode pxCurrentTCB as a task control block structure that can be expanded to see the task’s name as a string. Alternatively, if you store the handles of the tasks you create, the value of pxCurrentTCB will equal the task’s handle.

Cortex-A9 port cause FreeRTOS_Undefined exception

That was the idle task running. Does that mean it has to be an interrupt?

Cortex-A9 port cause FreeRTOS_Undefined exception

I wouldn’t say it ‘has’ to be an interrupt, but would agree it is very likely to be an interrupt. Unless your application has added any functionality to the idle task through an idle task hook function or a trace macro?

Cortex-A9 port cause FreeRTOS_Undefined exception

No functionality has been added to the idle task. I’ve only installed one interrupt handler on top of what the FreeRTOS port does (the tick handler). The interrupt handler is related to OpenAMP, which is used to communicate with the other core. I’m not sure how to proceed with debugging now; maybe you can give some tips?

Cortex-A9 port cause FreeRTOS_Undefined exception

After some more investigation I found that the stack trace changes when I disable optimizations. The stack trace now becomes this:

~~~
Thread #1 57005 (Suspended : Signal : SIGTRAP:Trace/breakpoint trap)
FreeRTOS_Undefined() at port_asm_vectors.S:96 0x30000040
ucHeap() at 0x31400994
~~~

I also found that the line causing the problem is in tasks.c. The macro traceTASK_CREATE() in the function prvAddNewTaskToReadyList() is defined by Tracealyzer, and it contains calls to portSET_INTERRUPT_MASK_FROM_ISR() and portCLEAR_INTERRUPT_MASK_FROM_ISR(). If I remove those two calls, everything works OK. If I don’t use Tracealyzer, I can mimic the same behaviour by adding

~~~
uint32_t irqstatus = portSET_INTERRUPT_MASK_FROM_ISR();
portCLEAR_INTERRUPT_MASK_FROM_ISR(irqstatus);
~~~

or

~~~
portDISABLE_INTERRUPTS();
portENABLE_INTERRUPTS();
~~~

after the traceTASK_CREATE() call. It executes around 1000 to 2000 times before it crashes. What can be the issue here?

Cortex-A9 port cause FreeRTOS_Undefined exception

Are you saying the problem is in the trace macro? So if you remove the trace macros altogether (by not defining them, which makes them take their default empty implementation) everything runs ok?

Cortex-A9 port cause FreeRTOS_Undefined exception

Yes, that is correct. Would you consider it to be an issue if the trace macro calls portSET_INTERRUPT_MASK_FROM_ISR() and portCLEAR_INTERRUPT_MASK_FROM_ISR()?

Cortex-A9 port cause FreeRTOS_Undefined exception

One big thing I see here is that the …FROM_ISR macros are supposed to be called from inside an ISR, while traceTASK_CREATE() isn’t going to be called from an ISR but from a task context, so the definition of that macro sounds incorrect.

Cortex-A9 port cause FreeRTOS_Undefined exception

If I disable Tracealyzer completely, and instead insert

~~~
portENTER_CRITICAL();
portEXIT_CRITICAL();
~~~

right after the traceTASK_SWITCHED_IN() call, this results in the same behaviour: the system crashes. It executes a few thousand times before it crashes. Should the system behave OK when doing this, or is it expected to crash?

Cortex-A9 port cause FreeRTOS_Undefined exception

I would expect that to crash. traceTASK_SWITCHED_IN() is executed inside an interrupt, and those macros are not interrupt safe. There are two reasons I would not expect that to work properly. First, exiting the critical section could result in interrupts becoming enabled in a part of the code where they should be disabled. Second, those macros use a critical nesting count that is part of a task’s context: each task has its own nesting count, so using it in the interrupt (which is not a task) doesn’t make sense, especially if you switch tasks before exiting the critical section. In this case it sounds like there could be an issue in the implementation of the trace macro, which is provided by Percepio.
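To illustrate the first reason, here is a toy model in plain C. It is only a sketch under simplifying assumptions: the struct and function names are hypothetical, and the real port masks interrupts through the interrupt controller rather than a boolean flag. It shows how an enter/exit pair that starts with a nesting count of zero ends up unmasking interrupts, even when the surrounding code was running with them masked.

~~~c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are NOT the port's real variables or functions. */
struct toy_cpu {
    bool irq_masked;   /* true while interrupts are masked */
    unsigned nesting;  /* stands in for the per-task critical nesting count */
};

static void toy_enter_critical(struct toy_cpu *cpu)
{
    cpu->irq_masked = true;
    cpu->nesting++;
}

static void toy_exit_critical(struct toy_cpu *cpu)
{
    assert(cpu->nesting > 0);
    cpu->nesting--;
    if (cpu->nesting == 0) {
        cpu->irq_masked = false;  /* outermost exit unmasks interrupts */
    }
}

/* Pair called from code that already runs with interrupts masked but with a
 * nesting count of zero (the interrupt case). Returns the final mask state. */
static bool toy_masked_after_pair_in_isr(void)
{
    struct toy_cpu cpu = { .irq_masked = true, .nesting = 0 };
    toy_enter_critical(&cpu);
    toy_exit_critical(&cpu);
    return cpu.irq_masked;
}

/* Nested use from a task: after the inner exit, interrupts stay masked. */
static bool toy_masked_after_inner_exit_in_task(void)
{
    struct toy_cpu cpu = { .irq_masked = false, .nesting = 0 };
    toy_enter_critical(&cpu);
    toy_enter_critical(&cpu);
    toy_exit_critical(&cpu);
    return cpu.irq_masked;
}
~~~

In this model `toy_masked_after_pair_in_isr()` returns false, meaning interrupts were unmasked in the middle of code that expected them to stay masked, while the nested task case correctly stays masked until the outermost exit.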

Cortex-A9 port cause FreeRTOS_Undefined exception

Would you consider it to be an issue if the trace macro calls portSET_INTERRUPT_MASK_FROM_ISR() and portCLEAR_INTERRUPT_MASK_FROM_ISR()? Disabling those lines in the trace implementation fixes the problem. I just need to figure out whether the problem is on my part or Percepio’s.

Cortex-A9 port cause FreeRTOS_Undefined exception

Not sure – they will enable global interrupts, but leave the interrupt mask in the correct state, and the kernel only really uses the mask. In that particular place (inside the context switch) they could well be necessary.

Cortex-A9 port cause FreeRTOS_Undefined exception

Who can answer this? I’ve replicated the issue on a Xilinx dev board with a “Xilinx lwIP TCP perf” example application, and the latest version of Tracealyzer, so I’m pretty sure my setup is not part of the problem.

Cortex-A9 port cause FreeRTOS_Undefined exception

If I recall correctly, our last exchange on this suggested an issue in the implementation of the trace macros, which are provided by Percepio, so you could ask Percepio. They generally have responsive support.

Cortex-A9 port cause FreeRTOS_Undefined exception

Not sure if it is related, but I had something similar on the Raspberry Pi with preemptive tasking, for a very interesting reason. The trace functions are C code, and when you call C code under the ARM ABI the stack has to be 8-byte aligned, even though the normal alignment for a local variable push etc. is only 4. This means you can randomly come into the interrupt with the stack in a 4-byte aligned position. So check the stack alignment restrictions on your system, as this seems to be common with ARM. I had to make sure I aligned the stack before calling out to C code; failing to do so would randomly crash some time later. So my IRQ handler ended up looking like this:

~~~
/* Save the current context */
portSAVE_CONTEXT
/* the stack pointer is 4-byte aligned at all times, but it must be 8-byte aligned  */
/* to call external C code  */
mov r1, sp
and r1, r1, #0x7                                    ;@ Ensure 8-byte stack alignment
sub sp, sp, r1                                      ;@ adjust stack as necessary
push {r1, lr}                                       ;@ Store adjustment and LR_svc

bl irqHandler                                       ;@ Call irqhandler

/* Reverse out 8 byte padding from above */
pop {r1, lr}                                        ;@ Restore LR_svc
add sp, sp, r1                                      ;@ Un-adjust stack

/* restore context which includes a return from interrupt */
portRESTORE_CONTEXT
~~~
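The alignment arithmetic in the handler above can be modelled in plain C. This is only a sketch: the function names are hypothetical, and it assumes a 32-bit stack pointer that is already 4-byte aligned on entry, as in the handler.

~~~c
#include <assert.h>
#include <stdint.h>

/* Mirrors "and r1, r1, #0x7": bytes to drop so that sp becomes 8-byte aligned. */
static uint32_t alignment_adjustment(uint32_t sp)
{
    assert((sp & 0x3u) == 0u);  /* assumption: sp is 4-byte aligned on entry */
    return sp & 0x7u;
}

/* Mirrors "sub sp, sp, r1": the stack pointer after the adjustment. */
static uint32_t aligned_sp(uint32_t sp)
{
    return sp - alignment_adjustment(sp);
}

/* Mirrors "push {r1, lr}": two 4-byte words, so 8-byte alignment is kept
 * at the point where the C function is called. */
static uint32_t sp_at_call(uint32_t sp)
{
    return aligned_sp(sp) - 8u;
}
~~~

For a stack pointer of 0x2000FFF4 the adjustment is 4, the aligned value is 0x2000FFF0, and the stack pointer at the call is 0x2000FFE8. Saving the adjustment in r1 alongside lr is what lets the handler undo it exactly after the call returns.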

Cortex-A9 port cause FreeRTOS_Undefined exception

Was this on the Cortex-A7 or the Cortex-A53 version of the Raspberry Pi? Were you using this with FreeRTOS?