arm – nested irqs

I’m evaluating FreeRTOS for use in an ARM environment. It does not seem to support nested IRQs, and does not support a priority-based interrupt service routine, which, by my definition, is fundamental to an RTOS. Perhaps I do not see something; could you please point it out?

The problem will not occur unless and until (a) you depend on the interrupt controller to prioritize IRQs, and (b) you re-enable IRQs within an IRQ.

Here’s the short problem description: FreeRTOS runs in SVC mode at startup, and switches to SYS mode at the first task. The ARM, unlike other CPUs, does not automatically push anything on the stack. This is also true of IRQs; the ARM instead switches to IRQ mode. The current CPSR is saved, and the return address + 4 is put into the IRQ-R14 (LR) register. If a second, higher-priority IRQ occurs, IRQ-R14 is overwritten.

Yes, one could make this work by writing hand-crafted IRQ routines in ASM, but it is not possible to _reasonably_ write any IRQs in C using this approach.

----------------

Here’s the scenario where I see it failing:

0) Initial state: no IRQs pending, tasks running in "system mode".
1) IRQ occurs.
2) CPU switches to IRQ mode.
   IRQ-R14 holds the return address.
   SPSR holds the saved PSR value.
3) Code saves some registers and, for example, does "sub lr,lr,#4".
4) NOTE: Nowhere do I see the code leave ARM IRQ mode. That is the fundamental problem!
5) ISR routines are typically written in C these days.
   Example: Demo/ARM7_LPC2106_GCC/serial/serialISR.c
   They often call other C functions in the standard way.
6) Remember: the chip has a priority-based IRQ controller.
   Assume that a low-priority IRQ has just occurred, and another HIGH PRIORITY IRQ is about to occur.
   To "nest" interrupts you:
   (a) allow the IRQ controller to manage priorities
   (b) then just re-enable IRQs in the CPU early in the IRQ routine.
   i.e. often you handle the critical stuff as fast as possible, then re-enable IRQs, and take care of the time-consuming part later, but as part of the same interrupt.
7) During normal function calls the CPU and the compiler make use of R14 (LR) as the return address.
   But remember this question: which R14 is it? I direct your attention to the ARM ARM book, 2.3 Registers.
   Remember: ‘hidden’ functions also include the compiler run-time "divide" or "mod" routines.
8) In this example, say your C code was dividing something, and was in the middle of the ‘divide’ compiler helper routine.
   Since the CPU is in IRQ mode, R14/LR => IRQ_R14. [The same can be said about R13/SP.]
   The contents of the LR register (depending on the optimization level, type of function, etc.) hold the return address back to your code, or LR is used as an extra temp register in between calls.
9) The plan here is "nested interrupts".
   The IRQ has not been ACKed in the IRQ controller, but the CPU IRQ is re-enabled.
10) The higher-priority IRQ occurs.
    PROBLEM: remember, the CPU is still in IRQ MODE.
    R14 holds the ‘return address from divide’ [or something else; R13/SP is a problem also].
11) PROBLEM: the current "r14" is the return address to your code from "divide_helper".
12) The CPU takes the IRQ, and you just lost the return address from "divide".

Am I missing something? Or does FreeRTOS not support any form of interrupt priority and/or nested interrupts?

====

What I’m looking for, and cannot seem to find, is the MSR instruction during any ISR handling that shows the CPU switching to some other mode of operation.
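To make steps 8) to 12) concrete, here is a minimal host-side sketch of the banked R14 problem. It is a toy model, not real ARM or FreeRTOS code: the type `toy_cpu_t`, the helper names and the addresses are all made up for illustration. It only shows that a return address held in R14 is lost when a nested IRQ arrives while still in IRQ mode, and survives if the handler has switched to System mode first.

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model of the two ARM modes that matter here (hypothetical names). */
typedef enum { MODE_SYS, MODE_IRQ } cpu_mode_t;

typedef struct {
    cpu_mode_t mode;
    uint32_t lr_sys;   /* R14 as seen in User/System mode */
    uint32_t lr_irq;   /* R14 as seen in IRQ mode (banked copy) */
} toy_cpu_t;

/* Address of whichever R14 the current mode sees. */
static uint32_t *current_lr(toy_cpu_t *cpu)
{
    return (cpu->mode == MODE_IRQ) ? &cpu->lr_irq : &cpu->lr_sys;
}

/* Taking an IRQ always writes the return address to LR_irq and enters IRQ mode. */
static void take_irq(toy_cpu_t *cpu, uint32_t return_address)
{
    cpu->lr_irq = return_address;
    cpu->mode = MODE_IRQ;
}

/* Returns true if a C helper's return address in R14 survives a nested IRQ. */
bool lr_survives_nested_irq(bool switch_to_sys_first)
{
    toy_cpu_t cpu = { MODE_SYS, 0u, 0u };
    const uint32_t helper_return = 0x1000u;  /* e.g. return address from a divide helper */

    take_irq(&cpu, 0x8000u);                 /* first, low-priority IRQ */
    if (switch_to_sys_first) {
        /* Stand-in for the MSR mode switch. A real handler must push
           LR_irq and SPSR_irq to a stack before doing this. */
        cpu.mode = MODE_SYS;
    }
    *current_lr(&cpu) = helper_return;       /* BL to the helper: R14 = return address */

    cpu_mode_t interrupted_mode = cpu.mode;
    take_irq(&cpu, 0x9000u);                 /* nested, high-priority IRQ */
    cpu.mode = interrupted_mode;             /* nested handler returns */

    return *current_lr(&cpu) == helper_return;
}
```

Calling `lr_survives_nested_irq(false)` models the failing scenario in the list above; `lr_survives_nested_irq(true)` models a handler that leaves IRQ mode before re-enabling interrupts.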

arm – nested irqs

> I’m evaluating FreeRTOS for use in an ARM environment,
> it does not seem to be able to support nested IRQs, and does not support
> a priority based interrupt service routine. Which, in my definition is
> fundamental to a RTOS.
>
> Perhaps I do not see something, could you please point it out?

The code as per the FreeRTOS.org download does not implement nesting of interrupts for the ARM. I have an example somewhere that implements it, but have never had the need to use it in anger.

The idea is to process interrupts at the task level, within handler tasks. The ISR grabs the data from the interrupting peripheral, passes the data to the handler task, clears the interrupt, and does nothing else. This is therefore very fast, negating the need for interrupt nesting. The handler task will generally be created at a priority higher than the application tasks and, as the handler is woken from within the ISR, execution passes directly from the exiting ISR to the handler task (contiguous in time, as if it were all done in the ISR). The processing then occurs at the task level with interrupts enabled. Handler tasks can be assigned a priority to reflect their relative priority in the system, providing a more flexible approach than many hardware interrupt prioritisation systems (some architectures, not others). I believe from reading the FAQ that this is what the ARM RTOS guys suggest also.

The FreeRTOS.org demos include two methods of entering interrupt routines. The first is to save and restore the context on entry to the ISR function, with each ISR vectored directly from the interrupt controller. The second is to save and restore the context from a single common interrupt entry point, then call the relevant handler having inspected the interrupt controller to see which handler to execute. To nest interrupts the second method is required so you can switch to user/system mode prior to calling the interrupt function itself. Both methods have pros and cons (what doesn’t). The two biggest problems with the latter method are 1) non-deterministic stack usage, and 2) saving/restoring the context even when you don’t need to, removing any performance gains you might expect from nesting interrupts. Its pros are simplicity of coding and reduced code size.

> 5) ISR routines are typically written in C these days.
>   Example: Demo/ARM7_LPC2106_GCC/serial/serialISR.c
>   They often call other C functions in the standard way.

Take a look at the STR9 demo for the alternative method mentioned above (this does not do the mode switching).

>   To "nest" interrupts you:
>   (a) allow the IRQ controller to manage priorities
>   (b) you then just re-enable IRQs in the CPU early in the IRQ routine.

You must switch to user/system mode if you want to re-enable interrupts.

>   ie: Often you handle the critical stuff as fast as possible
>   then re-enable IRQs, and take care of the time consuming
>   part later, but as part of the same interrupt.

I would disagree. Without a kernel I might be forced to do it in the interrupt. With a kernel I would not do any time-consuming processing in an interrupt. Not to say one way is more correct than another.

> 8) In this example – say your C code was dividing
>   something, and was in the middle of the ‘divide’
>   helper compiler routine.

I would not recommend performing a division in an ISR, but am happy for others to disagree ;-)

Regards.
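The handler-task idea described in this reply can be sketched on a host machine as follows. This is only an illustration under assumptions: the ring buffer, `rx_queue_t`, `rx_isr_push` and `handler_task_drain` are hypothetical names, and a real port would use the kernel's queue or semaphore calls intended for ISRs rather than this hand-rolled buffer. The point is the division of labour: the "ISR" only stores the byte, and the "task" does the slow work later.

```c
#include <stdint.h>
#include <stdbool.h>

#define RX_QUEUE_LEN 8u   /* hypothetical depth; must be a power of two */

typedef struct {
    uint8_t data[RX_QUEUE_LEN];
    volatile uint32_t head;   /* written only by the ISR side */
    volatile uint32_t tail;   /* written only by the handler task side */
} rx_queue_t;

/* ISR side: grab the byte, queue it, and return. No processing happens here. */
bool rx_isr_push(rx_queue_t *q, uint8_t byte)
{
    if (q->head - q->tail == RX_QUEUE_LEN) {
        return false;                         /* queue full: byte is dropped */
    }
    q->data[q->head & (RX_QUEUE_LEN - 1u)] = byte;
    q->head++;
    return true;
}

/* Task side: drain everything queued so far and do the slow work
   (here just a sum, standing in for real processing). */
uint32_t handler_task_drain(rx_queue_t *q)
{
    uint32_t sum = 0;
    while (q->tail != q->head) {
        sum += q->data[q->tail & (RX_QUEUE_LEN - 1u)];
        q->tail++;
    }
    return sum;
}
```

Because each index has a single writer, the ISR side never has to wait for the task side, which keeps the interrupt short in the way the reply describes.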

arm – nested irqs

Thanks for the quick answers.

>> The idea is to process interrupts at the task level – within handler tasks.
[snip]

Thanks, I had forgotten the approach where you push ‘long things’ to a task. Although that can create other issues (i.e. task restarts, context switches, etc. can become burdensome; I’ve been there).

>> Not to say one way is more correct than another.

Very true! And that can cause you to have many more tasks, and task-to-task communication issues... ugh. There are many ways to skin a cat, and that cat is tired of getting skinned.

--------------------------------

>> 8) (divide in an ISR)
> I would not recommend performing a division in an ISR

I’m sure you understood that to be an example. There are others, for example pulling data out of a FIFO and putting it somewhere else [i.e. the ATMEL AT91RM9200 has no DMA for the USB device], or being forced to memcpy() data because for some reason one cannot implement "zero-copy".

--------------------------------

Hm. This raises a few other questions.

Does each task get a fixed stack? Y/N?

Where I’m going with this is as follows:
   Say I need 15 tasks, and my worst-case task requires a deep stack.
   Does that mean: 15 tasks * sizeof(deepest_stack) = HUGE memory?

Some RTOSes give you a fixed pool of stacks, and tasks borrow them. [SMX does this, or did this the last time I used it years and years ago.]

Under SMX, you effectively implement tasks 2 ways: (a) like normal code [which cannot release its stack] or (b) much like a DUFF machine [i.e. like Adam Dunkels’ ProtoThreads]; thus, you have more available stacks.

I believe the SMX calls were:
   que_receive( que_ptr ): retained stack context and locals
   que_receive_stop( que_ptr ): you lose the stack during the stop.

Understand: if the "que" was not empty, the task would restart immediately.

The problem/downfall is: if no stack is available, the task cannot run.
Solution: more stacks, better stack free-up, or dedicate a stack.

Advantage of a dedicated stack: you never have to wait, and it can be as big or as small as you wish.

Disadvantage of ProtoThreads: the "STOP/WAIT" must occur in the original function, and not within a sub-function.

Advantage of SMX: this could happen deep in the bowels of various function calls. I had a tough time explaining how that worked to some people.

=================

So, for FreeRTOS:

Do the stacks come out of a pool and are they reusable?
And can you have stacks of different sizes?
And can you bind a stack to a task?

(Don’t answer, unless you want to so others can read this)
(I can get this myself by reading the code)

-Duane.

arm – nested irqs

> Hm. This raises a few other questions.
>
> Does each task get a fixed stack? Y/N?

Yes. The size of the stack is one of the parameters to the function that creates the task.

> Where I’m going with this is as follows:
>   Say I need 15 tasks, and my worst-case task requires a deep stack.
>   Does that mean: 15 tasks * sizeof(deepest_stack) = HUGE memory?

No. Each task can have a different size stack.

> Some RTOSes give you a fixed pool of stacks, and tasks borrow them.
> [SMX does this, or did this the last time I used it years and years ago]
>
> Under SMX, you effectively implement tasks 2 ways, (a) like normal code
> [which cannot release its stack] or (b) much like a DUFF machine [ie: Like
> Adam Dunkels’ ProtoThreads] thus, you have more available stacks.
>
> I believe the SMX calls were:
>   que_receive( que_ptr ): retained stack context and locals
>   que_receive_stop( que_ptr ): you lose the stack during the stop.
>
> understand: If the "que" was not empty, the task would restart immediately.
>
> The problem/downfall is: if no stack is available the task cannot run.
> Solution: more stacks, better stack free up, or dedicate a stack.
>
> Advantage of a dedicated stack: You never have to wait, and it can be
> as big or as small as you wish…

The stacks are dedicated in FreeRTOS.org.

> Disadvantage: ProtoThreads, the "STOP/WAIT" must occur in the original
> function, and not within a sub-function.

The co-routine functionality in FreeRTOS.org is similar to the protothread idea. It is intended mainly for 8/16-bit processors with little RAM. One stack is shared between all co-routines. It is possible to run co-routines in the idle task and in so doing mix co-routines and tasks.

> So, for FreeRTOS:
>
> Do the stacks come out of a pool and are they reusable?

Each task has a dedicated stack. It always uses the same one and it is not shared.

> And can you have stacks of different sizes?

Yes.

> And can you bind a stack to a task?

As above.

> (Don’t answer, unless you want to so others can read this)
> (I can get this myself by reading the code)

Yes ;-)

Regards.
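The stack arithmetic behind the "15 tasks" question can be worked through with a small sketch. The function names and the sizes used in the usage note are invented for illustration; units are whatever the port uses for stack depth (the thread does not say, so treat the numbers as plain counts).

```c
#include <stddef.h>

/* Total stack memory when every task gets its own, individually sized stack. */
size_t total_dedicated_stack(const size_t *depths, size_t task_count)
{
    size_t total = 0;
    for (size_t i = 0; i < task_count; i++) {
        total += depths[i];
    }
    return total;
}

/* Total if every task were forced to use the deepest stack
   (the "15 tasks * sizeof(deepest_stack)" worry). */
size_t total_uniform_stack(const size_t *depths, size_t task_count)
{
    size_t deepest = 0;
    for (size_t i = 0; i < task_count; i++) {
        if (depths[i] > deepest) {
            deepest = depths[i];
        }
    }
    return deepest * task_count;
}
```

For example, with one task needing 1024 units and fourteen needing 128 each, per-task sizing costs 1024 + 14 * 128 = 2816 units, against 15 * 1024 = 15360 if every task had to take the deepest stack.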

arm – nested irqs

nobody wrote:
> richardbarry wrote:
> > The idea is to process interrupts at the task level – within
> > handler tasks.
>
> Thanks – I had forgot the approach where you push ‘long things’ to a
> task.  Although – that can make other issues (ie: Task
> restarts/context-switch/etc can become burdensome, I’ve been there,

I can see your point that handling the "interrupt at the task level" can add a little more overhead, interfering with something such as received data that needs to be serviced often. Even though the ISR can wake the highest-priority task before returning, this portable software approach would never be as fast as a hardware-accelerated multilevel ISR. Still:

1. I think it’s mostly a comfort issue. I have been worried before, but have never actually needed it faster, because it worked just by having the ISR wake/run a task before even returning from the ISR.

2. Is it easy for FreeRTOS to support multiple levels without interfering with MCUs such as my Freescale targets, which don’t have multiple ISR levels? It may be my inexperience, but I think that it would be no harder to implement it yourself in just your FreeRTOS application than it would be to implement it in your app without FreeRTOS. I’ve done a bit of tracing through this FreeRTOS kernel and feel pretty good about its versatility/hackability.

arm – nested irqs

Hi, I am just in the process of porting FreeRTOS to the ARM9 AT91SAM9261 and stumbled across this very (to me) relevant thread.

There are some significant benefits to using the prioritised hardware interrupt controller, e.g. the AIC on the AT91 series.

Logically, anything you can do with nested prioritised hardware interrupts you can also do with nested prioritised tasks and unnested interrupts. However, it’s always going to perform worse (more latency, more jitter, less throughput). Now I would be the first person to say that if you are only using 10% of your CPU and your shortest hard deadline is 100ms then you might as well use the easy unnested interrupt solution. But if you are using 95% of your CPU and have deadlines that are just 5us away, you cannot afford to go that way.

Also, you have two classes of ISR: those that must synchronise with a task (sem/queue) and those that just need to run to service hardware but don’t synchronise with tasks (perhaps they just update shared memory, for example). Often ISRs in the latter category have the shortest deadlines.

Coarsely, you can assume that if your shortest hard deadline for reacting to an IRQ is <= 10 x cost(context switch) you should really be thinking about using prioritised nested interrupts rather than tasks.

As a compromise, if all your short deadlines can be serviced by non-synchronising ISRs, you could use the FIQ context to implement those. But there will be cases where FIQ is in use for other functions or where you must synchronise and so cannot use FIQ (no path from FIQ to FreeRTOS at present).

So let’s assume it’s worthwhile to support IRQ nesting for at least some of the community. What does it take to do it?

(1) The CPSR_irq.I bit is cleared as soon as possible after an IRQ is taken.
(2) The CPSR_swi.I bit is cleared as soon as possible after an SWI is taken, i.e. before it considers whether a new task should be scheduled.
(3) Non-synchronising ISRs can still participate in the nesting scheme and can save and restore only that minimum state that they will corrupt.
(4) Synchronising ISRs do not need to worry about re-entrancy of the sanctioned FreeRTOS ISR API calls.
(5) Use the available vectoring hardware to prioritise interrupts and generate the vector of the ISR to take.
(6) Make sure that the enabling of IRQ is in control of the ISR, not generic. ISRs sometimes have other stuff that must be handled first, and if you don’t enable it, you revert to the normal non-nested ISR configuration.

First off, you need the right hardware. The AT91SAM926x has an Advanced Interrupt Controller (AIC), which is also present on other AT91 ARM designs I think. I think the LPCxxxx parts have a VIC instead, which does much the same thing. If all you have is a single IRQ line (i.e. a raw ARM chip), then you need to write a s/w emulation of something like the AIC and run it as a preamble to the IRQ handler. I am not going to get into that here!

The challenges in supporting nested interrupts are multiple interrupted contexts and switching between them in a well-ordered way:

Task (mode User or System, R0-R12, SP_user, LR_user, CPSR_user): this is where the current active task has its state (we don’t care about anything but the active task, obviously).

taskYIELD (mode Supervisor, R0-R12, SP_svc, LR_svc, SPSR_svc and CPSR_svc): you jump onto this one when you use SWI to request a taskYIELD, and maybe this could be used for other SWI-accessed functions too one day.

IRQ (mode IRQ, R0-R12, SP_irq, LR_irq, SPSR_irq and CPSR_irq): you jump onto this one whenever an IRQ is taken.

Note: IRQ can interrupt either SVC or User/System modes or IRQ mode. When IRQ interrupts IRQ there is obviously only one SP_irq, LR_irq, SPSR_irq and CPSR_irq, so care needs to be taken to make sure these are shared properly.

A global variable xExceptionNesting will be initialised to zero.
Exception entry will be:

if(xExceptionNesting++ == 0)
{
  // Outermost exception.
  * push LR_exception, R0-R12, SP_user, LR_user, SPSR_exception (==CPSR_user) and ulCriticalNesting count onto the task stack.
  * save the resulting top of stack pointer to the current TCB.
  * clear the global xTaskWokenByException
}
else
{
  // Nested exception.
  * push LR_exception, SPSR_exception to exception stack.
  * push any of R0-R12 that the exception handler will clobber to the exception stack.
}

* run any ISR-specific code that must be run with interrupts off.

* clear CPSR_exception.I flag to enable nested exceptions.

* run any ISR-specific code that can be run with interrupts on.

// The global xTaskWokenByException gets updated by calls to *FromISR() automatically,
// and these are either rewritten to be re-entrant or to protect critical sections with
// disable/enable IRQ code.

* set CPSR_exception.I flag to disable nested exceptions.

if(--xExceptionNesting != 0)
{
  // Nested exception exit
  * pop any of R0-R12 that the exception handler saved from the exception stack.
  * pop SPSR_exception and LR_exception from the exception stack.
}
else
{
  // Outermost exception exit
  * examine the global xTaskWokenByException and if set, call vTaskSwitchContext to update the current TCB.
  * recover the top of stack pointer from the current TCB.
  * recover LR_exception, R0-R12, SP_user, LR_user, SPSR_exception (==CPSR_user) and ulCriticalNesting count from the task stack.
}

* return from exception (to task or outer exception) using appropriate manipulation of LR_exception.

Comments?

Robin Iddon
EDESIX Limited
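The bookkeeping in the proposal above can be modelled on a host machine to check the control flow. This is a sketch only: the variable names `xExceptionNesting` and `xTaskWokenByException` come from the post, but `nesting_stats_t`, the counters, and the function names are hypothetical, and no registers are actually saved. The counters simply record which branch was taken, and a counter stands in for the call to vTaskSwitchContext.

```c
#include <stdbool.h>

/* Names taken from the post; everything else is illustrative. */
static int  xExceptionNesting = 0;
static bool xTaskWokenByException = false;

typedef struct {
    int full_saves;        /* full task context pushed to the task stack */
    int partial_saves;     /* LR/SPSR (+ clobbered regs) pushed to the exception stack */
    int context_switches;  /* stand-in for calls to vTaskSwitchContext */
} nesting_stats_t;

static nesting_stats_t stats;

void exception_enter(void)
{
    if (xExceptionNesting++ == 0) {
        /* Outermost exception: full context save, clear the wake flag. */
        stats.full_saves++;
        xTaskWokenByException = false;
    } else {
        /* Nested exception: minimal save only. */
        stats.partial_saves++;
    }
}

/* Called from an ISR body that has unblocked a task. */
void exception_wake_task(void)
{
    xTaskWokenByException = true;
}

void exception_exit(void)
{
    if (--xExceptionNesting != 0) {
        /* Nested exit: restore the minimal state and return to the outer ISR. */
    } else if (xTaskWokenByException) {
        /* Outermost exit: only here may a different task be selected. */
        stats.context_switches++;
    }
}

nesting_stats_t exception_stats(void) { return stats; }
int exception_depth(void) { return xExceptionNesting; }
```

In this model a task woken by a nested ISR does not cause a switch until the outermost exception exits, which is the ordering the pseudocode describes.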

arm – nested irqs

I should have pointed out that in the case of the SWI taskYIELD path, xTaskWokenByException will be set to true initially, as the whole point of this SWI is to cause a vTaskSwitchContext call. Robin

arm – nested irqs

Some comments, related but not directly. The FIQ interrupt can be used if you want fast interrupts, so long as you don’t call API functions. You may need to change the critical nesting macros so they only disable IRQ, not FIQ. The M3 port in the download has nesting of interrupts. It would be very good if the conclusion of this thread were an example that is included in the source code for download. It is only the ARM ports that have the complexity, because of the multiple stacks and the method of returning.
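The "disable IRQ, not FIQ" suggestion comes down to which CPSR bits a critical section sets. The sketch below works on a plain integer standing in for the CPSR, so it runs on a host; the function names are hypothetical and are not the port's actual macros. The bit positions (I = bit 7, F = bit 6) are the standard ARM ones.

```c
#include <stdint.h>

#define CPSR_I_BIT 0x80u   /* IRQ disable bit (bit 7) */
#define CPSR_F_BIT 0x40u   /* FIQ disable bit (bit 6) */

/* Critical section entry that masks both IRQ and FIQ. */
uint32_t cpsr_mask_irq_and_fiq(uint32_t cpsr)
{
    return cpsr | CPSR_I_BIT | CPSR_F_BIT;
}

/* Critical section entry that masks IRQ only, leaving FIQ outside kernel control. */
uint32_t cpsr_mask_irq_only(uint32_t cpsr)
{
    return cpsr | CPSR_I_BIT;
}

/* Matching exit: re-enable IRQ without touching the FIQ bit. */
uint32_t cpsr_unmask_irq_only(uint32_t cpsr)
{
    return cpsr & ~CPSR_I_BIT;
}
```

With the IRQ-only variant an FIQ handler can still run during a kernel critical section, which is why such a handler must not call kernel API functions.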

arm – nested irqs

Yes, I think a config option is called for to say whether FIQ should be left outside of RTOS control. IMHO, FIQ is mostly there for servicing stuff that really doesn’t want to be managed by the RTOS and should just be left out of it altogether, but others will want it included, so make it optional.

I plan on submitting all the work back to the project, hopefully for inclusion! In any case I will make whatever changes I make available, along with some demo that exploits them.

Robin

arm – nested irqs

> So let’s assume it’s worthwhile to support IRQ nesting for at least some of
> the community, what does it take to do it?
>
> (1) The CPSR_irq.I bit is cleared as soon as possible after an IRQ is taken.
> (2) The CPSR_swi.I bit is cleared as soon as possible after an SWI is taken,
> i.e. before it considers whether a new task should be scheduled.
> (3) Non-synchronising ISRs can still participate in the nesting scheme and can
> save and restore only that minimum state that they will corrupt.
> (4) Synchronising ISRs do not need to worry about re-entrancy of the sanctioned
> FreeRTOS ISR API calls.
> (5) Use the available vectoring hardware to prioritise interrupts and generate
> the vector of the ISR to take.
> (6) Make sure that the enabling of IRQ is in control of the ISR, not generic
> – ISRs sometimes have other stuff that must be handled first, and if you don’t
> enable it, you revert to normal non-nested ISR configuration.

These are the optimal requirements.

> First off, you need the right hardware – the AT91SAM926x has a Advanced Interrupt
> Controller (AIC), which is also present on other AT91 ARM designs I think.
> I think the LPCxxxx parts have a VIC instead which does much the same thing.
> If all you have is a single IRQ line (i.e. a raw ARM chip), then you need to
> write a s/w emulation of something like the AIC and run it as a preamble to
> the IRQ handler – I am not going to get into that here!
>
> The challenges in supporting nested interrupts are multiple interrupted contexts
> and switching between them in a well ordered way:
>
> Task (mode User or System, R0-R12, SP_user, LR_user, CPSR_user) – this is where
> the current active task has its state (we don’t care about anything but the
> active task, obviously).
>
> taskYIELD (mode Supervisor, R0-R12, SP_svc, LR_svc, SPSR_svc and CPSR_svc) –
> you jump onto this one when you use SWI to request a taskYIELD and maybe this
> could be used for other SWI accessed functions too one day.
>
> IRQ (mode IRQ, R0-R12, SP_irq, LR_irq, SPSR_irq and CPSR_irq) – you jump this
> one whenever an IRQ is taken.
>
> Note: IRQ can interrupt either SVC or User/System modes or IRQ modes.  When
> IRQ interrupts IRQ there is obviously only one SP_irq, LR_irq, SPSR_irq and
> CPSR_irq, so care needs to be taken to make sure these are shared properly.
>
> A global variable xExceptionNesting will be initialised to zero.

With you so far.

> Exception entry will be:
>
> if(xExceptionNesting++ == 0)
> {

[At least a partial context save would be required to be able to perform the test without corrupting the context?]

>  // Outer most exception.
> * push LR_exception, R0-R12, SP_user, LR_user, SPSR_exception (==CPSR_user)
> and ulCriticalNesting count onto the task stack.
> * save the resulting top of stack pointer to the current TCB.
> * clear the global xTaskWokenByException

[So the entire context is only saved for the first nesting depth, correct? This is more efficient than the methods I have used in the past, where the entire context is saved each time.]

> }
> else
> {
> // Nested exception.
> * push LR_exception, SPSR_exception to exception stack.
> * push any of R0-R12 that the exception handler will clobber to the exception
> stack.
> }
>
> * run any ISR-specific code that must be run with interrupts off.
>
> * clear CPSR_exception.I flag to enable nested exceptions.

[No switch back into system mode.]

> * runs any ISR-specific code that can be run with interrupts on.
>
> // The global xTaskWokenByException gets updated by calls to *FromISR()
> automatically,
> // and these are either rewritten to be re-entrant or to protect critical sections
> with
> // disable/enable IRQ code.
>
> * set CPSR_exception.I flag to disable nested exceptions.
>
> if(--xExceptionNesting != 0)
> {
>  // Nested exception exit
> * pop any of R0-R12 that the exception handler saved from the exception stack.
> * pop SPSR_exception and LR_exception from the exception stack.
> }
> else
> {
>  // Outer most exception exit
> * examine the global xTaskWokenByException and if set, call vTaskSwitchContext
> to update the current TCB.
> * recover the top of stack pointer from the current TCB.
> * recover LR_exception, R0-R12, SP_user, LR_user, SPSR_exception (==CPSR_user)
> and ulCriticalNesting count from the task stack.
> }
>
> * return from exception (to task or outer exception) using appropriate manipulation
> of LR_exception.
>
> Comments?

Your approach then is a little different: saving the banked registers to the stack rather than switching back to user/system mode. Would you intend this code to be executed per interrupt (as per the LPC2000 demo, where each ISR calls portSAVE/RESTORE_CONTEXT()), or have a single entry point, as per the STR9 demo? The latter might be simpler, as you would not have to worry about how the function that implements the ISR returns; it can be a simple function call.

Regards.

arm – nested irqs

See http://www.nxp.com/acrobat_download/applicationnotes/AN10381_1.pdf Regards.

arm – nested irqs

Before re-inventing the wheel, here are some resources relevant to this discussion:

1. The NXP (formerly Philips) application note “Nesting of Interrupts in the LPC2000” (http://www.nxp.com/acrobat_download/applicationnotes/AN10381_1.pdf) shows how to write ISRs that are capable of nesting in C with some admixture of inline assembly. This approach works for IRQs only. Problems might arise if you allow both FIQ and IRQ to preempt each other.

2. The ARM Technical Support Note “Writing Interrupt Handlers” shows how to write IRQs that are capable of nesting. The ARM approach shows an assembler “wrapper” that calls the ISR in C. Again, this particular implementation works only for IRQ and can be problematic for FIQ preempting IRQ.

3. The Micrium Application Note AN1011 (http://www.micrium.com/arm) describes the uC/OS-II port to the ARM. The most interesting part of this AN is the assembler part of the port, which provides both IRQ and FIQ that can nest. It also provides quite a generic context switch for ARM.

4. The Quantum Leaps Application Note “QP and ARM” (http://www.quantum-leaps.com/downloads/qdk.htm#ARM) shows how to implement a fully preemptive single-stack kernel on the ARM. Such a kernel is much smaller and faster than any traditional multiple-stack kernel, such as uC/OS-II or FreeRTOS. For event-driven systems (i.e., most embedded systems) the limitations of the single-stack kernel don’t matter. Please see the ESD article “Build a Super Simple Tasker” (http://www.embedded.com/showArticle.jhtml?articleID=190302110) for more information.

Miro