[moved]Why are queues so much faster from ISR
Posted by Richard
on May 25, 2011
[This post is an edited version of the original - only the timings have been removed, the rest is intact. It is not permitted, under the terms of the FreeRTOS license, to publish timing figures. This is because everybody will take timings in a different way, and mean different things by "the time between x and y happening". Additionally, extremely often (like most of the time), people who are not familiar with the workings of the kernel, or the kernel source code, will not take optimal timing values. All these things together result in large amounts of contradictory and confusing data. Also, in the past, results that are clearly incorrect have been used as misinformation FUD by commercial vendors. I am a strong believer in taking timings yourself if it is that important, and never believing what anybody else has published about their system. The clause about not publishing performance data is common in the industry, and not just a FreeRTOS condition. Thank you for your understanding. I am happy to respond to your questions below. Of course, the way to get the fastest system is simply to remove all the functionality!]
I'm currently evaluating the timing within FreeRTOS using an oscilloscope and the IO-pins of my AT91SAM7X (running at 48 MHz).
Assume there are several tasks, each waiting to receive from its queue. The object passed through the queues is a struct with three int-entries.
Within an ISR calling xQueueSendToBackFromISR() followed by portYIELD_FROM_ISR() takes about nn µs. Another nn µs later the task waiting for that queue is running and processing the data.
But when using xQueueSendToBack() from a normal task just this call already takes nn µs. Until the blocked task is running there are even another nn µs.
So the time needed to pass information and context from one place to another amounts to nn µs when the source is an ISR and nn µs when two normal tasks are involved. I'm not complaining (other RTOS here in the office need a multiple of CLK to achieve such times) but am curious about the why. Driven, of course, by the question: "Why not use the FromISR()-Call also within my normal task when gaining so much speed with that?"
The pre-emptive functions aren't of much importance for my project since everything happens within a fraction of a tick. So are there any downsides about using the faster ISR-functions everywhere?
I was able to gain nn µs by calling portYIELD() after writing to the queue, without any obvious disadvantages. I always thought the non-ISR functions do a necessary context switch automatically, in contrast to the ISR functions, which instead offer the pxHigherPriorityTaskWoken. So why do things get faster this way?
Lots of questions, thanks in advance for your replies.
RE: [moved]Why are queues so much faster from ISR
Posted by Richard
on May 25, 2011
> Hi, I'm currently evaluating the timing within FreeRTOS using an
> oscilloscope and the IO-pins of my AT91SAM7X (running at 48 MHz).
> Assuming there are several tasks, each waiting to receive from its
> queue. The Object passed through the queues is a struct with three
> int-entries. Within an ISR calling xQueueSendToBackFromISR() followed
> by portYIELD_FROM_ISR() takes about nn µs. Another nn µs later the task
> waiting for that queue is running and processing the data. But when
> using xQueueSendToBack() from a normal task just this call already
> takes nn µs. Until the blocked task is running there are even another
> nn µs.
> So the time needed to pass information and context from one place to
> another amounts to nn µs when the source is an ISR and nn µs when two
> normal tasks are involved. I'm not complaining (other RTOS here in
> the office need a multiple of CLK to achieve such times) but am
> curious about the why.
Looking at the source code will give you the answer. Basically, sending from an ISR is much leaner code because it has less functionality. It has less functionality because you cannot block on a queue from an ISR. Less code + less functionality = faster execution. Also, you want code executed in an ISR to always be fast.
Using a queue from a task means the queue has to do much, much, more.
One of the things an RTOS should always do is queue tasks that are blocked on the same queue in their priority order. That requires code and execution time. That code and execution overhead is not required when the queue is used from an interrupt.
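As a rough illustration of the priority ordering described above, here is a minimal sketch in plain C. It is a stand-alone model, not the kernel's actual event-list code: the names `waiter_t`, `waiter_insert` and `waiter_pop` are hypothetical, and the rule "larger number means higher priority, equal priorities are served first come, first served" is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a task waiting on a queue (not a kernel type). */
typedef struct waiter {
    int priority;            /* assumption: larger value = higher priority */
    struct waiter *next;
} waiter_t;

/* Insert w so the list stays sorted, highest priority first.
 * Waiters of equal priority keep their arrival order. */
static void waiter_insert(waiter_t **head, waiter_t *w)
{
    assert(head != NULL && w != NULL);
    waiter_t **link = head;
    /* Walk past every waiter that must be served before w. */
    while (*link != NULL && (*link)->priority >= w->priority) {
        link = &(*link)->next;
    }
    w->next = *link;
    *link = w;
}

/* Remove and return the waiter that should be unblocked next, or NULL. */
static waiter_t *waiter_pop(waiter_t **head)
{
    assert(head != NULL);
    waiter_t *first = *head;
    if (first != NULL) {
        *head = first->next;
        first->next = NULL;
    }
    return first;
}
```

The walk through the list on every insert is the kind of extra work a blocking call has to pay for, and a call that never blocks can skip.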
Also, consider the case of a task being blocked on a queue to wait for data to arrive (it wants to read from a queue, but the queue is empty, so opts to block to wait for the queue not to be empty). If something posts data to the queue, then the task will be unblocked, if it is the highest priority task that was waiting for the data. However, it is possible that, before the unblocked task actually executes (other tasks and interrupts might execute first), the data that was posted to the queue is removed by something else. When that happens, when the unblocked task actually does execute, it will find the queue empty again. It would be wrong for it to return from the queue read function without data if its block time had not expired. Therefore, what it does is re-calculate its block period to take into account the time it had already spent in the blocked state previously, then re-enter the blocked state to wait for another item to be sent to the queue or the remainder of its block period to expire. All this takes code and time, but code and time that is not required when the queue is used from an interrupt.
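The block period re-calculation can be sketched as follows. This is a simplified stand-alone model, not the kernel source: `tick_t` and `block_time_expired` are hypothetical names, and a free-running 32-bit tick counter is assumed so that unsigned subtraction stays correct across counter wrap-around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tick_t;   /* hypothetical stand-in for the kernel tick type */

/* Called when a task wakes up and finds the queue empty again.
 * entered_at: tick count when the task last entered the blocked state
 * now:        current tick count
 * remaining:  in/out, ticks the task is still allowed to wait
 * Returns true if the block period is used up (give up without data),
 * false if the task should re-enter the blocked state for *remaining ticks. */
static bool block_time_expired(tick_t entered_at, tick_t now, tick_t *remaining)
{
    assert(remaining != NULL);
    tick_t elapsed = now - entered_at;   /* wrap-safe for unsigned ticks */
    if (elapsed >= *remaining) {
        *remaining = 0;
        return true;
    }
    *remaining -= elapsed;               /* account for time already spent blocked */
    return false;
}
```

For example, a task that asked to wait 100 ticks, blocked at tick 10 and woke at tick 40 to an empty queue would block again for the remaining 70 ticks.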
If other RTOSes don't exhibit this same behaviour, then their end behaviour will be incorrect. So, FreeRTOS could make its queues faster, by effectively introducing subtle bugs. [the same holds true for sending to a blocked queue].
Probably more detail than was required, but demonstrates the point I think (if you can follow my long winded explanation).
> Driven, of course, by the question: "Why not
> using the FromISR()-Call also within my normal task when gaining so
> much speed with that?"
It is completely legitimate to use "FromISR" queue and semaphore calls from tasks, if the application does not require the additional functionality that is described above. The FromISR versions will not allow the task to block if a sending task finds the queue full, or a receiving task finds the queue empty, for example.
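To make the consequence concrete, here is a small stand-alone model of a queue that never blocks, which is the behaviour a caller has to cope with when it gives up the blocking functionality. It is only a sketch under stated assumptions: `model_queue_t`, `model_try_send`, `model_try_receive` and the queue length of 2 are hypothetical and are not part of the FreeRTOS API. The message type mirrors the struct with three int entries from the original post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int a, b, c; } msg_t;   /* three int entries, as in the question */

#define MODEL_QUEUE_LEN 2u               /* assumption: tiny queue for illustration */

typedef struct {
    msg_t  items[MODEL_QUEUE_LEN];
    size_t head;    /* index of the oldest item */
    size_t count;   /* number of items currently stored */
} model_queue_t;

/* Send to back without ever blocking: returns false if the queue is full. */
static bool model_try_send(model_queue_t *q, const msg_t *m)
{
    assert(q != NULL && m != NULL);
    if (q->count == MODEL_QUEUE_LEN) {
        return false;                    /* caller must handle the full queue itself */
    }
    q->items[(q->head + q->count) % MODEL_QUEUE_LEN] = *m;
    q->count++;
    return true;
}

/* Receive from front without ever blocking: returns false if the queue is empty. */
static bool model_try_receive(model_queue_t *q, msg_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return false;                    /* caller must poll again or do other work */
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % MODEL_QUEUE_LEN;
    q->count--;
    return true;
}
```

Every call site has to check the return value, because a full or empty queue is reported immediately instead of being waited out.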
There are actually three ways of sending and receiving from a queue. You have identified two already. The third speeds up the sending and receiving from a task by introducing longer critical sections. I am deprecating that method however, so it is not recommended for new designs. It just shows another example of a compromise, longer critical sections for faster execution speed.
> The pre-emptive functions aren't of much importance for my project
> since everything happens within a fraction of a tick. So are there
> any downsides about using the faster ISR-functions everywhere?
Choose the best function for your application. There is not one answer that is correct for every occasion. Just be aware of the differences between the functions, and select the best fit.
RE: [moved]Why are queues so much faster from ISR
Posted by sven-de
on May 26, 2011
I apologize for violating the terms of the license; that was premature, and your reasoning about timings is very sound. Even bigger thanks, though, for keeping the post and going into so much detail when replying to it.
Like every beginner, I need to take new findings apart and apply them to what I already know. The two scenarios you offer might be irrelevant in many applications (mine being one of them).
The first scenario relies on tasks being blocked when trying to write to an already full queue. So this feature isn't needed when setting xTicksToWait = 0 anyway.
The other scenario eliminates possible pitfalls of what's considered bad design anyway: more than one task receiving from the same queue. (still I can very well understand why you as OS-designer must take this into account)
Since neither of these two scenarios is relevant for my application, I'm going to give the FromISR() functions a try.
Having understood your above explanation, I still wonder why a call to portYIELD() speeds things up. When the queue functions called from a task have to do so much more, why do things get quicker when adding even more work through another call? I can only assume that beyond the pure queue functionality something else happens which is somehow cut short by calling portYIELD(). Or, asked differently, why is there no immediate context switch within xQueueSendToBack() when the scheduler knows about another task waiting for this very queue? Or isn't that what's intended to happen?