Memory barriers in FreeRTOS

Posted by orifai01 on February 22, 2017

Hello, are there equivalents of the Linux wmb() and rmb() functions in FreeRTOS?

Thank you

Posted by richarddamon on February 22, 2017

FreeRTOS is built on the assumption of a single-processor model (so that disabling interrupts provides proper mutual exclusion for system structures), and so doesn't have a need at the software level for such a primitive.

There still might be a need at the hardware level, but its operation will be hardware dependent and OS agnostic, so it will be up to the compiler to provide such a primitive when it is needed.

For many of the processors that FreeRTOS targets, there really is no need for such a thing, as they don't have a cache sophisticated enough to delay writes and thereby require a barrier.

Posted by rtel on February 22, 2017

In addition to Mr Damon's reply: FreeRTOS does use memory barriers internally, where necessary (for example after writing to hardware registers to enter sleep mode, etc.), but that is all.

Posted by davidbrown on February 22, 2017

The rmb() and wmb() functions (macros, actually) in Linux are essential in single processor systems too. For simple enough systems, they are defined as:

asm volatile("" ::: "memory");

What this means is that any reads or writes of memory will not be moved across this memory barrier, nor will the results of such reads or writes be cached across the barrier. You typically don't need such barriers in normal C, but if you need control of when data is read or written with respect to other accesses (interrupts, pre-emptive scheduled threads, DMA, etc.) then such barriers are a simple and cheap way to enforce ordering. It is an alternative to making all the memory accesses volatile.
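
As a minimal sketch of the "not cached across the barrier" point (GCC/Clang syntax assumed; wait_for_flag and its parameters are hypothetical names, not part of FreeRTOS or Linux), a bounded polling loop can use the barrier to force the flag to be re-read from memory on every pass instead of being held in a register:

```c
#include <stdbool.h>

/* Compiler-only barrier (GCC/Clang syntax), same form as quoted above. */
#define barrier() __asm__ volatile ("" ::: "memory")

/*
 * Hypothetical helper: poll *flag until it becomes non-zero or the spin
 * budget runs out. The flag is deliberately NOT volatile; the barrier in
 * the loop body is what stops the compiler from hoisting the load out of
 * the loop and testing a stale register copy.
 */
bool wait_for_flag(const int *flag, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++) {
        if (*flag != 0) {
            return true;
        }
        barrier(); /* forget any cached value of *flag before the next pass */
    }
    return false;
}
```

This only constrains the compiler. On hardware that can itself reorder or buffer accesses, a hardware barrier instruction would be needed as well.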

I expect that many FreeRTOS functions include memory barriers already.

Posted by richarddamon on February 23, 2017

Your case for the single processor isn't an OS issue, but a compiler optimization issue. An RTOS will not change the order of execution of actions within a task/thread.

There is NOTHING for the OS to do about this in this case.

Posted by davidbrown on February 23, 2017

That is true, but it is still a reasonable question to ask - since other OSes provide such macros. The OS needs such compiler barriers in its implementation, and already has to have some sort of abstraction for them since the details vary according to compiler - it is perfectly natural to think that the same functionality could be exposed to the user.

Actually, having had a quick look at the port.c and port.h code for critical sections in the ARM_CM4F port (as an example), I can see that the code does not have the correct barriers where it needs them. It relies on using vPortEnterCritical and vPortExitCritical as functions, which will work as a compiler memory barrier as long as the compiler cannot see those function definitions when calling them. If the code is compiled with link-time optimisation, that changes - and the compiler can move memory accesses around calls to vPortEnterCritical and vPortExitCritical.

Posted by rtel on February 23, 2017

"Actually, having had a quick look at the port.c and port.h code for critical sections in the ARM_CM4F port (as an example), I can see that the code does not have the correct barriers where it needs them."

If you think there is an error somewhere, please be specific as to where, so it can be discussed, and corrected if necessary.

The code used to mask interrupts in the M4 port is shown below. It includes memory barriers. Please let me know your thoughts on why this is incorrect:

portFORCE_INLINE static void vPortRaiseBASEPRI( void )
{
uint32_t ulNewBASEPRI;

   __asm volatile
   (
     "  mov %0, %1            \n" \
     "  msr basepri, %0       \n" \
     "  isb                   \n" \
     "  dsb                   \n" \
     :"=r" (ulNewBASEPRI) : "i" ( configMAX_SYSCALL_INTERRUPT_PRIORITY )
   );
}

Posted by davidbrown on February 27, 2017


It is possible that this has drifted a bit from the original poster's question - if you would prefer to discuss this by ordinary email, that's fine by me. It might also be worth asking on the gcc-help mailing list to get the opinions of the actual compiler developers here.

My concern here rests on three points:

  1. In C, sequence points and other ordering are only "as if" - the generated code must act "as if" it followed the C rules of sequencing, with respect to "observable behaviour". "Observable behaviour" includes volatile memory accesses, file I/O, other I/O, and program start/stop. Calling a function whose source and implementation are not known when compiling a file effectively acts as "observable behaviour" because the compiler does not know if the function has any /real/ observable behaviour. And some compiler extensions, such as "volatile" inline assembly in gcc, are also considered "observable behaviour".

  2. Ordering of other aspects of the language is not required to follow the ordering of the abstract machine. In particular, any reads or writes to memory can be re-arranged with respect to volatile memory accesses - it is only the volatile accesses that are ordered.

  3. When using more advanced compilers and optimisations, the compiler can use knowledge of called functions to re-arrange code. In particular, it can use link-time optimisation to see across different translation units.

The compiler does not understand the contents of assembly statements. It uses the output, input, and clobber sections, along with the "volatile" keyword, to learn about them. When you have inline assembly such as the given vPortRaiseBASEPRI function, the "volatile" forces the assembly to be ordered with respect to other volatile accesses. (The input and output sections can also enforce certain ordering.) But it does /not/ order the code with respect to ordinary memory accesses or other code.
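
A minimal sketch of that distinction, assuming GCC/Clang inline assembly syntax (the macro and function names are hypothetical, and the empty asm strings stand in for real instructions so the example builds on any target): a volatile flag or a clobber-free volatile asm does not pin down the ordinary store to message, while the "memory" clobber does.

```c
#include <stdint.h>

/* Ordered only against other volatile accesses and volatile asm. */
#define ASM_NO_CLOBBER()  __asm__ volatile ("")
/* Additionally a compiler barrier for ordinary memory accesses. */
#define ASM_MEM_CLOBBER() __asm__ volatile ("" ::: "memory")

static uint32_t message;            /* ordinary, non-volatile data */
static volatile int message_ready;  /* volatile flag */

void post_message(uint32_t value)
{
    message = value;
    /* With ASM_NO_CLOBBER() here, the compiler would still be allowed to
     * sink the store to `message` below the volatile flag write. */
    ASM_MEM_CLOBBER();
    message_ready = 1;
}

int take_message(uint32_t *out)
{
    if (!message_ready) {
        return 0;
    }
    ASM_MEM_CLOBBER();  /* do not read `message` before checking the flag */
    *out = message;
    ASM_MEM_CLOBBER();  /* finish reading before releasing the slot */
    message_ready = 0;
    return 1;
}
```

ASM_NO_CLOBBER() is defined only for comparison; swapping it in for ASM_MEM_CLOBBER() is the situation the paragraph above describes.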

A user may have code like this:

uint64_t x;
uint64_t y;

taskENTER_CRITICAL();
// Nothing else can interrupt this,
// so the 64-bit accesses will be atomic
x = y;
taskEXIT_CRITICAL();

This is a natural interpretation of entering or exiting a critical code region. But the code does /not/ guarantee it. We can simplify taskENTER_CRITICAL and taskEXIT_CRITICAL as:

asm volatile...
uxCriticalNesting++;

uxCriticalNesting--;
if (uxCriticalNesting == 0) {
    asm volatile...
}

From the definitions of the volatile assembly, the /only/ thing the compiler sees as important for order is the ordering with respect to other volatile accesses, and of course it cannot change the order of instructions within the asm volatile statements themselves.

So the user code is actually:

asm volatile... // enter
uxCriticalNesting++;
x = y;
uxCriticalNesting--;
if (uxCriticalNesting == 0) {
    asm volatile... // exit
}

The compiler can re-arrange /any/ of this with respect to the volatile assembly. It can start by incrementing uxCriticalNesting, then decrement it (it can also omit this entirely). Then it might read the low half of y and write it to the low half of x. Perhaps then it executes code like:

if (uxCriticalNesting) {
    asm volatile... // enter
} else {
    asm volatile... // enter
    asm volatile... // exit
}

And then it will copy the top half of y into the top half of x.

I think the user would be rather surprised to see this happen - but it is all legal for the compiler.

It is even worse for things like taskYIELD - the user would expect normal writes before a taskYIELD to be completed before a task switch!

Now, it is fair to say that the FreeRTOS code has worked fine so far - I doubt if you have seen real cases of this kind of re-ordering. There are two things that save you here - one is that the compiler cannot normally see "inside" function calls like taskYIELD when they are defined in a different file, and the other is that compilers don't re-arrange code awkwardly just for fun - they only do it if there is a performance gain to be had. But with link-time optimisation making large pieces of code look like a giant inlined function, and processors with lots of registers to keep local data around instead of writing it out to memory - you /will/ see problems. And like all subtle problems in multi-tasking systems, they are going to be seriously unpleasant to find because code will work as expected in most cases.

If you believe me that this is potentially a real problem, even on single cpu systems, then thankfully the fix is extremely simple (in gcc and clang at least - I can't answer for other compilers). Just add a "memory" to the clobber list of your inline assembly, or add explicit barriers:

#define barrier() asm volatile ("" ::: "memory")

The memory clobber tells gcc that this assembly may read or write memory in ways that are not described by the input and output parts of the asm statement, so the compiler must not keep memory values cached in registers or move memory accesses across it.

Be generous about adding such barrier() statements in your code. It will keep things a lot safer, and mean that common users' assumptions about things like critical sections will be correct.
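
As a rough sketch of what that advice looks like for a nesting critical section (this is not the FreeRTOS port code: the function names are hypothetical, and a plain volatile variable stands in for the BASEPRI write so the example builds on any target with GCC or Clang):

```c
#include <stdint.h>

#define barrier() __asm__ volatile ("" ::: "memory")

/* Stand-in for the hardware interrupt mask; a real port would use
 * inline assembly here instead of a variable. */
static volatile int interrupts_masked;
static unsigned critical_nesting;

static inline void enter_critical_sketch(void)
{
    interrupts_masked = 1;
    barrier();          /* protected accesses may not move above the mask */
    critical_nesting++;
}

static inline void exit_critical_sketch(void)
{
    barrier();          /* protected accesses may not move below this point */
    critical_nesting--;
    if (critical_nesting == 0) {
        interrupts_masked = 0;
    }
}

/* The 64-bit copy from the earlier example, now pinned inside the section
 * even if both helpers are inlined or seen through link-time optimisation. */
void copy_u64_in_critical_section(uint64_t *dst, const uint64_t *src)
{
    enter_critical_sketch();
    *dst = *src;
    exit_critical_sketch();
}
```

Because the barriers are part of the enter and exit code itself, the ordering no longer depends on the compiler being unable to see inside a function call.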

(C11 gives other ways to make such fences or enforce such ordering, but that won't help the FreeRTOS code much!)
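
For comparison, a small sketch of the C11 route using <stdatomic.h>. atomic_signal_fence is specified for ordering between a thread and a signal handler running in that same thread, so it constrains the compiler without emitting a hardware barrier. Treating a single-core interrupt handler as the analogous case is an assumption here, and the producer/consumer names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

static uint32_t payload;   /* ordinary data handed from producer to consumer */
static atomic_int ready;   /* zero-initialised flag */

void producer(uint32_t value)
{
    payload = value;
    /* Compiler-only fence: the payload store stays before the flag store. */
    atomic_signal_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

int consumer(uint32_t *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed)) {
        return 0;
    }
    /* Compiler-only fence: the payload load stays after the flag load. */
    atomic_signal_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

Where real hardware reordering is possible (multiple cores, for instance), atomic_thread_fence with the same memory orders would be the corresponding C11 call.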



Posted by rtel on February 27, 2017

Hi David - thanks for taking the time to provide this analysis. There is a bit much there for a quick response but I am digesting the info.

Posted by glenenglish on April 18, 2017

David, good stuff. Has there been any more discussion on this before I do a full code review?

I avoid -O3 like the plague, and only use it for spot functions. -O3 breaks my FreeRTOS devices; at the time I just guessed that an excessively aggressive compiler was generating WILDLY unexpected reordering, and stayed with -O2. I'll have to investigate more when I have time one day.... glen.

Posted by rtel on April 19, 2017

The head revision in SVN has added :::"memory" to many additional places.

Posted by davidbrown on April 19, 2017

I haven't had a look at the code for a while, but if my post about memory barriers was helpful, then I'm happy.

In theory, barring compiler bugs (which should be rare, but do exist), correct code will not be broken (except possibly timing or space requirements) by changing optimisation options. In practice, I have seen it a great many times. Sometimes this is simply due to programmers not understanding C properly. But sometimes it is due to the limitations of C in expressing the needs of the programmer. And sometimes it is due to the type of code in question being very difficult to get right.

In the case of FreeRTOS, I guess it is a combination of the second two reasons. You can't write an RTOS in pure C - it needs some assembly, and it needs implementation-dependent behaviour and compiler extensions. It is particularly difficult to write code that is as portable as possible in such circumstances. But a generous helping of memory barriers can certainly help!

One thing to remember about optimisation levels for compilers is that they are not commands - they are hints. A compiler is free to use any valid optimisation technique regardless of the command line settings. So if your code works with -O2 and not -O3, then perhaps with the next version of your toolchain it will also break with -O2 or -O1.

Posted by glenenglish on April 19, 2017

Hi David. Yeah, I think if something doesn't work under -O3 but does under -O2, it's a case of "you got away with it once". I.e. strictly speaking, and under critical analysis, you were sloppy. But only very strictly. OK on the commands/hints; I was not aware of that behaviour. But I find tracking down -O3 related optimizer side effects to be quite difficult.

Posted by davidbrown on April 19, 2017

Yes, figuring out what went wrong when you changed optimisation levels can be very difficult. And with more advanced optimisation, there can be a bit of luck involved - seemingly irrelevant changes can lead to the compiler picking a different balance for when a function is inlined or code sequencing is re-arranged, and suddenly there is a difference in what works and what does not work. The kind of things that cause such problems are often quite subtle.

Sometimes it is possible to track the issue by manually enabling or disabling different optimisation flags. With gcc, the -O flags mostly control groups of individual optimisation passes, which can be enabled or disabled somewhat independently. (I am being deliberately vague here, because reality is not quite as simple as this suggests.) A bit of trial and error can give you clues as to what might have failed.
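
One way to run that kind of trial and error on a single function, without touching the flags for the whole build, is GCC's per-function optimize attribute. The sketch below assumes GCC (the attribute is compiled out elsewhere); OPT_LEVEL and checksum_suspect are hypothetical names, and the function body is only a placeholder for whatever code is under suspicion.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC-specific; other compilers get an empty macro. */
#if defined(__GNUC__) && !defined(__clang__)
#define OPT_LEVEL(level) __attribute__((optimize(level)))
#else
#define OPT_LEVEL(level)
#endif

/* Build just this function at -O1 while the rest of the file keeps the
 * command-line optimisation level. */
OPT_LEVEL("O1")
uint32_t checksum_suspect(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
```

The GCC documentation describes this attribute as a debugging aid rather than something for production code, which fits the use here: if the failure disappears when one function is built at a lower level, that function and its callers are a good place to look for a missing barrier or similar ordering assumption.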

Of course, when the error is an accidental race condition that only turns up once in a blue moon - or when the customer is demonstrating the system in front of his boss - you probably just want to disable the higher optimisations for now, and add a "fixme" comment for the future :-)
