STM32F4 with FPU

Posted by thomask77 on October 16, 2011

I just got my Discovery board and would like to try out the FPU. Has anyone written a port yet? Or is there a time estimate for when it will be officially supported?

I just had a quick look at the architecture manual. It seems that FreeRTOS would have to store the entire state of the FPU, adding at least 32 x 4 bytes (are all 32 FPU registers in use by compilers? That seems an awful lot!). Perhaps I'll give it a try myself.

RE: STM32F4 with FPU

Posted by Richard on October 16, 2011
A lot of thought and work has already gone into supporting the Cortex-M4F, but support is not yet officially available. Note that if you have the FPU turned off then the standard Cortex-M3 port will work fine, but having the FPU turned on is much more complex than you might imagine.

The easy option, if you wish to do it yourself, is to set the FPU-related registers to save and restore the FPU context automatically on each interrupt. This is horrendously inefficient with the VFP architecture of the M4F, especially when you consider that only a few tasks will ever use the FPU. Only half the context can be saved automatically, so the other half has to be done manually.

Another option is to allow tasks to register themselves as FPU context users, then manually save the FPU context for just those tasks. That is a little more efficient, but will still result in FPU contexts being saved unnecessarily sometimes.
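
The registration approach can be modelled on a host machine. The sketch below is not FreeRTOS code: `task_t`, `task_register_fpu_user`, `switch_fpu_context` and the simulated register bank `fpu_regs` are hypothetical names, and the 33-word context (s0-s31 plus FPSCR) is an assumption about what such a port would save.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed context size: s0-s31 plus FPSCR. */
#define FPU_CONTEXT_WORDS 33u

/* Hypothetical per-task record; a real port would keep this in or near the TCB. */
typedef struct {
    bool     uses_fpu;
    uint32_t fpu_context[FPU_CONTEXT_WORDS];
} task_t;

/* Stand-in for the hardware FPU register bank so the model runs on a host. */
static uint32_t fpu_regs[FPU_CONTEXT_WORDS];

/* A task calls this once to opt in to FPU context saving. */
void task_register_fpu_user(task_t *task)
{
    assert(task != NULL);
    task->uses_fpu = true;
}

/* Save the outgoing task's FPU context and restore the incoming task's,
 * but only for tasks that registered. Returns the number of words copied. */
unsigned switch_fpu_context(task_t *outgoing, const task_t *incoming)
{
    unsigned words = 0;

    assert(outgoing != NULL && incoming != NULL);
    if (outgoing->uses_fpu) {
        memcpy(outgoing->fpu_context, fpu_regs, sizeof fpu_regs);
        words += FPU_CONTEXT_WORDS;
    }
    if (incoming->uses_fpu) {
        memcpy(fpu_regs, incoming->fpu_context, sizeof fpu_regs);
        words += FPU_CONTEXT_WORDS;
    }
    return words;
}
```

The model also shows the remaining inefficiency: once a task is registered, its context is copied on every switch, whether or not it has touched the FPU since it last ran.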

Another extreme is to attempt to use the lazy save mechanism of the FPU (note lazy save is turned on by default). If you do that, then you have an extremely complex problem to implement, and if interrupts use the FPU too (they might if they are doing something like motor control) then there are a dozen corner cases to take care of once interrupts start nesting that are near impossible to test.

Yet another option is to perform a software lazy save.

Etc. Etc.

Also a word of warning - take extreme care to set up your compiler such that it does not randomly use FPU registers as temporary registers in tasks that are not themselves using the FPU. Some do that, unless special non default command line options are used.

Have fun.


RE: STM32F4 with FPU

Posted by thomask77 on November 25, 2011
Hi again!

I think I got my port up and running. Please find it here:


Before I started, I did some performance measurements. As you said, the time for a full FPU state save/restore is quite long. A pair of vpush {s0-s31}/vpop {s0-s31} takes around 400ns on my STM32F407 @ 168MHz.

On the other hand, that translates to just ~68 cycles, which is not that bad at all if you consider the overall performance gain of the FPU vs. software emulation.
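
The conversion is plain arithmetic: 400 ns x 168 MHz = 67.2 cycles, which rounds up to the figure quoted. A small helper (the name `ns_to_cycles` is made up for this sketch) makes it easy to repeat for the other timings in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a duration in nanoseconds to CPU cycles, rounding up to a
 * whole cycle. 64-bit arithmetic avoids overflow in ns * core_hz. */
uint64_t ns_to_cycles(uint64_t ns, uint64_t core_hz)
{
    const uint64_t ns_per_s = 1000000000ull;

    assert(core_hz > 0);
    return (ns * core_hz + ns_per_s - 1) / ns_per_s;
}
```

With a 168 MHz core clock, `ns_to_cycles(400, 168000000)` gives 68, and a 200 ns penalty comes to 34 cycles.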

Still, I don't want to have the performance hit for things like serial-port or motor-control interrupts. So I'll leave the hardware lazy-save mode enabled.

Without an OS switching tasks, the CPU will just do the right thing anyways:

The AAPCS says that s0-s15 are used as scratch registers, so they're automatically (lazy)-saved on exception entry. s16-s31 are saved by the compiler. There is a performance hit of ~200ns for entry/exit if the lazy save is actually triggered. For interrupts without FPU instructions there is no additional overhead.

The only time when all registers must be saved and restored is for a task switch. This will take about 400ns longer than without FPU.
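
To put numbers on the register sets involved, the sketch below tallies the words in each group. The frame sizes are those the ARMv7-M architecture defines for the basic and extended exception frames; the enum names and the helper `task_switch_words` are illustrative only and do not come from the port.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    BASIC_FRAME_WORDS    = 8,   /* r0-r3, r12, lr, pc, xPSR: stacked by hardware */
    FP_CALLER_WORDS      = 18,  /* s0-s15, FPSCR, one reserved word: stacked by hardware */
    EXTENDED_FRAME_WORDS = BASIC_FRAME_WORDS + FP_CALLER_WORDS,
    CORE_CALLEE_WORDS    = 8,   /* r4-r11: saved by the context-switch code */
    FP_CALLEE_WORDS      = 16   /* s16-s31: saved by the context-switch code */
};

/* Number of 32-bit words that make up a task's saved context. */
unsigned task_switch_words(bool task_uses_fpu)
{
    unsigned words = BASIC_FRAME_WORDS + CORE_CALLEE_WORDS;

    if (task_uses_fpu) {
        words = EXTENDED_FRAME_WORDS + CORE_CALLEE_WORDS + FP_CALLEE_WORDS;
    }
    /* An even word count keeps an 8-byte-aligned stack 8-byte aligned. */
    assert(words % 2 == 0);
    return words;
}
```

Under these assumptions a task without FPU state occupies 16 words (64 bytes) of stack when switched out, and a task with FPU state occupies 50 words (200 bytes).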

I added the extended stack frame registers to pxPortInitialiseStack, vPortSVCHandler and xPortPendSVHandler. Additionally, vPortSVCHandler marks the stack frame as an extended frame (Bit 4, LR/EXC_RETURN value).

I must warn that the code is _not_ yet fully tested! Use at your own risk!

Have fun,
Thomas Kindler

RE: STM32F4 with FPU

Posted by thomask77 on November 30, 2011

In the meantime, I've improved my port. Actually, it was simpler than I thought. Compared to the normal Cortex-M3 port, very few additions were required.


Here's the README:

This is the second version of my FreeRTOS port for ARM Cortex M4 cores with FPU support.

It does now support both FPU and non-FPU tasks, and tries to only save the necessary registers.

To achieve this, the EXC_RETURN value of a task (held in the LR register during exceptions, in particular in the PendSV handler) is saved on its stack. The FPU registers are saved or restored only if bit 4 of the EXC_RETURN value indicates an extended stack frame.

See the ARM architecture manual, B1-653 for more details.

If a task uses the FPU, it will automatically set the CONTROL.FPCA bit. No special user interaction or task registration is required.

This port is also fully compatible with the FPU lazy-save feature (which is enabled by default).
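
The bit test at the heart of this scheme can be written in C for illustration. In a real port it has to be done in assembly inside the PendSV handler, so the sketch below is only a host-side model. The macro and function names are invented here. The encoding itself is the ARMv7-M one: bit 4 of EXC_RETURN is 1 for a basic frame and 0 for an extended (floating-point) frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EXC_RETURN bit 4: set = basic frame, clear = extended (FP) frame. */
#define EXC_RETURN_BASIC_FRAME_BIT     (1u << 4)

/* "Return to Thread mode, use the process stack", without and with FP state. */
#define EXC_RETURN_THREAD_PSP_BASIC    0xFFFFFFFDu
#define EXC_RETURN_THREAD_PSP_EXTENDED 0xFFFFFFEDu

/* True if the interrupted task has an extended frame, meaning s16-s31
 * must be saved and restored along with r4-r11 on a context switch. */
bool exc_return_has_fp_frame(uint32_t exc_return)
{
    /* Every valid EXC_RETURN value has its upper 24 bits set. */
    assert((exc_return & 0xFFFFFF00u) == 0xFFFFFF00u);
    return (exc_return & EXC_RETURN_BASIC_FRAME_BIT) == 0u;
}
```

Because each task's own EXC_RETURN value is stored with its context, a task that has never executed an FPU instruction keeps returning with 0xFFFFFFFD and never pays for the extra registers.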

Have fun,
Thomas Kindler

RE: STM32F4 with FPU

Posted by Gregor Lengeling on December 15, 2011
Hi, and thanks for your work, Thomas!
When I try to use your port.c in an Eclipse/YAGARTO environment, I run into problems:
'Building file: ../FreeRTOS/portable/port.c'
'Invoking: ARM Yagarto Windows GCC C Compiler'
arm-none-eabi-gcc -DUSE_STDPERIPH_DRIVER -DUSE_STM32F4_DISCOVERY -DSTM32F4XX -I"E:\INDIGO\YAG-FreeRTOS-123\FreeRTOS\include" -I"E:\INDIGO\YAG-FreeRTOS-123\Libraries\STM32F4xx_StdPeriph_Driver\src" -I"E:\INDIGO\YAG-FreeRTOS-123\FreeRTOS\portable" -I"E:\INDIGO\YAG-FreeRTOS-123\Libraries\CMSIS\Include" -I"E:\INDIGO\YAG-FreeRTOS-123\Libraries\Device\STM32F4xx\Include" -I"E:\INDIGO\YAG-FreeRTOS-123\Libraries\STM32F4xx_StdPeriph_Driver\inc" -I"E:\INDIGO\YAG-FreeRTOS-123\src" -I"E:\INDIGO\YAG-FreeRTOS-123\Utilities" -O0 -Wall -Wa,-adhlns="FreeRTOS/portable/port.o.lst" -c -fmessage-length=0 -MMD -MP -MF"FreeRTOS/portable/port.d" -MT"FreeRTOS/portable/port.d" -mcpu=cortex-m4 -mthumb -g3 -gdwarf-2 -o "FreeRTOS/portable/port.o" "../FreeRTOS/portable/port.c"
C:\Users\GL\AppData\Local\Temp\cczJmjjA.s: Assembler messages:
C:\Users\GL\AppData\Local\Temp\cczJmjjA.s:389: Error: selected processor does not support Thumb mode `vstmdbeq r0!,{s16-s31}'
C:\Users\GL\AppData\Local\Temp\cczJmjjA.s:390: Error: instruction not allowed in IT block -- `stmdb r0!,{r14}'
C:\Users\GL\AppData\Local\Temp\cczJmjjA.s:406: Error: selected processor does not support Thumb mode `vldmiaeq r0!,{s16-s31}'
C:\Users\GL\AppData\Local\Temp\cczJmjjA.s:407: Error: instruction not allowed in IT block -- `ldmia r0!,{r4-r11}'
make: *** [FreeRTOS/portable/port.o] Error 1

What compiler version are you using? Which options do you pass to avoid problems like that?


RE: STM32F4 with FPU

Posted by Richard on December 15, 2011
Note that FreeRTOS V7.1.0 has two basic Cortex-M4F ports now, one for IAR and one for Keil. GCC is the next on the hit list.

The errors seem to be telling you that GCC is not expecting floating point instructions to be present. I have not tried using GCC with an M4F yet, but looking at your command line, and your output I would suggest that either you need to define the CPU as Cortex-M4F rather than just Cortex-M4 (not all Cortex-M4s have a floating point unit), or that you need to manually tell GCC that a hardware floating point unit is being used via a separate command line option.

That assumes the version of GCC you are using supports an M4F, of course.


RE: STM32F4 with FPU

Posted by thomask77 on December 21, 2011

I'm using the CodeSourcery toolchain with the following options:

-mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=softfp

Keep in mind that the FPU is single precision only. So you should use sqrtf() instead of sqrt() to prevent double precision emulation calls.

You should also try


To treat float literals as single precision. Otherwise, a term like x = x * 0.123 will call a double precision library function (or write 0.123f, which I find quite awkward).
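
The promotion rule behind this advice can be checked with `sizeof`, which reports the type of an expression without evaluating it. The helper name `check_float_literal_promotion` is made up for this sketch, and it assumes `float` and `double` have different sizes, as they do on the Cortex-M4F and on common desktop targets.

```c
#include <assert.h>
#include <math.h>

/* Demonstrates which expressions are computed in double precision.
 * sizeof never evaluates its operand, so no maths is actually performed
 * and no maths library call is emitted. */
void check_float_literal_promotion(void)
{
    float x = 2.0f;
    (void)x; /* silence "unused" warnings when asserts are compiled out */

    /* An unsuffixed literal is a double, so x is promoted to double. */
    assert(sizeof(x * 0.123) == sizeof(double));

    /* With the f suffix the whole expression stays in single precision. */
    assert(sizeof(x * 0.123f) == sizeof(float));

    /* sqrt() takes and returns double; sqrtf() is the float variant. */
    assert(sizeof(sqrt(x)) == sizeof(double));
    assert(sizeof(sqrtf(x)) == sizeof(float));
}
```

On a single-precision FPU, every expression of the first kind ends up in software emulation, so it is worth checking hot code paths for stray unsuffixed literals.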

have fun!

RE: STM32F4 with FPU

Posted by cd334 on January 3, 2012

I have tried out your Cortex-M4F port ver 0.2. with the STM32F4-Discovery board.
I use Mentor CodeSourcery Lite GCC compiler (2011.09-69-arm-none-eabi).

I can compile your code, with the compiler flags:
-mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=softfp

But I have a problem with the xTaskCreate function.
My program hangs in this function.

My program slice:

portBASE_TYPE task_create_LED;

task_create_LED = xTaskCreate( prvLEDTask, ( signed char * ) "Led", configMINIMAL_STACK_SIZE, NULL, mainLED_TASK_PRIORITY, NULL );
if( task_create_LED == pdPASS )
    printf( " LED Task Created!\r\n" );
else
    printf( " LED Task Create FAILED! Err. Code: %ld!\r\n", ( long ) task_create_LED );


If I look deeper with an SWD debugger, the program hangs in the xTaskGenericCreate function in tasks.c at this line:
/* Check the alignment of the initialised stack. */
portALIGNMENT_ASSERT_pxCurrentTCB( ( ( ( unsigned long ) pxNewTCB->pxTopOfStack & ( unsigned long ) portBYTE_ALIGNMENT_MASK ) == 0UL ) );

Can you look deeper into your code? With the official Cortex-M3 port, without the FPU, it works well.

Best Regards!

RE: STM32F4 with FPU

Posted by cd334 on January 3, 2012

I forgot to add:
If I can help (further setup details, makefile, code or anything else important), please write to me.
I will send you my details.

Thank you!

Best Regards!

RE: STM32F4 with FPU

Posted by Richard on January 4, 2012
I know the port you are using is not the official FreeRTOS port, but I think if you update to the FreeRTOS V7.1.0 code (and use the same contributed port layer as you are now), then you might find the problem doesn't exist.

To know if there really is a problem, set a break point on entry to a task (before the task function prologue assembly code manipulates the stack pointer to create a stack frame for the task function), then check to see if the stack pointer is 8 byte aligned.
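
The check described here, like the alignment assert in xTaskGenericCreate quoted earlier in the thread, boils down to a mask test on the stack pointer value. A minimal sketch follows; the helper names `is_aligned` and `align_down` are hypothetical, and the alignment is assumed to be a power of two, so that for an alignment of 8 the mask is 0x7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if address is a multiple of alignment (a power of two). */
bool is_aligned(uintptr_t address, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

/* Round address down to the previous multiple of alignment. Rounding
 * down is the right direction for a stack that grows towards lower
 * addresses, as Cortex-M stacks do. */
uintptr_t align_down(uintptr_t address, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return address & ~(alignment - 1);
}
```

For example, a stack pointer of 0x20001004 passes a 4-byte check but fails an 8-byte one, and `align_down(0x20001004, 8)` yields 0x20001000.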


RE: STM32F4 with FPU

Posted by cd334 on January 4, 2012

Thank you for your help!

I know that it is an unofficial Cortex-M4F port.
I am waiting for the official Cortex-M4F GCC port. When will you release it? :)
I had some free time, so I thought I would try out the FPU with FreeRTOS.
I use the latest version of FreeRTOS, V7.1.0.

I have checked what you said, and yes, the stack pointer is not 8 byte aligned.
When I set, in the new portmacro.h,
#define portBYTE_ALIGNMENT 4

the unofficial port works with the FPU.

What is the significance of the alignment setting? What happens if I leave it at 4? Or must it be 8 on the STM32F407?

Best Regards!

RE: STM32F4 with FPU

Posted by Richard on January 4, 2012
> What is the significance of the alignment setting? What happens if I leave it at 4? Or must it be 8 on the STM32F407?

You probably won't notice any problems with it at four until you use 64 bit numbers, or use a library function that makes assumptions about how 64 bit numbers are stored. The most common symptom is getting an incorrect value for a printf() with a floating point modifier.


RE: STM32F4 with FPU

Posted by thomask77 on January 19, 2012

I just uploaded a (really) minimal demo project for my port:


have fun,
Thomas Kindler

RE: STM32F4 with FPU

Posted by Sasha Zbrozek on January 25, 2012
Is there any word on when/if this unofficial port will be made official? Or if there will be an official port sometime in the nearish future?


RE: STM32F4 with FPU

Posted by johnDS on January 31, 2012
If I run with a single task, the assert fails at the configASSERT line in the vTaskSwitchContext excerpt below. More specifically, if I remove the comments from the task create below, everything works.

void DebugUART::Start()
{
    // Init the Debug UART then start the task.
    // xTaskCreate( vDebugUARTOutputTask, (signed char *) "DebugUART", configMINIMAL_STACK_SIZE,
    //              NULL, mainDEBUG_UART_TASK_PRIORITY, &hDebugOutputTask );
}

My solution was simply to run with two tasks. I don't know if this is a bug or something I am doing wrong. By the way. Thank you for the port. The STM32F4 series seems very nice in many respects. It is nice to have a FreeRTOS port for it.

void vTaskSwitchContext( void )
while( listLIST_IS_EMPTY( &( pxReadyTasksLists[ uxTopReadyPriority ] ) ) )
configASSERT( uxTopReadyPriority );

/* listGET_OWNER_OF_NEXT_ENTRY walks through the list, so the tasks of the
same priority get an equal share of the processor time. */
listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopReadyPriority ] ) );


