memcpy NEON support for Cyclone V based architectures built with Altera Quartus 16.1

Hi,

are there any plans for FreeRTOS to support a NEON-optimized version of memcpy? Or is this working already and I am just not using it correctly?

When building with certain compiler flags for the above architecture, memcpy() uses NEON internally. While FreeRTOS is running, memcpy() can be interrupted, both by interrupts and by task switches, which results in the data held in certain registers being lost or overwritten.

With -mfpu=neon, memcpy() uses 4*32 byte NEON operations (VLD and VST):

    VLD1.8 {d0,d1,d2,d3}, [r1]!
    VLD1.8 {d4,d5,d6,d7}, [r1]!
    VST1.8 {d0,d1,d2,d3}, [r0@128]!
    VLD1.8 {d0,d1,d2,d3}, [r1]!
    VST1.8 {d4,d5,d6,d7}, [r0@128]!
    VLD1.8 {d4,d5,d6,d7}, [r1]!
    VST1.8 {d0,d1,d2,d3}, [r0@128]!
    VST1.8 {d4,d5,d6,d7}, [r0@128]!

However, none of the registers d0 to d7 seem to be saved on a task switch…

I am not entirely sure if I explained this correctly, so please have mercy if I didn’t. I can provide additional information.

Regards,
Stefan
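The failure mode described in the question can be illustrated with a small toy model in plain C. This is only a sketch, not FreeRTOS code: the types `ToyCpu` and `ToyTaskContext` and the function `toy_copy_survives_preemption()` are hypothetical names made up for the illustration. It assumes a copy that loads 32 bytes into d0-d3, is preempted by another task that also uses d0-d3, and then stores the registers to the destination.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy register file: NOT real hardware access, just arrays. */
typedef struct {
    uint32_t core[4];   /* stands in for r0-r3 */
    uint64_t neon[4];   /* stands in for d0-d3 */
} ToyCpu;

/* Toy per-task saved context (hypothetical, for illustration only). */
typedef struct {
    uint32_t core[4];
    uint64_t neon[4];
    bool     has_neon_context;  /* does the switch save d0-d3? */
} ToyTaskContext;

static void toy_save(const ToyCpu *cpu, ToyTaskContext *ctx)
{
    memcpy(ctx->core, cpu->core, sizeof ctx->core);
    if (ctx->has_neon_context) {
        memcpy(ctx->neon, cpu->neon, sizeof ctx->neon);
    }
}

static void toy_restore(ToyCpu *cpu, const ToyTaskContext *ctx)
{
    memcpy(cpu->core, ctx->core, sizeof cpu->core);
    if (ctx->has_neon_context) {
        memcpy(cpu->neon, ctx->neon, sizeof cpu->neon);
    }
}

/* Returns true if a 32-byte copy that is preempted between its
 * "VLD" and "VST" steps still delivers the correct data. */
bool toy_copy_survives_preemption(bool save_neon)
{
    ToyCpu cpu = {0};
    ToyTaskContext copier = { .has_neon_context = save_neon };
    const uint64_t src[4] = { 1, 2, 3, 4 };
    uint64_t dst[4] = { 0 };

    /* "VLD1.8 {d0-d3}, [r1]!": load the source into d0-d3. */
    memcpy(cpu.neon, src, sizeof src);

    /* Task switch away from the copying task. */
    toy_save(&cpu, &copier);

    /* Another task runs its own memcpy() and overwrites d0-d3. */
    for (int i = 0; i < 4; i++) {
        cpu.neon[i] = 0xDEADBEEFu;
    }

    /* Task switch back to the copying task. */
    toy_restore(&cpu, &copier);

    /* "VST1.8 {d0-d3}, [r0]!": store whatever is in d0-d3 now. */
    memcpy(dst, cpu.neon, sizeof dst);

    return memcmp(src, dst, sizeof src) == 0;
}
```

In this model, `toy_copy_survives_preemption(false)` reports a corrupted copy, while `toy_copy_survives_preemption(true)` reports an intact one, which is the difference between a context switch that ignores d0-d7 and one that saves them.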

Hi Stefan, I’m not sure if this recent topic is useful?

are there any plans for FreeRTOS to support a NEON optimized version of memcpy?
It already does. I’m not sure if this was in FreeRTOS V9.0.0, but it’s definitely in V9.0.1 (which is only tagged in SVN, rather than provided as a .zip file). In that version you have the choice to waste CPU cycles and RAM by giving every task a floating point context, and likewise for every nested interrupt. That overhead outweighs any benefit obtained by using the NEON registers in calls to memcpy().
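As a rough sketch of what "giving every task a floating point context" could look like in a project's FreeRTOSConfig.h: the fragment below assumes the `configUSE_TASK_FPU_SUPPORT` setting and the `vPortTaskUsesFPU()` function as they appear in later releases of the GCC Cortex-A9 port. Whether V9.0.1 uses exactly these names is an assumption, so please check the port.c and portmacro.h of the version you are building.

```c
/* FreeRTOSConfig.h (fragment).
 *
 * ASSUMPTION: names taken from later GCC/ARM_CA9 port sources;
 * verify against the port files of your FreeRTOS version.
 *
 * 1: only tasks that call vPortTaskUsesFPU() get a floating point
 *    (VFP/NEON) register context saved and restored.
 * 2: every task gets a floating point context, at the cost of extra
 *    stack RAM and extra cycles on each context switch.
 */
#define configUSE_TASK_FPU_SUPPORT    2
```

With a per-task setting instead, only the tasks that actually call memcpy() built with -mfpu=neon would need to opt in, which keeps the overhead away from tasks that never touch d0-d7.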

Thanks! However, our Cortex-A9 processor has an FPU as well as a NEON unit. Is FreeRTOS 9.0.1 aware of which unit is used and which registers have to be saved, or is only one of these units supported, and if so, which one?

Which registers are used by one but not the other?

Sorry! I guess you are correct! There are no such registers used by one unit only… We will try the latest FreeRTOS version then! Thanks a lot!!!

Hi! I’m also experimenting with FreeRTOS on Altera’s SOCFPGA platform. One problem with the toolchain is the NEON optimization in memcpy(), as mentioned. The other problem is that it doesn’t support -mfloat-abi=hard. I therefore built my own GCC toolchain with Crosstool-NG. You might be interested in trying it out: https://github.com/thomask77/ct-ng-toolchains

Best regards,
thomas