Posted by daveskok on May 18, 2016

Greetings, I have been looking closely at the FreeRTOS+TCP (160112) source and examples related to implementing zero copy support. I have also read the following threads related to providing network buffers and alignment.

https://sourceforge.net/p/freertos/discussion/382005/thread/5591a1a9/?limit=25#c87d/59a7/e3fd
https://sourceforge.net/p/freertos/discussion/382005/thread/0480e081/?limit=25#0ffa/067a
https://sourceforge.net/p/freertos/discussion/382005/thread/cc8075ef/?limit=25#c143

It is clear to me why the padding is used in the buffer; I do that kind of thing in my own code often. What I am having a hard time understanding is the implementation for a case where the Ethernet hardware requires buffers with 32-bit alignment. First assume that the buffers are sized max packet + ipBUFFER_PADDING and are indeed aligned on 32-bit boundaries. In this case I reason that the value of ipconfigPACKET_FILLER_SIZE would need to be 0 or possibly 4. Assuming that this assertion holds, the value of ipBUFFER_PADDING would be a whole multiple of 32 bits. The result is that the buffer (base addr + ipBUFFER_PADDING) that the hardware uses is on a 32-bit boundary and there is still space in front of the hardware buffer for FreeRTOS+TCP to use. If I understand correctly this will satisfy hardware requirements but the consequence is that FreeRTOS+TCP will be somewhat less efficient when it "cracks" the packets because fields are more favorably aligned when ipconfigPACKET_FILLER_SIZE is 2. Is this right?



Posted by heinbali01 on May 19, 2016

> If I understand correctly this will satisfy hardware requirements but the consequence is that FreeRTOS+TCP will be somewhat less efficient when it "cracks" the packets because fields are more favourably aligned when ipconfigPACKET_FILLER_SIZE is 2. Is this right?

You see this perfectly right!

There is a conflict between the requirements of the IP-stack and the DMA buffers.

But fortunately, many makers of EMAC peripherals know about this problem and created a way to get around this: a flag that says:

"ignore the first 2 bytes of my TX data"

and there may be another flag saying:

"insert 2 dummy bytes before the RX packet"

What hardware are you using? The above configuration flags may also be available.

A summary of the ethernet buffer:

~~~~
Invisible 10 bytes, 32-bit aligned, at pucEthernetBuffer - 10:

Offs  Contents

      /* Pointer to the owner of this array. */
   0  NetworkBufferDescriptor_t *pxBackPointer;
   4  uint32_t ulSpare;
      /* Filler to get a 32-bit alignment PLUS 2. */
   8  uint16_t usFiller;

Here start the visible data, at pucEthernetBuffer + 0:

14-byte Ethernet header:

  10  uint8_t ucDestination[ 6 ];
  16  uint8_t ucSource[ 6 ];
  22  uint16_t usFrameType;

IP-header, 32-bit aligned:

  24  uint8_t ucVersionHeaderLength;
      ...
      /* 32-bit fields: */
  36  uint32_t ulSourceIPAddress;
  40  uint32_t ulDestinationIPAddress;
~~~~


Some hardware is able to access 32-bit variables at 16-bit aligned locations. I saw a CPU that can do this with internal SRAM only.

In cases where the compiler knows in advance that a variable is badly aligned, such as here:

struct xUnaligned {
    uint8_t ucChar;
    uint32_t ulLong;
} __attribute__( ( packed ) );

the compiler may get around the problem and access 'ulLong' as an array of 4 bytes.
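To make the packed case concrete, here is a minimal sketch that checks where 'ulLong' ends up and reads it back through the struct. It assumes a GCC- or Clang-style compiler that accepts `__attribute__( ( packed ) )`; the helper names `ulReadPackedLong()` and `uxLongOffset()` are hypothetical and not part of FreeRTOS+TCP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct above: packed removes the 3 padding bytes
 * that would normally follow ucChar. */
struct xUnaligned
{
    uint8_t ucChar;
    uint32_t ulLong;
} __attribute__( ( packed ) );

/* Hypothetical helper: read ulLong through the packed struct type, so the
 * compiler knows the field may be misaligned and can pick a safe access
 * sequence (for example four byte loads) on CPUs that need it. */
static uint32_t ulReadPackedLong( const struct xUnaligned *pxItem )
{
    assert( pxItem != NULL );
    return pxItem->ulLong;
}

/* Hypothetical helper: byte offset of ulLong inside the packed struct. */
static size_t uxLongOffset( void )
{
    return offsetof( struct xUnaligned, ulLong );
}
```

With packing in effect, `uxLongOffset()` is 1 and `sizeof( struct xUnaligned )` is 5, so 'ulLong' can never sit on a 32-bit boundary when the struct itself is 32-bit aligned. Note that this protection only holds while the field is accessed through the packed type; taking a plain `uint32_t *` to it loses the information.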

For FreeRTOS+TCP it was decided to give all network packets a perfect alignment so that all 32-bit fields will be accessed with 32-bit instructions. It is the 14-byte Ethernet header that spoils the party.
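The arithmetic behind that trade-off can be sketched as follows. This is only an illustration: it assumes a 32-bit target where the hidden pointer and spare word take 8 bytes, as in the summary above, and the `example...` macros and helper functions are hypothetical names, not FreeRTOS+TCP definitions.

```c
#include <stdint.h>

/* Assumed sizes, taken from the buffer summary above (32-bit pointers). */
#define exampleHIDDEN_WORDS_SIZE       8u   /* pxBackPointer + ulSpare */
#define exampleETHERNET_HEADER_SIZE    14u  /* destination + source + frame type */

/* Offset of the Ethernet frame (pucEthernetBuffer) from the 32-bit aligned
 * start of the allocation, for a given filler size. */
static uint32_t ulFrameOffset( uint32_t ulFillerSize )
{
    return exampleHIDDEN_WORDS_SIZE + ulFillerSize;
}

/* Offset of the IP header from the same aligned start. */
static uint32_t ulIPHeaderOffset( uint32_t ulFillerSize )
{
    return ulFrameOffset( ulFillerSize ) + exampleETHERNET_HEADER_SIZE;
}

/* Non-zero when an offset from an aligned base is itself 32-bit aligned. */
static int xIsWordAligned( uint32_t ulOffset )
{
    return ( ulOffset % 4u ) == 0u;
}
```

With a filler of 2 the frame starts at offset 10 (not word aligned, which is what a DMA engine may object to) and the IP header at offset 24 (word aligned). With a filler of 0 the frame starts at offset 8 (word aligned) but the IP header lands at offset 22, so its 32-bit address fields are only 16-bit aligned.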

PS At higher levels (such as DNS, LLMNR, DHCP and NBNS), no assumptions can be made about the alignment of 32-bit fields and memcpy() is used.
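As an illustration of the memcpy() approach, here is a minimal sketch of reading a 32-bit field from an arbitrary byte offset. The function name `ulReadUnaligned32()` is hypothetical; it is not taken from the FreeRTOS+TCP source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a 32-bit field out of a byte buffer at any
 * offset. memcpy() makes no alignment assumption, so this is safe even when
 * the field starts on an odd address. The bytes are copied as they lie in
 * memory; no network-to-host byte order conversion is done here. */
static uint32_t ulReadUnaligned32( const uint8_t *pucBuffer, size_t uxOffset )
{
    uint32_t ulValue;

    memcpy( &ulValue, &pucBuffer[ uxOffset ], sizeof( ulValue ) );
    return ulValue;
}
```

A caller would pass the start of the payload and the offset of the field, for example `ulReadUnaligned32( pucPayload, 1 )`, and apply any byte order conversion afterwards.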



Posted by daveskok on May 19, 2016

Hein, thank you very much for clarifying this! I am investigating for the purpose of porting a Microsemi A2F200 from LWIP to FreeRTOS+TCP. The LWIP implementation currently uses copy. If we spend the time to port to FreeRTOS+TCP it appears that switching to zero copy is a simple matter, but I am not confident that the Microsemi hardware provides the trick you mention. I did see a comment in the zero copy driver example Zynq/emacpsif_dma.c(629) that indicates a hardware setting to accommodate the shift, and until now I was not certain why an unaligned transfer was being strived for.

If the hardware is not capable of unaligned reception of packets, do you find that using zero copy is still a benefit? That is, does the time saved by not copying buffers, minus the time added by unaligned packet cracking, still come out ahead?

Microsemi A2F200 is Cortex-M3 married to FPGA. Feature set is exotic and alluring. Once chosen and used in project a world of pain ensues. No user forum, frustrating tools and factory support only (read no support). If you've never heard of Microsemi forget the name now. Reader be warned. Apologies for the off topic rant.

Thanks again!
