Failed to allocate heap

Hello there, I am using FreeRTOS V9.0.0. The code is added at project generation by the CubeMX software from ST for their STM32 targets. In the project I am working on, I have noticed random, infrequent resets of the device (every couple of hours, sometimes only after days). The device resets itself on purpose, after entering a critical section. Please consider this function, which is a malloc wrapper (I am using heap_4):

~~~
/**
 * @brief   FreeRTOS \ref pvPortMalloc wrapper with logging.
 * @param   size: amount of bytes to allocate on heap.
 * @return  non zero if allocation was successful.
 */
void* util_malloc(const size_t size)
{
    assert_param(size);

    // saved before malloc is called
    const size_t freeHeap = xPortGetFreeHeapSize();
    void* ptr = pvPortMalloc(size);
    const size_t freeHeapAfter = xPortGetFreeHeapSize();

    if (freeHeapAfter >= freeHeap)
    {
        log_PushLine(e_logLevel_Critical,
                "Failed to allocate heap. pre %u, after %u, needed %u",
                freeHeap, freeHeapAfter, size);
        // program counter won't reach here, since Critical log kills the thread
        ptr = NULL;
    }

    return ptr;
}
~~~

In those rare situations, the logs reveal such traces:

~~~
[2018-11-27T13:34:14.761Z][tUartRx ][001]: Failed to allocate heap. pre 2776, after 4160556831, needed 1
[2018-11-28T06:05:55.698Z][tUartRx ][001]: Failed to allocate heap. pre 2784, after 4160556839, needed 6
[2018-11-28T09:32:50.874Z][tUartRx ][001]: Failed to allocate heap. pre 2776, after 4160556831, needed 1
[2018-11-29T00:37:42.917Z][tNmMain ][005]: Failed to allocate heap. pre 2904, after 4160556959, needed 17
[2018-11-29T00:37:44.494Z][tUartRx ][001]: Failed to allocate heap. pre 4160561167, after 4160561167, needed 144
etc
~~~

I cannot understand where the free heap value after the allocation comes from. I would appreciate any help with debugging this issue.

Failed to allocate heap

You can see here: https://sourceforge.net/p/freertos/code/HEAD/tree/trunk/FreeRTOS/Source/portable/MemMang/heap_4.c#l311 that xPortGetFreeHeapSize() just returns a variable. If the allocation failed, then the variable will not have changed, so the before and after values should be the same – so I’m guessing the allocation didn’t fail but you have some strange integer promotion issue going on. Have a look at how to use the malloc failed hook here: https://www.freertos.org/a00016.html
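For reference, a minimal sketch of the malloc failed hook mentioned above. With `configUSE_MALLOC_FAILED_HOOK` set to 1 in FreeRTOSConfig.h, `pvPortMalloc()` calls `vApplicationMallocFailedHook()` whenever it is about to return NULL. So that the example builds on a PC, `host_pvPortMalloc()` and its `limit` parameter are hypothetical stand-ins that only mimic that behaviour; they are not FreeRTOS code, and the counter is there purely for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Counts failed allocations; on a target you would log or halt here. */
volatile unsigned long mallocFailedCount = 0;

/* Real FreeRTOS hook signature. Called by pvPortMalloc() on failure when
 * configUSE_MALLOC_FAILED_HOOK is 1. */
void vApplicationMallocFailedHook(void)
{
    mallocFailedCount++;
}

/* Hypothetical host stand-in (NOT FreeRTOS code): fails any request larger
 * than 'limit' and calls the hook before returning NULL. */
void *host_pvPortMalloc(size_t wanted, size_t limit)
{
    void *ptr = (wanted <= limit) ? malloc(wanted) : NULL;

    if (ptr == NULL)
    {
        vApplicationMallocFailedHook();
    }
    return ptr;
}
```

On the target, only the hook itself is needed; a breakpoint or a log call inside it shows every failed allocation without comparing free-heap values by hand.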

Failed to allocate heap

One thing that could trigger your assertion (but shouldn’t give the big (negative) number you are getting) is another task doing a free between your two calls to xPortGetFreeHeapSize(). The pvPortMalloc() call has protection from re-entrancy, but that won’t protect your wrapper. The big value may be a sign that something is doing a ‘wild write’ and corrupting the heap.
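To put a number on the big values, here is a quick host-side check of the logged figures (the helper names are made up for illustration). Because `size_t` is 32 bits on this target, the subtraction wraps modulo 2^32. In the first four log lines the difference between "after" and "pre" is identical, 0xF7FD0447, which means the counter dropped by 0x0802FBB9 bytes each time. A constant like that would be consistent with one fixed bogus value being subtracted from the free-bytes counter, although that is only a guess from the numbers.

```c
#include <stdint.h>

/* Change in the free-heap counter across one call, computed the way a
 * 32-bit size_t would: unsigned subtraction wraps modulo 2^32. */
uint32_t heap_delta(uint32_t pre, uint32_t after)
{
    return after - pre;
}

/* Number of bytes the counter went down by (again modulo 2^32). */
uint32_t heap_drop(uint32_t pre, uint32_t after)
{
    return pre - after;
}
```

For the first log line, `heap_delta(2776, 4160556831)` is 0xF7FD0447 and `heap_drop(2776, 4160556831)` is 0x0802FBB9; the second and fourth lines give the same pair.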

Failed to allocate heap

Ah yes, good catch.

Failed to allocate heap

Hello guys, thank you for the answers. After I wrote this topic I thought about the re-entrancy of my wrapper. Thus, I added a mutex around each malloc and free:

~~~
/**
 * @brief   FreeRTOS \ref pvPortMalloc wrapper with logging and mutex lock
 * @param   size: amount of bytes to allocate on heap.
 * @return  non zero if allocation was successful.
 */
void* util_malloc(const size_t size)
{
    assert_param(size);
    if (!size)
        return NULL;

    UTIL_UNIQUE_LOCK(utilTasks.mallocMut);

    // saved before malloc is called
    const size_t freeHeap = xPortGetFreeHeapSize();
    void* ptr = pvPortMalloc(size);
    const size_t freeHeapAfter = xPortGetFreeHeapSize();

    if (freeHeapAfter >= freeHeap)
    {
        log_PushLine(e_logLevel_Critical,
                "Heap alloc failure. pre %u, after %u, needed %u, ptr 0x%X",
                freeHeap, freeHeapAfter, size, (uint32_t)ptr);

        ptr = NULL;
    }

    return ptr;
}
~~~

~~~
/**
 * @brief   FreeRTOS \ref vPortFree wrapper with logging and mutex lock
 * @param   pv: pointer to the memory which has to be free’d.
 */
void util_free(void* pv)
{
    assert_param(pv);
    if (!pv)
        return;

    UTIL_UNIQUE_LOCK(utilTasks.mallocMut);
    vPortFree(pv);
}
~~~

~~~
/**
 * @brief   A macro wrapper for mutex taking and releasing.
 *
 *          When used in a function body, the mutex is taken immediately and
 *          released only after the function returns (automatically, RAII style)
 *
 * @param   id: the mutex identifier variable. NOTE: this has to be the actual
 *          mutex variable, since the \ref util_mutexCreate and
 *          \ref util_mutexReleasePtr functions work on a pointer to this
 *          variable. Also the \p id mutex is only initialized once by the
 *          \ref util_mutexCreate function.
 * @return  none.
 */
#define UTIL_UNIQUE_LOCK(id)                                                \
    util_mutexCreate((osMutexId* const)&id);                                \
    util_mutexWait(id, osWaitForever);                                      \
    osMutexId thyId __attribute__((cleanup(util_mutexReleasePtr))) = id;
~~~

~~~
/**
 * @brief   \ref util_mutexRelease wrapper for usage with RAII macro.
 * @param   id: pointer to the mutex id.
 * @return  same as \ref util_mutexRelease.
 */
HAL_StatusTypeDef util_mutexReleasePtr(osMutexId* id)
{
    assert_param(id);
    return util_mutexRelease(*id);
}
~~~

For heap allocation and deallocation I use only my wrappers in the code, so the heap corruption could only come from them. Do you think a rare race condition could be the main cause here, like you mentioned? Richard, after what you have said, does that mean my “manual” heap check doesn’t even make sense, and I should rely only on what pvPortMalloc returns and on the malloc failed hook?

Failed to allocate heap

You don’t show how your mutex utility code works, but unless util_mutexCreate has the smarts to check whether the passed handle has already been created, and the thyId code sets up an object that releases the mutex at the end of scope (are you using C++ here?), your code isn’t protecting itself, as each call would be using a new mutex. Also, you aren’t protecting against FreeRTOS itself freeing memory, as that won’t go through your wrapper. I am not sure what your purpose is in checking that the heap free space has grown during a call, as that is really only checking that the heap function itself is working (which is well tested code), and not anything about your own code. From the results that you showed, my guess is that something else is corrupting memory, causing the very big numbers, and is somehow normally synchronized with this call (something else, maybe higher priority, doing a heap call that doesn’t like being delayed a bit by the heap synchronization?).

Failed to allocate heap

Hi, thank you for the answer. The function creating the mutex is singleton style; it will create the mutex only once:

~~~
/**
 * @brief   Initializes the \p id mutex, but only if uninitialized
 * @param   id: Mutex to be initialized.
 * @return  \ref HAL_OK on successful init or if already initialized.
 */
HAL_StatusTypeDef util_mutexCreate(osMutexId* const id)
{
    assert_param(id);
    if (!id)
        return HAL_ERROR;

    if (*id)
        return HAL_OK; // already initialized

    osMutexDef_t tmp;
    if (!(*id = osMutexCreate(&tmp)))
    {
        log_PushLine(e_logLevel_Critical, "Failed to create mutex");
        return HAL_ERROR;
    }

    return HAL_OK;
}
~~~

As for the other stuff, I am not using C++. This is a C extension: http://echorand.me/site/notes/articles/ccleanup/cleanupattributec.html When the function that uses the UTIL_UNIQUE_LOCK macro returns, the function named in the macro is called just before that. In my situation, it is the mutex release function. This is a bare-bones imitation of C++’s unique_lock.

The check of the heap was just for logging purposes and I thought it would never trigger, but it did, hence all this concern. I am not a FreeRTOS API expert, but from what I have noticed so far, the only moments in which the heap is altered are when one creates objects (like mutexes, semaphores, queues etc.) and when one allocates or frees memory. All my objects are created at MCU startup before the scheduler starts (apart from this create-mutex implementation I just added for testing purposes). While the scheduler is running, I only allocate and free memory using my wrappers.

After the further info I provided, would you say that these malloc and free wrappers are thread safe? It’s really hard for me to figure out what else could cause the memory leak or seg faults. Like I said, this happens very rarely and infrequently. I would appreciate any further suggestions.
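The scope-exit behaviour of the `cleanup` attribute can be tried on a PC with GCC or Clang. The following is only a sketch: `demo_lock_t`, `DEMO_UNIQUE_LOCK` and the other `demo` names are hypothetical stand-ins for the real mutex code, and the "lock" is just a counter so the effect is visible.

```c
#include <assert.h>

/* Hypothetical stand-in for a mutex: just counts how often it is held. */
typedef struct { int held; } demo_lock_t;

static void demo_lock(demo_lock_t *l) { l->held++; }

/* A cleanup handler receives a pointer to the annotated variable, so here
 * it gets a pointer to the guard pointer. */
static void demo_unlock_ptr(demo_lock_t **pl) { (*pl)->held--; }

/* Take the lock now; release it when the enclosing scope is left. */
#define DEMO_UNIQUE_LOCK(lock)                                          \
    demo_lock(&(lock));                                                 \
    demo_lock_t *demoGuard __attribute__((cleanup(demo_unlock_ptr))) = &(lock)

demo_lock_t demoMutex;

int demo_guarded(int earlyExit)
{
    DEMO_UNIQUE_LOCK(demoMutex);
    assert(demoMutex.held == 1);

    if (earlyExit)
        return -1;      /* the handler runs on this path too */

    return 0;           /* and on this one */
}
```

After either call, `demoMutex.held` is back to 0, which is the property the RAII-style macro relies on. Note that the guard variable has a fixed name, so the macro can only be used once per scope.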

Failed to allocate heap

If you always call through your wrappers, and never delete any FreeRTOS objects, then you should be ‘thread safe’. My point was that testing for the heap free space not changing as you expect isn’t really testing any of your code, and is an awkward test for the allocation failing (you should just test whether the result is null). The strange values likely aren’t a heap issue, unless some code is freeing a block with an address that didn’t come from the allocation function. More likely some operation is doing a ‘wild write’ (perhaps overrunning a nearby array, maybe look at the link map) and breaking the heap functions. The interesting thing is that most of the cases seem to have the corruption occurring between the start and end of the allocation wrapper (but likely NOT directly due to that thread itself, as that code looks pretty safe). The one thing I can think of is that during this call the scheduler is disabled for a little bit while you are inside pvPortMalloc, and perhaps some ISR is trying to activate a task and something doesn’t like it not happening fast enough. My thought here is that 4 of the 5 failures had reasonable values at the start of the function, and it went bad during it, so something fairly unique must be happening that triggers the condition.
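A minimal sketch of the simpler test, where the only failure criterion is a NULL result. To keep it buildable on a PC the allocator is passed in as a function pointer; `checked_alloc`, `alloc_fn_t`, `always_fail_alloc` and the `failCount` parameter are hypothetical names, and on the target `pvPortMalloc` could be passed since it has the same signature as `malloc`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocator signature shared by malloc() and pvPortMalloc(). */
typedef void *(*alloc_fn_t)(size_t);

/* Hypothetical wrapper: the only success criterion is a non-NULL pointer.
 * failCount stands in for whatever logging the project uses. */
void *checked_alloc(alloc_fn_t alloc, size_t size, unsigned *failCount)
{
    if (size == 0)
        return NULL;

    void *ptr = alloc(size);
    if (ptr == NULL && failCount != NULL)
        (*failCount)++;

    return ptr;
}

/* Host-side allocator that always fails, for exercising the error path. */
void *always_fail_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Because nothing is read before or after the allocation, this version has no window in which another task can change the free-heap value and confuse the check.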

Failed to allocate heap

Thank you for the answer. This would be a very interesting case. I must look into the map file, as well as go through all the ISRs I use and check for memory allocations there. As far as I remember though, in ISRs I only put items into queues and release semaphores, or trigger signals. Also, like you mentioned, I never remove the created objects (like threads and queues).

Failed to allocate heap

Don’t just look at ‘heap’ stuff. If an ISR is filling a buffer, or writes through a pointer, and expects this to be properly set up, and the setup might not happen if a task that is activated gets delayed, you could have an issue. (If an ISR is actually allocating memory by calling the allocation functions, you have broken the FreeRTOS rules, as there are no FromISR memory allocation functions.)

Failed to allocate heap

You are right, using the API there is no way to allocate memory from an ISR. In that case I am surely not doing that. But a memory block overwrite that writes over the RTOS heap array sounds promising. I will look for that, thank you.

Failed to allocate heap

If I remember right (don’t have the code handy), heap_4 doesn’t actually parse the heap to get the remaining space, but keeps track of the amount of memory available in a single variable, so the overwrite would be of that variable, not the heap itself.

Failed to allocate heap

Yes, that’s actually what I meant, but I wrote something else… This should make it easier to find then, I hope. Thank you very much.

Failed to allocate heap

Hi Richard, hope you are having a good holiday. Looking at the map file, it’s not easy to pinpoint the problem here:

~~~
*fill*                               0x2000077d  0x3
.bss.FatFs                           0x20000780  0x8     MiddlewaresThirdPartyFatFssrcff.o
.bss.Fsid                            0x20000788  0x2     MiddlewaresThirdPartyFatFssrcff.o
*fill*                               0x2000078a  0x2
.bss.Files                           0x2000078c  0x60    MiddlewaresThirdPartyFatFssrcff.o
.bss.disk                            0x200007ec  0x10    MiddlewaresThirdPartyFatFssrcffgendrv.o
                                     0x200007ec          disk
.bss.ucMaxSysCallPriority            0x200007fc  0x1     MiddlewaresThirdPartyFreeRTOSSourceportableGCCARMCM4Fport.o
*fill*                               0x200007fd  0x3
.bss.ulMaxPRIGROUPValue              0x20000800  0x4     MiddlewaresThirdPartyFreeRTOSSourceportableGCCARMCM4Fport.o
.bss.ucHeap                          0x20000804  0xc000  MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
.bss.xStart                          0x2000c804  0x8     MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
.bss.pxEnd                           0x2000c80c  0x4     MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
.bss.xFreeBytesRemaining             0x2000c810  0x4     MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
.bss.xMinimumEverFreeBytesRemaining  0x2000c814  0x4     MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
.bss.xBlockAllocatedBit              0x2000c818  0x4     MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
.bss.pxCurrentTCB                    0x2000c81c  0x4     MiddlewaresThirdPartyFreeRTOSSourcetasks.o
                                     0x2000c81c          pxCurrentTCB
.bss.pxReadyTasksLists               0x2000c820  0x8c    MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.xDelayedTaskList1               0x2000c8ac  0x14    MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.xDelayedTaskList2               0x2000c8c0  0x14    MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.pxDelayedTaskList               0x2000c8d4  0x4     MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.pxOverflowDelayedTaskList       0x2000c8d8  0x4     MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.xPendingReadyList               0x2000c8dc  0x14    MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.xTasksWaitingTermination        0x2000c8f0  0x14    MiddlewaresThirdPartyFreeRTOSSourcetasks.o
.bss.uxDeletedTasksWaitingCleanUp    0x2000c904  0x4     MiddlewaresThird_PartyFreeRTOSSourcetasks.o
~~~

The variable is mapped as follows:

~~~
.bss.xFreeBytesRemaining 0x2000c810 0x4 MiddlewaresThirdPartyFreeRTOSSourceportableMemMangheap4.o
~~~

The further addresses are more FreeRTOS; the earlier addresses are FatFs, which is pretty solid code as well. What do you think?

Failed to allocate heap

It looks like the thing just before is the heap, so if you write past the end of an object allocated on the heap, you can overwrite the variable.
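As a toy illustration of that kind of overwrite, here is a host-side sketch. The struct is a made-up miniature of the layout in the map file (a byte array with a bookkeeping variable behind it; in the real map xStart and pxEnd sit in between), and the names `demo_ram_t` and `demo_fill` are hypothetical. Writing a few bytes more than the array holds silently changes the variable that follows it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of the map layout: a byte array followed by a
 * bookkeeping variable. */
typedef struct
{
    uint8_t  heap[16];
    uint32_t freeBytes;
} demo_ram_t;

/* Writes len bytes of 0xFF starting at the beginning of the array. Going
 * through the struct's byte representation keeps the demo well defined;
 * len is clamped so the write never leaves the struct. */
void demo_fill(demo_ram_t *ram, size_t len)
{
    if (len > sizeof(*ram))
        len = sizeof(*ram);

    memset((uint8_t *)ram, 0xFF, len);
}
```

With `freeBytes` set to 2776, filling exactly `sizeof heap` bytes leaves it untouched, while filling four bytes more turns it into 0xFFFFFFFF, a value just as implausible as the ones in the logs.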

Failed to allocate heap

Thanks for the answer Richard. So looking at the map, the first thing before xFreeBytesRemaining that is not related to FreeRTOS and CMSIS is the disk variable, which is part of FatFs. Looking inside that struct:

~~~
Disk_drvTypeDef disk = {{0},{0},{0},0};
…
/**
 * @brief  Global Disk IO Drivers structure definition
 */
typedef struct
{
    uint8_t                 is_initialized[_VOLUMES];
    const Diskio_drvTypeDef *drv[_VOLUMES];
    uint8_t                 lun[_VOLUMES];
    volatile uint8_t        nbr;
} Disk_drvTypeDef;
~~~

There are a couple of arrays. Their size depends on the number of volumes available in the system. For me it is 2 volumes. Maybe at some point the index exceeds 1 here… I think I will start by running the system with a debugger probe attached and a breakpoint placed where the corrupted free heap variable is detected. It might already be too late to find anything by then, but I could check the values of the FatFs variables while halted. I was trying to find the error statically, but with no luck so far.
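One way to narrow down when the corruption happens is a cheap plausibility check that can be called from several places, for example the idle hook or the top of each task loop. The sketch below is an assumption-laden illustration: the names are hypothetical, and the limit is taken from the ucHeap size in the map file (0xc000 bytes). On the target the argument would be the value returned by `xPortGetFreeHeapSize()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Size of ucHeap according to the map file (0xc000 bytes). */
#define DEMO_TOTAL_HEAP_SIZE ((size_t)0xC000)

/* The free-byte counter can never legitimately exceed the heap size. */
bool free_heap_is_plausible(size_t freeBytes)
{
    return freeBytes <= DEMO_TOTAL_HEAP_SIZE;
}
```

A breakpoint on the failing branch of such a check would stop closer to the faulty write than a check made only inside the malloc wrapper; a hardware data watchpoint on xFreeBytesRemaining is another option if the debugger supports it.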

Failed to allocate heap

Hi Richard, I just wanted to write an update on further tests. I was quite desperate, as I couldn’t find the reason for this bug for a while. I consulted a friend, and he suggested that since I am running on a 32-bit architecture (Arm Cortex-M4), maybe my problem is due to alignment issues. I was sceptical about this, but had to give it a try. I modified the malloc wrapper as follows:

~~~
/**
 * @brief   The \ref util_malloc function will allocate amounts of bytes
 *          that divided by this value give a remainder of 0.
 */
#define UTIL_MALLOC_DIVIDER 4

/**
 * @brief   FreeRTOS \ref pvPortMalloc wrapper with logging and mutex lock
 * @param   size: amount of bytes to allocate on heap.
 * @return  non zero if allocation was successful.
 */
void* util_malloc(const size_t size)
{
    assert_param(size);
    if (!size)
        return NULL;

    // Create actual allocated size
    const size_t mod = size % UTIL_MALLOC_DIVIDER;
    const size_t endSize = (!mod) ? size : size + UTIL_MALLOC_DIVIDER - mod;

    UTIL_UNIQUE_LOCK(utilTasks.mallocMut);

    // saved before malloc is called
    const size_t freeHeap = xPortGetFreeHeapSize();
    void* ptr = pvPortMalloc(endSize);
    const size_t freeHeapAfter = xPortGetFreeHeapSize();

    if ((freeHeapAfter >= freeHeap) || !ptr)
    {
        log_PushLine(e_logLevel_Critical,
                "Heap alloc failure. pre %u, after %u, needed %u, ptr 0x%X",
                freeHeap, freeHeapAfter, endSize, (uint32_t)ptr);

        ptr = NULL;
    }

    return ptr;
}
~~~

So now I always allocate an amount of memory that is a multiple of 4 (and always at least 4). Since I did this, the problem has not occurred yet, and the program has been running for some days now. I am aware that this is no proof and the problem may just as well occur in a week or a month, but previously the problem occurred within a couple of days at most.

Also, before introducing this allocation method, I added a print of the ptr pointer each time I enter the faulty if clause. Each time the amount of heap was faulty, the malloc function returned 0x8 for the ptr. Since my heap is an array, this seems alright. But on the other hand, a lot of things are allocated on the heap during startup, so this 0x8 value seems a bit small? What do you think?

Failed to allocate heap

I have not read all of this thread, but if it helps, pvPortMalloc() will always ensure the alignment is correct, so if there was an alignment issue it could only have been in the wrapper code rather than in the memory allocator itself (assuming you are using one of the allocators that comes in the FreeRTOS download).
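For illustration, the rounding that the allocator applies to a requested size looks roughly like the sketch below. This is a host-side approximation, not the heap_4 source: the `DEMO_` names are hypothetical, and the value 8 is assumed from `portBYTE_ALIGNMENT` in the Cortex-M ports.

```c
#include <stddef.h>

/* Assumed values: the Cortex-M ports define portBYTE_ALIGNMENT as 8. */
#define DEMO_BYTE_ALIGNMENT      ((size_t)8)
#define DEMO_BYTE_ALIGNMENT_MASK (DEMO_BYTE_ALIGNMENT - 1)

/* Round a request up to the next multiple of the alignment. */
size_t demo_align_size(size_t wanted)
{
    if ((wanted & DEMO_BYTE_ALIGNMENT_MASK) != 0)
        wanted += DEMO_BYTE_ALIGNMENT - (wanted & DEMO_BYTE_ALIGNMENT_MASK);

    return wanted;
}
```

With these assumptions a 1-byte or 6-byte request is already rounded up to 8 before a block is carved out, so rounding to a multiple of 4 in the wrapper should not change the alignment of what is returned.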

Failed to allocate heap

Hi Richard, thank you for the answer. Yes, I am using the pvPortMalloc function in my wrapper. The wrapper cannot really produce a bad alignment, since it is not responsible for that; it just calls pvPortMalloc… Or maybe I misunderstood what you wrote?