
malloc behaviour on an embedded system

Tags: c, memory-management, malloc, out-of-memory, stm32

I'm currently working on an embedded project (STM32F103RB, CooCox CoIDE v.1.7.6 with arm-none-eabi-gcc 4.8 2013q4) and I'm trying to understand how malloc() behaves in plain C when the RAM is full.

My STM32 has 20 kB (0x5000 bytes) of RAM, of which 0x200 bytes are used for the stack.

    #include <stdlib.h>
    #include "stm32f10x.h"

    struct list_el {
        char weight[1024];
    };

    typedef struct list_el item;

    int main(void)
    {
        item * curr;

        // allocate until RAM is full
        do {
            curr = (item *)malloc(sizeof(item));
        } while (curr != NULL);

        // I know, free() is missing. Program is supposed to crash

        return 0;
    }

I would expect malloc() to return NULL as soon as the heap is too small for the requested allocation:

0x5000 (RAM) - 0x83C (bss) - 0x200 (stack) = 0x45C4 (heap)

That would be on the 18th call to malloc(), since one item is 1024 (0x400) bytes large.
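The estimate can be checked mechanically. This is a minimal sketch that assumes nothing else lives in RAM; `max_allocations` and its `overhead` parameter (per-chunk bookkeeping bytes, unknown for this toolchain) are illustrative names, not part of any library.

```c
#include <stddef.h>

/* Number of items that fit into what is left of RAM after .bss and the stack.
 * `overhead` is a hypothetical per-chunk bookkeeping cost; pass 0 to ignore it. */
size_t max_allocations(size_t ram, size_t bss, size_t stack,
                       size_t item, size_t overhead)
{
    if (bss > ram || stack > ram - bss) {
        return 0;                     /* nothing left for a heap */
    }
    if (item + overhead == 0) {
        return 0;                     /* avoid dividing by zero */
    }
    return (ram - bss - stack) / (item + overhead);
}
```

With the numbers above, `max_allocations(0x5000, 0x83C, 0x200, 0x400, 0)` yields 17, so the 18th request is the first one that cannot fit.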

But instead, the uC ends up in HardFault_Handler(void) after the 18th call (not even in MemManager_Handler(void)).

Does anybody have advice on how to predict a malloc() failure, since waiting for a NULL return doesn't seem to work?

Thank you.


asked Mar 15 '14 10:03

Boern

2 Answers

It does not look like malloc is doing any checks at all. The fault that you get comes from hardware detecting a write to an invalid address, which is probably coming from malloc itself.

When malloc allocates memory, it takes a chunk from its internal pool, and returns it to you. However, it needs to store some information for the free function to be able to complete deallocation. Usually, that's the actual length of the chunk. In order to save that information, malloc takes a few bytes from the beginning of the chunk itself, writes the info there, and returns you the address past the spot where it has written its own information.

For example, let's say you asked for a 10-byte chunk. malloc would grab an available 16-byte chunk, say, at addresses 0x3200..0x320F, write the length (i.e., 16) into the first two bytes (0x3200 and 0x3201), and return 0x3202 back to you. Now your program can use ten bytes from 0x3202 to 0x320B. The other four bytes are available, too: if you call realloc and ask for 14 bytes, there would be no reallocation.
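The header-before-payload idea can be sketched with a toy bump allocator. Everything here (`toy_alloc`, `toy_usable_size`, `chunk_header`, the arena size) is hypothetical and only illustrates the bookkeeping. Real malloc implementations use different header layouts and also support free.

```c
#include <assert.h>
#include <stddef.h>

/* Header stored immediately before each payload. The union with max_align_t
 * keeps every payload suitably aligned for any object type. */
typedef union {
    size_t size;          /* usable payload size in bytes */
    max_align_t align;    /* never read; present only for alignment */
} chunk_header;

#define ARENA_UNITS 64u   /* arbitrary arena size, in header-sized units */
static chunk_header arena[ARENA_UNITS];
static size_t arena_used; /* units handed out so far */

/* Hand out a chunk: write the header, return the address just past it. */
void *toy_alloc(size_t n)
{
    if (n > ARENA_UNITS * sizeof(chunk_header)) {
        return NULL;                              /* can never fit */
    }
    size_t units = (n + sizeof(chunk_header) - 1) / sizeof(chunk_header);
    if (units + 1 > ARENA_UNITS - arena_used) {
        return NULL;                              /* arena exhausted */
    }
    chunk_header *header = &arena[arena_used];
    header->size = units * sizeof(chunk_header);  /* rounded-up length */
    arena_used += units + 1;
    return header + 1;                            /* payload follows header */
}

/* Recover the stored length by stepping back over the header. */
size_t toy_usable_size(const void *p)
{
    assert(p != NULL);
    return ((const chunk_header *)p - 1)->size;
}
```

Note that the write to `header->size` happens before the pointer is returned. If the bounds check were missing, that write is the first access to land outside the arena.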

The crucial point comes when malloc writes the length into the chunk of memory that it is about to return to you: the address to which it writes needs to be valid. It appears that after the 18th iteration the address of the next chunk is negative (which translates to a very large positive number), so the CPU traps the write and triggers the hard fault.
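The "negative becomes very large positive" step is ordinary unsigned conversion. This is a tiny sketch for a 32-bit address space, where `as_address` is an illustrative helper and not something the allocator actually calls.

```c
#include <stdint.h>

/* Reinterpret a signed 32-bit value as the 32-bit address the CPU would see.
 * The conversion is well-defined in C: the result is the value modulo 2^32. */
uint32_t as_address(int32_t value)
{
    return (uint32_t)value;
}
```

For instance, `as_address(-1024)` is `0xFFFFFC00`, which is nowhere near a 20 kB RAM region.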

In situations where the heap and the stack grow toward each other, there is no reliable way to detect an out-of-memory condition while still letting you use every last byte of memory, which is often very desirable. malloc cannot predict how much stack you are going to use after the allocation, so it does not even try. That is why, in most cases, the byte counting is on you.
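One place where you can do that byte counting is the hook that feeds memory to malloc. The sketch below assumes the C library is newlib, where malloc obtains memory through an `_sbrk` function supplied by the project. That is an assumption about this toolchain. To keep the sketch runnable on a PC, a static array stands in for the heap region and the function is called `sketch_sbrk`. On a target it would be named `_sbrk` and take its bounds from linker-script symbols, whose names vary between projects.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_HEAP_SIZE 0x1000u                     /* hypothetical heap size */
static unsigned char sketch_heap[SKETCH_HEAP_SIZE];  /* stand-in for the heap region */
static unsigned char *sketch_brk = sketch_heap;      /* current end of the heap */

/* Grow (or shrink) the heap by `incr` bytes and return the previous break.
 * On failure, set errno to ENOMEM and return (void *)-1. */
void *sketch_sbrk(ptrdiff_t incr)
{
    unsigned char *const heap_end = sketch_heap + SKETCH_HEAP_SIZE;
    unsigned char *const previous = sketch_brk;

    /* Refuse to move the break past either end of the reserved region. */
    if (incr > heap_end - sketch_brk || incr < sketch_heap - sketch_brk) {
        errno = ENOMEM;
        return (void *)-1;
    }
    sketch_brk += incr;
    return previous;
}
```

The trade-off is the one described above. The limit has to be fixed in advance, so any memory reserved for the stack but never used is unavailable to the heap.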

In general, on embedded hardware where memory is limited to a few dozen kilobytes, you avoid malloc calls in "arbitrary" places. Instead, you pre-allocate all your memory up front using pre-calculated limits, parcel it out to the structures that need it, and never call malloc again.
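A fixed-block pool is one common way to do that pre-allocation. This is a minimal sketch: the block size matches the 1024-byte item from the question, while the block count and the names `pool_init`, `pool_alloc` and `pool_free` are made up for illustration.

```c
#include <stddef.h>

#define POOL_BLOCKS 16u     /* hypothetical limit, chosen at design time */
#define BLOCK_SIZE  1024u   /* same size as the question's item */

/* A free block stores the link to the next free block inside itself,
 * so the pool needs no bookkeeping memory beyond the blocks. */
typedef union pool_block {
    union pool_block *next;
    unsigned char payload[BLOCK_SIZE];
} pool_block;

static pool_block pool[POOL_BLOCKS];   /* reserved at link time, not at run time */
static pool_block *free_list;

/* Chain all blocks into the free list; call once at start-up. */
void pool_init(void)
{
    for (size_t i = 0; i + 1 < POOL_BLOCKS; i++) {
        pool[i].next = &pool[i + 1];
    }
    pool[POOL_BLOCKS - 1].next = NULL;
    free_list = &pool[0];
}

/* Take one block, or return NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    pool_block *block = free_list;
    if (block != NULL) {
        free_list = block->next;
    }
    return block;
}

/* Return a block obtained from pool_alloc to the pool. */
void pool_free(void *p)
{
    if (p == NULL) {
        return;
    }
    pool_block *block = p;
    block->next = free_list;
    free_list = block;
}
```

Because the array is part of the static data, running out of RAM shows up as a link error and not as a run-time fault. Exhaustion of the pool itself is reported by a plain NULL.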

answered Oct 08 '22 03:10

Sergey Kalinichenko

Your program most likely crashes because of an illegal memory access, which is almost always an indirect (subsequent) result of a legal memory access, but one that you did not intend to perform.

For example (which is also my guess as to what's happening on your system):

Your heap most likely begins right after the stack. Now, suppose you have a stack overflow in main. Then one of the operations that you perform in main, which is naturally a legal operation as far as you're concerned, overwrites the beginning of the heap with some "junk" data.

As a subsequent result, the next time that you attempt to allocate memory from the heap, the pointer to the next available chunk of memory is no longer valid, eventually leading to a memory access violation.

So to begin with, I strongly recommend that you increase the stack size from 0x200 bytes to 0x400 bytes. This is typically defined within the linker-command file, or through the IDE, in the project's linker settings.

If your project is on IAR, then you can change it in the icf file:

    define symbol __ICFEDIT_size_cstack__ = 0x400;

Other than that, I suggest that you add code in your HardFault_Handler, in order to reconstruct the call-stack and register values prior to the crash. This might allow you to trace the runtime error and find out exactly where it happened.

In file 'startup_stm32f03xx.s', make sure that you have the following piece of code:

        EXTERN  HardFault_Handler_C        ; this declaration is probably missing

    __tx_vectors                           ; this declaration is probably there
        DCD     HardFault_Handler

Then, in the same file, add the following interrupt handler (where all other handlers are located):

        PUBWEAK HardFault_Handler
        SECTION .text:CODE:REORDER(1)
    HardFault_Handler
        TST LR, #4
        ITE EQ
        MRSEQ R0, MSP
        MRSNE R0, PSP
        B HardFault_Handler_C

Then, in file 'stm32f03xx.c', add the following ISR:

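    #include <stdio.h>   /* printf */

    /* Called from the assembly stub with a pointer to the stacked exception frame.
       SCB is expected to come from the device header already included in this file. */
    void HardFault_Handler_C(unsigned int* hardfault_args)
    {
        /* Registers pushed by the core on exception entry */
        printf("R0    = 0x%.8X\r\n", hardfault_args[0]);
        printf("R1    = 0x%.8X\r\n", hardfault_args[1]);
        printf("R2    = 0x%.8X\r\n", hardfault_args[2]);
        printf("R3    = 0x%.8X\r\n", hardfault_args[3]);
        printf("R12   = 0x%.8X\r\n", hardfault_args[4]);
        printf("LR    = 0x%.8X\r\n", hardfault_args[5]);
        printf("PC    = 0x%.8X\r\n", hardfault_args[6]);
        printf("PSR   = 0x%.8X\r\n", hardfault_args[7]);
        /* Fault status and address registers; volatile forces a real hardware read */
        printf("BFAR  = 0x%.8X\r\n", *(volatile unsigned int*)0xE000ED38);
        printf("CFSR  = 0x%.8X\r\n", *(volatile unsigned int*)0xE000ED28);
        printf("HFSR  = 0x%.8X\r\n", *(volatile unsigned int*)0xE000ED2C);
        printf("DFSR  = 0x%.8X\r\n", *(volatile unsigned int*)0xE000ED30);
        printf("AFSR  = 0x%.8X\r\n", *(volatile unsigned int*)0xE000ED3C);
        printf("SHCSR = 0x%.8X\r\n", (unsigned int)SCB->SHCSR);
        while (1);
    }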

If you can't use printf at the point in the execution when this specific Hard-Fault interrupt occurs, then save all the above data in a global buffer instead, so you can view it after reaching the while (1).
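Saving to a global buffer could look roughly like this. It is a sketch with hypothetical names (`fault_record`, `g_fault_record`, `fault_record_save`). It copies only the eight stacked registers, in the same order as the handler above reads them; the fault status registers could be stored the same way on the target.

```c
#include <stdint.h>

#define FAULT_RECORD_MAGIC 0xFA17C0DEu   /* arbitrary marker: record is filled in */

typedef struct {
    uint32_t magic;                      /* FAULT_RECORD_MAGIC once saved */
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_record;

/* Inspect this variable in the debugger after execution stops in while (1). */
volatile fault_record g_fault_record;

/* Copy the stacked exception frame (R0-R3, R12, LR, PC, xPSR) into the record. */
void fault_record_save(const uint32_t *frame)
{
    g_fault_record.r0  = frame[0];
    g_fault_record.r1  = frame[1];
    g_fault_record.r2  = frame[2];
    g_fault_record.r3  = frame[3];
    g_fault_record.r12 = frame[4];
    g_fault_record.lr  = frame[5];
    g_fault_record.pc  = frame[6];
    g_fault_record.psr = frame[7];
    g_fault_record.magic = FAULT_RECORD_MAGIC;   /* written last: marks record complete */
}
```

The stored `pc` value is the address of the instruction that was executing when the fault was taken, which can be looked up in the map or listing file.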

Then, refer to the 'Cortex-M Fault Exceptions and Registers' section at http://www.keil.com/appnotes/files/apnt209.pdf in order to understand the problem, or publish the output here if you want further assistance.
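As a starting point for reading the CFSR value, the helpers below split it into its three fields and test two BusFault bits. The helper names are invented for this sketch. The bit positions follow the ARMv7-M layout as I understand it, so check them against the application note before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* CFSR packs three status fields into one 32-bit word. */
uint8_t cfsr_mmfsr(uint32_t cfsr)   /* MemManage fault status, bits 7:0 */
{
    return (uint8_t)(cfsr & 0xFFu);
}

uint8_t cfsr_bfsr(uint32_t cfsr)    /* BusFault status, bits 15:8 */
{
    return (uint8_t)((cfsr >> 8) & 0xFFu);
}

uint16_t cfsr_ufsr(uint32_t cfsr)   /* UsageFault status, bits 31:16 */
{
    return (uint16_t)(cfsr >> 16);
}

/* BFARVALID (bit 15): BFAR holds the faulting address only while this is set. */
bool cfsr_bfar_valid(uint32_t cfsr)
{
    return (cfsr & (1u << 15)) != 0u;
}

/* PRECISERR (bit 9): the stacked PC points at the instruction that faulted. */
bool cfsr_precise_bus_error(uint32_t cfsr)
{
    return (cfsr & (1u << 9)) != 0u;
}
```

If both helpers return true, the BFAR value printed by the handler is the address of the bad access.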

UPDATE:

In addition to all of the above, make sure that the base address of the heap is defined correctly. It is possibly hard-coded within the project settings (typically right after the data-section and the stack). But it can also be determined during runtime, at the initialization phase of your program. In general, you need to check the base addresses of the data-section and the stack of your program (in the map file created after building the project), and make sure that the heap does not overlap either one of them.

I once had a case where the base address of the heap was set to a constant address, which was fine to begin with. But then I gradually increased the size of the data-section by adding global variables to the program. The stack was located right after the data-section, and it "moved forward" as the data-section grew larger, so there were no problems with either one of them. But eventually, the heap was allocated "on top of" part of the stack. So at some point, heap operations began to overwrite variables on the stack, and stack operations began to overwrite the contents of the heap.
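The overlap check itself is simple once the region bounds have been read from the map file. This is a sketch with a hypothetical `mem_region` type and half-open ranges. The addresses in the usage note are made-up examples, not values from the asker's project.

```c
#include <stdbool.h>
#include <stdint.h>

/* A memory region covering addresses start .. end-1 (end is exclusive). */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} mem_region;

/* Two half-open ranges overlap exactly when each starts before the other ends. */
bool regions_overlap(mem_region a, mem_region b)
{
    return a.start < b.end && b.start < a.end;
}
```

For example, a stack at 0x20000840..0x20000A40 and a heap starting at 0x20000A40 do not overlap, while a heap whose base was left at 0x20000900 does.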

answered Oct 08 '22 04:10

barak manos
