In the first answer here, the following was mentioned about the stack memory in C++: <blockquote> When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. </blockquote> This makes perfect sense on the top-level, and makes me curious about how smart compilers are when allocating this memory in and of itself, given the context of this question: Since braces themselves are not a stack frame in C (I assume this holds true for C++ as well), I want to check whether compilers optimize reserved memory based on variable scopes within a single function. In the following I'm assuming that the stack looks like this before a function call: <pre class="prettyprint"><code>-------- |main()| -------- <- stack pointer: space above it is used for current scope | | | | | | | | -------- </code></pre> And then the following after invoking a function <code>f()</code>: <pre class="prettyprint"><code>-------- |main()| -------- <- old stack pointer (osp) | f() | -------- <- stack pointer, variables will now be placed between here and osp upon reaching their declarations | | | | | | | | -------- </code></pre> For example, given this function <pre class="prettyprint"><code>void f() { int x = 0; int y = 5; int z = x + y; } </code></pre> Presumably, this will just allocate <code>3*sizeof(int)</code> + some extra overhead for bookkeeping. However, what about this function: <pre class="prettyprint"><code>void g() { for (int i = 0; i < 100000; i++) { int x = 0; } { MyObject myObject[1000]; } { MyObject myObject[1000]; } } </code></pre> Ignoring compiler optimizations which may elide a lot of stuff in the above since really they do nothing, I'm curious about the following in the second example: <ul> <li>For the <code>for</code> loop: will the stack space be large enough to fit all 100000 ints?</li> <li>On top of that, will the stack space contain <code>1000*sizeof(MyObject)</code> or <code>2000*sizeof(MyObject)</code>?</li> </ul> In general: does the compiler take variable scope into account when determining how much memory it will need for the new stack frame, before invoking a certain function? If this is compiler-specific, how do some well-known compilers do it?

The compiler will allocate space as needed (typically for all items at the beginning of the function), but not for each iteration in the loop. For example, what Clang produces, as LLVM-IR <pre class="prettyprint"><code>define void @_Z1gv() #0 { %i = alloca i32, align 4 %x = alloca i32, align 4 %myObject = alloca [1000 x %class.MyObject], align 16 %myObject1 = alloca [1000 x %class.MyObject], align 16 store i32 0, i32* %i, align 4 br label %1 ; <label>:1: ; preds = %5, %0 %2 = load i32, i32* %i, align 4 %3 = icmp slt i32 %2, 100000 br i1 %3, label %4, label %8 ; <label>:4: ; preds = %1 store i32 0, i32* %x, align 4 br label %5 ; <label>:5: ; preds = %4 %6 = load i32, i32* %i, align 4 %7 = add nsw i32 %6, 1 store i32 %7, i32* %i, align 4 br label %1 ; <label>:8: ; preds = %1 ret void } </code></pre> This is the result of: <pre class="prettyprint"><code>class MyObject { public: int x, y; }; void g() { for (int i = 0; i < 100000; i++) { int x = 0; } { MyObject myObject[1000]; } { MyObject myObject[1000]; } } </code></pre> So, as you can see, <code>x</code> is allocated only once, not 100000 times. Because only ONE of those variables will exist at any given time. (The compiler could reuse the space for <code>myObject[1000]</code> for <code>x</code> and the second <code>myObject[1000]</code> - and probably would do so for an optimised build, but in that case it would also completely remove these variables as they are not used, so it wouldn't show very well)

C++: How does the compiler know how much memory to allocate for each stack frame?

Tags:

c++

memory-management

memory

In the first answer here, the following was mentioned about the stack memory in C++:

When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data.

This makes perfect sense on the top-level, and makes me curious about how smart compilers are when allocating this memory in and of itself, given the context of this question: Since braces themselves are not a stack frame in C (I assume this holds true for C++ as well), I want to check whether compilers optimize reserved memory based on variable scopes within a single function.

In the following I'm assuming that the stack looks like this before a function call:

--------
|main()|
-------- <- stack pointer: space above it is used for current scope
|      |
|      |
|      |
|      |
--------

And then the following after invoking a function f():

--------
|main()|
-------- <- old stack pointer (osp)
|  f() |
-------- <- stack pointer, variables will now be placed between here and osp upon reaching their declarations
|      |
|      |
|      |
|      |
--------

For example, given this function

void f() {
  int x = 0;
  int y = 5;
  int z = x + y;
}

Presumably, this will just allocate 3*sizeof(int) + some extra overhead for bookkeeping.

However, what about this function:

void g() {
  for (int i = 0; i < 100000; i++) {
    int x = 0;
  }
  {
    MyObject myObject[1000];
  }
  {
    MyObject myObject[1000];
  }
}

Ignoring compiler optimizations which may elide a lot of stuff in the above since really they do nothing, I'm curious about the following in the second example:

For the for loop: will the stack space be large enough to fit all 100000 ints?
On top of that, will the stack space contain 1000*sizeof(MyObject) or 2000*sizeof(MyObject)?

In general: does the compiler take variable scope into account when determining how much memory it will need for the new stack frame, before invoking a certain function? If this is compiler-specific, how do some well-known compilers do it?

556

asked Mar 21 '16 11:03

Jimmy

1 Answers

The compiler will allocate space as needed (typically for all items at the beginning of the function), but not for each iteration in the loop.

For example, what Clang produces, as LLVM-IR

define void @_Z1gv() #0 {
  %i = alloca i32, align 4
  %x = alloca i32, align 4
  %myObject = alloca [1000 x %class.MyObject], align 16
  %myObject1 = alloca [1000 x %class.MyObject], align 16
  store i32 0, i32* %i, align 4
  br label %1

; <label>:1:                                      ; preds = %5, %0
  %2 = load i32, i32* %i, align 4
  %3 = icmp slt i32 %2, 100000
  br i1 %3, label %4, label %8

; <label>:4:                                      ; preds = %1
  store i32 0, i32* %x, align 4
  br label %5

; <label>:5:                                      ; preds = %4
  %6 = load i32, i32* %i, align 4
  %7 = add nsw i32 %6, 1
  store i32 %7, i32* %i, align 4
  br label %1

; <label>:8:                                      ; preds = %1
  ret void
}

This is the result of:

class MyObject
{
public:
    int x, y;
};

void g() {
  for (int i = 0; i < 100000; i++) 
  {
    int x = 0; 
  } 
  {
    MyObject myObject[1000]; 
  } 
  {
    MyObject myObject[1000]; 
  } 
}

So, as you can see, x is allocated only once, not 100000 times. Because only ONE of those variables will exist at any given time.

(The compiler could reuse the space for myObject[1000] for x and the second myObject[1000] - and probably would do so for an optimised build, but in that case it would also completely remove these variables as they are not used, so it wouldn't show very well)

answered Sep 27 '22 21:09

Mats Petersson

Related questions
                            
                                How to include needed headers automatically in QT Creator
                            
                                Constness of template member in const member function
                            
                                C++14 auto lambda can accept Obj<std::tuple<void> > -- but template functions cannot?
                            
                                C++ integer conversion rules
                            
                                Include Visual Studio solution in CMake
                            
                                Does Qt5 have any class for similar to StringBuilder or StringBuffer?
                            
                                stringstream.rdbuf causing cout to fail
                            
                                Constexpr decltype
                            
                                c++ socket: size of the structure addrinfo
                            
                                Can I print an object of the class using printf()?
                            
                                c++ Large eigendecomposition speed
                            
                                Read Write lock implementation in C++
                            
                                Get windows Quiet hours from Win32 or C# API
                            
                                C++ CSV line with commas and strings within double quotes
                            
                                How to implement the RVExtension function for an ArmA 3 DLL in Rust?
                            
                                C++ Calling overloaded operator from within a class
                            
                                Does sizeof(T) * CHAR_BIT guarantee bit size?
                            
                                Why is the size of an array passed to a function by reference known to the compiler in C++?
                            
                                Running .exe without copying .dlls
                            
                                Gtest: Expected Class-Name Before '{'

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With