Local variable location in memory

Tags: c, arm, objdump

For a homework assignment I have been given some C files and compiled them using arm-linux-gcc (we will eventually be targeting gumstix boards, but for these exercises we have been working with qemu and ema).

One of the questions confuses me a bit. We are told to:

Use arm-linux-objdump to find the location of variables declared in main() in the executable binary.

However, these variables are local and thus shouldn't have addresses until runtime, correct?

I'm thinking that maybe what I need to find is the offset in the stack frame, which can in fact be found using objdump (not that I know how).

Anyways, any insight into the matter would be greatly appreciated, and I would be happy to post the source code if necessary.

asked Mar 02 '13 22:03 by bkane521


2 Answers

unsigned int one ( unsigned int, unsigned int );
unsigned int two ( unsigned int, unsigned int );
unsigned int myfun ( unsigned int x, unsigned int y, unsigned int z )
{
    unsigned int a,b;
    a=one(x,y);
    b=two(a,z);
    return(a+b);
}

compile and disassemble

arm-none-eabi-gcc -c fun.c -o fun.o
arm-none-eabi-objdump -D fun.o

code created by compiler

00000000 <myfun>:
   0:   e92d4800    push    {fp, lr}
   4:   e28db004    add fp, sp, #4
   8:   e24dd018    sub sp, sp, #24
   c:   e50b0010    str r0, [fp, #-16]
  10:   e50b1014    str r1, [fp, #-20]
  14:   e50b2018    str r2, [fp, #-24]
  18:   e51b0010    ldr r0, [fp, #-16]
  1c:   e51b1014    ldr r1, [fp, #-20]
  20:   ebfffffe    bl  0 <one>
  24:   e50b0008    str r0, [fp, #-8]
  28:   e51b0008    ldr r0, [fp, #-8]
  2c:   e51b1018    ldr r1, [fp, #-24]
  30:   ebfffffe    bl  0 <two>
  34:   e50b000c    str r0, [fp, #-12]
  38:   e51b2008    ldr r2, [fp, #-8]
  3c:   e51b300c    ldr r3, [fp, #-12]
  40:   e0823003    add r3, r2, r3
  44:   e1a00003    mov r0, r3
  48:   e24bd004    sub sp, fp, #4
  4c:   e8bd4800    pop {fp, lr}
  50:   e12fff1e    bx  lr

Short answer: the memory is "allocated" both at compile time and at run time. At compile time, in the sense that the compiler determines the size of the stack frame and what goes where in it. At run time, in the sense that the memory itself is on the stack, which is a dynamic thing. The stack frame is taken from stack memory at run time, almost like a malloc() and free().
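To see the run-time half of that statement, here is a minimal sketch in C. The names probe, seen and local_address_at_depth are made up for illustration. It records the address of the same local variable at several recursion depths; each active call gets its own frame, so each depth should report a different address even though the compiler fixed the variable's offset inside the frame once.

```c
#include <inttypes.h>
#include <stdio.h>

#define MAX_DEPTH 4

/* Address of `local` as observed at each recursion depth. */
static uintptr_t seen[MAX_DEPTH];

static void probe(int depth)
{
    volatile int local = depth;   /* automatic variable: lives in this call's frame */

    seen[depth] = (uintptr_t)&local;
    if (depth + 1 < MAX_DEPTH)
        probe(depth + 1);
    local = 0;                    /* written after the call, so the frame stays alive */
}

/* Hypothetical helper: run the probe and report where `local` was at `depth`. */
uintptr_t local_address_at_depth(int depth)
{
    if (depth < 0 || depth >= MAX_DEPTH)
        return 0;
    probe(0);
    return seen[depth];
}

/* Hypothetical helper: print one address per depth. */
void show_local_addresses(void)
{
    probe(0);
    for (int i = 0; i < MAX_DEPTH; i++)
        printf("depth %d: local at 0x%" PRIxPTR "\n", i, seen[i]);
}
```

Calling show_local_addresses() from main() prints one line per depth. The exact values depend on the platform and on the run, which is the point: objdump can show the offset within the frame, not the final address.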

It helps to know the calling convention: x enters in r0, y in r1, z in r2. The first three stores give x its home at fp-16, y at fp-20, and z at fp-24. The call to one() needs x and y, so it loads those back from the stack. The result of one() goes into a, which is saved at fp-8, so that is the home for a. In the same way, the result of two() goes into b, whose home is fp-12.
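The offsets quoted above can also be read directly from the instruction words in the second column of the objdump listing. The following is a rough sketch, not a full disassembler: the struct and function names are invented for illustration, and it only handles the plain word-sized ARM ldr/str form with a 12-bit immediate offset and no writeback, which is the form used for the locals in this listing.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of an ARM (A32) word LDR/STR with a 12-bit immediate offset. */
struct mem_access {
    int is_load;   /* 1 = ldr, 0 = str */
    int rd;        /* register being loaded or stored */
    int rn;        /* base register: 11 = fp, 13 = sp */
    int offset;    /* signed byte offset from the base register */
};

/*
 * Returns 1 and fills *out if insn is a word-sized LDR/STR with an
 * immediate offset, pre-indexed, no writeback. Returns 0 for anything
 * else (push/pop forms, byte accesses, data processing, branches...).
 */
int decode_ldr_str_imm(uint32_t insn, struct mem_access *out)
{
    /* bits 27..25 = 010, P (bit 24) = 1, B (bit 22) = 0, W (bit 21) = 0 */
    if ((insn & 0x0f600000u) != 0x05000000u)
        return 0;

    int imm = (int)(insn & 0xfffu);        /* bits 11..0: offset magnitude */
    int up  = (int)((insn >> 23) & 1u);    /* U bit: 1 = add, 0 = subtract */

    out->is_load = (int)((insn >> 20) & 1u);
    out->rn      = (int)((insn >> 16) & 0xfu);
    out->rd      = (int)((insn >> 12) & 0xfu);
    out->offset  = up ? imm : -imm;
    return 1;
}

/* Hypothetical helper: print the decoded access in objdump-like form. */
void print_access(uint32_t insn)
{
    struct mem_access m;

    if (decode_ldr_str_imm(insn, &m))
        printf("%08x: %s r%d, [r%d, #%d]\n", (unsigned)insn,
               m.is_load ? "ldr" : "str", m.rd, m.rn, m.offset);
    else
        printf("%08x: not a plain ldr/str\n", (unsigned)insn);
}
```

For example, print_access(0xe50b0010) decodes the store at offset c in the listing as a str of r0 at r11 (fp) minus 16, which is the home of x.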

The function one() is not really at address 0. This is a disassembly of an object file, not a linked binary. Once an object is linked with the rest of the objects and libraries, the missing parts, like where external functions live, are patched in by the linker, and the calls to one() and two() get real addresses (and the program will likely not start at address 0).

I cheated here a little. I knew that with no optimizations enabled the compiler would build a stack frame, even though for a relatively simple function like this there really is no reason for one:

compile with just a little optimization

arm-none-eabi-gcc -O1 -c fun.c -o fun.o
arm-none-eabi-objdump -D fun.o

and the stack frame is gone, the local variables remain in registers.

00000000 <myfun>:
   0:   e92d4038    push    {r3, r4, r5, lr}
   4:   e1a05002    mov r5, r2
   8:   ebfffffe    bl  0 <one>
   c:   e1a04000    mov r4, r0
  10:   e1a01005    mov r1, r5
  14:   ebfffffe    bl  0 <two>
  18:   e0800004    add r0, r0, r4
  1c:   e8bd4038    pop {r3, r4, r5, lr}
  20:   e12fff1e    bx  lr

what the compiler decided to do instead is give itself more registers to work with by saving them on the stack. Why it saved r3 is a mystery, but that is another topic...

Entering the function, r0 = x, r1 = y and r2 = z per the calling convention. We can leave r0 and r1 alone (try again with one(y,x) and see what happens) since they drop right into one() and are never used again. The calling convention says that r0-r3 can be destroyed by a function, so z has to be preserved for later, and the compiler saves it in r5 (moved there from r2 before the call to one()).

The result of one() comes back in r0 per the calling convention. Since two() can also destroy r0-r3, a has to be saved for use after the call to two(), and r0 is needed for the call to two() anyway, so r4 now holds a. The result of one() is the first parameter to two() and it is already in r0. z is the second, so r5, where z was saved, is moved to r1, and then two() is called. The result of two() comes back in r0 per the calling convention. Since b + a = a + b, the final add before returning is r0 + r4, which is b + a, and the result goes in r0, the register used to return a value from a function per the convention. Clean up the stack and restore the modified registers, done.

myfun() calls other functions using bl, and bl modifies the link register (r14). To be able to return from myfun(), the value in the link register has to be preserved from the entry into the function to the final return (bx lr), so lr is pushed on the stack. The convention states that we can destroy r0-r3 in our function but not other registers, so r4 and r5 are pushed on the stack because we used them. Pushing r3 is not necessary from a register-preservation perspective, and there is no reason to preserve r3 in this code. I wondered if it was done in anticipation of a 64 bit memory system, where two full 64 bit writes are cheaper than one 64 bit write and one 32 bit write, but you would need to know the alignment of the stack going in, so that is just a theory. (The usual explanation is related: the ARM procedure call standard wants sp 8-byte aligned at public interfaces, and pushing a fourth register keeps the 12 bytes of r4, r5 and lr from breaking that.)

Now take this knowledge, disassemble the code you were assigned (arm-...-objdump -D something.something) and do the same kind of analysis. Compare functions named main() with functions not named main() in particular (I avoided main() on purpose): the stack frame for main() can be a size that doesn't make sense, or makes less sense than for other functions.

In the non-optimized case above the frame holds five things, x, y, z, a and b, which is 5*4 = 20 bytes. The compiler reserved 24 with sub sp, sp, #24, leaving the word at fp-28 unused, most likely as padding to keep the stack 8-byte aligned. fp and lr are saved separately by the push, on top of those 24 bytes.

There is also a command line argument to tell the compiler not to use a frame pointer, -fomit-frame-pointer. The locals are then addressed relative to sp, and it saves a couple of instructions:

00000000 <myfun>:
   0:   e52de004    push    {lr}        ; (str lr, [sp, #-4]!)
   4:   e24dd01c    sub sp, sp, #28
   8:   e58d000c    str r0, [sp, #12]
   c:   e58d1008    str r1, [sp, #8]
  10:   e58d2004    str r2, [sp, #4]
  14:   e59d000c    ldr r0, [sp, #12]
  18:   e59d1008    ldr r1, [sp, #8]
  1c:   ebfffffe    bl  0 <one>
  20:   e58d0014    str r0, [sp, #20]
  24:   e59d0014    ldr r0, [sp, #20]
  28:   e59d1004    ldr r1, [sp, #4]
  2c:   ebfffffe    bl  0 <two>
  30:   e58d0010    str r0, [sp, #16]
  34:   e59d2014    ldr r2, [sp, #20]
  38:   e59d3010    ldr r3, [sp, #16]
  3c:   e0823003    add r3, r2, r3
  40:   e1a00003    mov r0, r3
  44:   e28dd01c    add sp, sp, #28
  48:   e49de004    pop {lr}        ; (ldr lr, [sp], #4)
  4c:   e12fff1e    bx  lr
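As a quick cross-check of the prologues shown in this answer, the sketch below adds up how far each one moves sp: 4 bytes per pushed register plus the amount in the sub sp, sp, #N instruction. The function names are made up for illustration, and the 8-byte alignment check assumes the toolchain follows the ARM procedure call standard, which is an assumption about this particular build rather than something the listings prove.

```c
#include <stdio.h>

/* Bytes taken from the stack by a prologue: pushed registers plus "sub sp, sp, #N". */
unsigned frame_bytes(unsigned pushed_regs, unsigned sub_bytes)
{
    return pushed_regs * 4u + sub_bytes;
}

/* Nonzero if the total keeps sp on an 8-byte boundary. */
int is_8_byte_aligned(unsigned bytes)
{
    return bytes % 8u == 0;
}

/* Hypothetical helper: summarize the three prologues from the listings. */
void print_frame_summary(void)
{
    struct { const char *name; unsigned regs; unsigned sub; } p[] = {
        { "-O0, frame pointer:    push {fp, lr}; sub #24", 2, 24 },
        { "-O0, no frame pointer: push {lr};     sub #28", 1, 28 },
        { "-O1:                   push {r3, r4, r5, lr} ", 4, 0  },
    };

    for (unsigned i = 0; i < sizeof p / sizeof p[0]; i++) {
        unsigned total = frame_bytes(p[i].regs, p[i].sub);
        printf("%s -> %u bytes, 8-byte aligned: %s\n", p[i].name, total,
               is_8_byte_aligned(total) ? "yes" : "no");
    }
}
```

Both unoptimized variants come out at 32 bytes in total and the optimized one at 16, so dropping the frame pointer saves instructions here rather than stack space.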

optimizing saves a whole lot more though...

answered Oct 03 '22 05:10 by old_timer


It depends on the program and on what exactly they mean by the location of the variables. Does the question want the section they're stored in (.rodata, .bss, etc.)? Does it want specific addresses? Either way, a good start is objdump's -S flag:

objdump -S myprogram > dump.txt

This is nice because it prints your source code intermixed with the disassembly and addresses (the binary needs to have been compiled with -g for the source lines to show up). From there, search for your int main and that should get you started.
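If the question turns out to be about sections, a small test file can help tell the cases apart. This is only a sketch with invented names, and the section placement in the comments is what GCC typically does rather than a guarantee. Variables with static storage get a fixed slot in the binary and show up in the symbol table (objdump -t), while automatic locals only exist as stack offsets or registers in the disassembly.

```c
#include <stdint.h>

int initialized_global = 42;      /* static storage, typically .data   */
int zeroed_global;                /* static storage, typically .bss    */
const int read_only_global = 7;   /* static storage, typically .rodata */

/* Returns the address of a static local: the same location on every call. */
uintptr_t static_local_address(void)
{
    static int calls;             /* one fixed slot, typically in .bss */

    calls++;
    return (uintptr_t)&calls;
}

/* Uses an automatic local: a stack slot or a register, with no symbol of its own. */
int sum_with_auto_local(int x)
{
    int tmp = x + initialized_global;

    return tmp + read_only_global;
}
```

After compiling this, the three globals and the static local (usually under a mangled name based on calls) should be listed by objdump -t, while tmp should not appear there at all.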

answered Oct 03 '22 04:10 by ThePosey