In C, let's say you have a variable called variable_name. Let's say it's located at 0xaaaaaaaa, and at that memory address, you have the integer 123. So in other words, variable_name contains 123.
I'm looking for clarification around the phrasing "variable_name is located at 0xaaaaaaaa". How does the compiler recognize that the string "variable_name" is associated with that particular memory address? Is the string "variable_name" stored somewhere in memory? Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Static variables are stored in the data segment of memory, which is part of a program's virtual address space. Static variables that have no explicit initializer, or that are initialized to zero, go in the uninitialized data segment (also known as the BSS segment).
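Here is a minimal C sketch of what static storage means in practice. The function names (`next_id`, `fresh_local`, `show_static_vs_local`) are made up for illustration. A static variable with no initializer starts at zero and keeps its value between calls, while an ordinary local is created afresh each time. Exactly where the toolchain places `counter` (typically BSS) depends on the platform.

```c
#include <stdio.h>

/* Hypothetical helper: `counter` has static storage duration, so it is
 * zero-initialized once and keeps its value between calls. */
int next_id(void)
{
    static int counter;   /* no initializer: starts at 0 */
    return ++counter;
}

/* Hypothetical helper: `counter` is an automatic variable, created afresh
 * on every call. */
int fresh_local(void)
{
    int counter = 0;
    return ++counter;
}

/* Call each helper twice and print what came back. */
void show_static_vs_local(void)
{
    int a = next_id();
    int b = next_id();
    int c = fresh_local();
    int d = fresh_local();
    printf("static: %d %d, local: %d %d\n", a, b, c, d);
}
```

Successive calls to `next_id` return 1, 2, 3 and so on, whereas `fresh_local` returns 1 every time.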
How is a variable stored in computer memory? Take int x = 15; Typically the computer allocates a 4-byte chunk of RAM, stores the value 15 there as 0's and 1's, and x refers to the address of those 4 bytes.
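To make the "chunk of bytes" idea concrete, here is a small sketch that copies out the raw bytes of an int. The helper names (`copy_int_bytes`, `show_int_storage`) are invented for this example. Note that the size of int is implementation-defined (4 bytes is just the common case), and the byte order you see depends on the machine's endianness.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the raw bytes of `value` into `buf`.
 * Returns the number of bytes copied, or 0 if `buf` is too small. */
size_t copy_int_bytes(int value, unsigned char *buf, size_t buf_len)
{
    if (buf_len < sizeof value)
        return 0;
    memcpy(buf, &value, sizeof value);
    return sizeof value;
}

/* Hypothetical helper: print where this function's copy of `value`
 * lives, how many bytes it occupies, and what those bytes contain. */
void show_int_storage(int value)
{
    unsigned char bytes[sizeof(int)];
    size_t n = copy_int_bytes(value, bytes, sizeof bytes);

    printf("address: %p, size: %zu bytes, raw bytes:", (void *)&value, n);
    for (size_t i = 0; i < n; i++)
        printf(" %02x", bytes[i]);
    printf("\n");
}
```

Calling `show_int_storage(15)` from `main` prints one line; the address will vary from run to run and platform to platform.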
I want to know if variables are stored in RAM or on the hard disk? Variables are usually stored in RAM. Global variables are stored in the data or BSS segment (not on the heap, which holds dynamically allocated memory), and variables declared in functions/methods are stored on the stack.
Typically local variables are put on the "stack". This means that the compiler assigns each local a fixed offset from the "stack pointer", whose value can differ from one invocation of the function to the next. I.e. the compiler assumes that memory locations like Stack-Pointer+4, Stack-Pointer+8, etc.
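A rough way to see this is to let a function call itself and record where its local ends up each time. This is only a sketch: the function names are invented, and the actual addresses (and the direction the stack grows) are platform-specific. The local is marked volatile and read after the recursive call so that each invocation keeps its own live copy.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: recurse until `index` reaches `depth - 1`, storing
 * the address of each invocation's `local` in out[index].
 * Returns the sum of all the locals, i.e. 0 + 1 + ... + (depth - 1). */
int record_local_addresses(int depth, int index, uintptr_t *out)
{
    volatile int local = index;   /* one copy per active invocation */
    int below = 0;

    out[index] = (uintptr_t)&local;
    if (index + 1 < depth)
        below = record_local_addresses(depth, index + 1, out);

    /* Reading `local` after the call keeps this copy alive across it. */
    return below + local;
}

/* Print the address used by the same local in three nested invocations. */
void show_local_addresses(void)
{
    uintptr_t addrs[3];

    record_local_addresses(3, 0, addrs);
    for (int i = 0; i < 3; i++)
        printf("invocation %d: local at %p\n", i, (void *)addrs[i]);
}
```

The same source-level name `local` ends up at a different address in each nested invocation, because every invocation gets its own region of the stack.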
A C compiler first creates a symbol table, which stores the relationship between the variable name and where it's located in memory. When compiling, it uses this table to replace all instances of the variable with a specific memory location, as others have stated. You can find a lot more on it on the Wikipedia page.
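As a toy illustration of the idea (not how any real compiler implements it), a symbol table can be sketched as a list of name-to-offset entries. Everything here is hypothetical: the struct layout, the `symtab_define` and `symtab_lookup` names, and the fixed capacity. It also ignores alignment and scoping, which a real compiler has to handle.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 16

/* One entry: a source-level name and the storage assigned to it. */
struct symbol {
    const char *name;
    size_t offset;   /* position within a hypothetical data area */
    size_t size;     /* bytes reserved for the variable */
};

struct symbol_table {
    struct symbol entries[MAX_SYMBOLS];
    size_t count;
    size_t next_offset;   /* first byte not yet assigned */
};

/* Find `name`. On success store its offset in *offset_out and return 0;
 * otherwise return -1. */
int symtab_lookup(const struct symbol_table *t, const char *name,
                  size_t *offset_out)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            *offset_out = t->entries[i].offset;
            return 0;
        }
    }
    return -1;
}

/* Reserve `size` bytes for a new name. Returns 0 on success, or -1 if the
 * table is full or the name is already defined. */
int symtab_define(struct symbol_table *t, const char *name, size_t size)
{
    size_t ignored;

    if (t->count == MAX_SYMBOLS || symtab_lookup(t, name, &ignored) == 0)
        return -1;

    t->entries[t->count].name = name;
    t->entries[t->count].offset = t->next_offset;
    t->entries[t->count].size = size;
    t->count++;
    t->next_offset += size;
    return 0;
}
```

With a zeroed table, defining "x" and then "y" as ints gives them offsets 0 and `sizeof(int)`. The strings only live in this table, which belongs to the compiler; the generated code would only ever see the offsets.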
Variable names don't exist anymore after the compiler runs (barring special cases like exported globals in shared libraries or debug symbols). The entire act of compilation is intended to take those symbolic names and algorithms represented by your source code and turn them into native machine instructions. So yes, if you have a global variable_name, and the compiler and linker decide to put it at 0xaaaaaaaa, then wherever it is used in the code, it will just be accessed via that address.
So to answer your literal questions:
How does the compiler recognize that the string "variable_name" is associated with that particular memory address?
The toolchain (compiler & linker) works together to assign a memory location for the variable. It's the compiler's job to keep track of all the references, and the linker puts in the right addresses later.
Is the string "variable_name" stored somewhere in memory?
Only while the compiler is running.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Yes, that's pretty much what happens, except it's a two-stage job with the linker. And yes, it uses memory, but it's the compiler's memory, not anything at runtime for your program.
An example might help you understand. Let's try out this program:

    int x = 12;

    int main(void)
    {
        return x;
    }

Pretty straightforward, right? OK. Let's take this program, and compile it and look at the disassembly:

    $ cc -Wall -Werror -Wextra -O3 example.c -o example
    $ otool -tV example
    example:
    (__TEXT,__text) section
    _main:
    0000000100000f60    pushq   %rbp
    0000000100000f61    movq    %rsp,%rbp
    0000000100000f64    movl    0x00000096(%rip),%eax
    0000000100000f6a    popq    %rbp
    0000000100000f6b    ret

See that movl line? It's grabbing the global variable (in an instruction-pointer relative way, in this case). No more mention of x.
Now let's make it a bit more complicated and add a local variable:

    int x = 12;

    int main(void)
    {
        volatile int y = 4;
        return x + y;
    }

The disassembly for this program is:

    (__TEXT,__text) section
    _main:
    0000000100000f60    pushq   %rbp
    0000000100000f61    movq    %rsp,%rbp
    0000000100000f64    movl    $0x00000004,0xfc(%rbp)
    0000000100000f6b    movl    0x0000008f(%rip),%eax
    0000000100000f71    addl    0xfc(%rbp),%eax
    0000000100000f74    popq    %rbp
    0000000100000f75    ret

Now there are two movl instructions and an addl instruction. You can see that the first movl is initializing y, which it's decided will be on the stack (base pointer - 4; the displacement shows up as 0xfc because that is -4 as a signed byte). Then the next movl gets the global x into a register eax, and the addl adds y to that value. But as you can see, the literal x and y strings don't exist anymore. They were conveniences for you, the programmer, but the computer certainly doesn't care about them at execution time.