
How are the names/memory addresses of variables represented at the bit level?

This may be a silly question, but I'm fairly new to programming so bear with me.

Let's say for the sake of argument I'm talking about coding in C...

I understand that (system dependent) an int takes up 4 bytes, or 32 bits of memory.

However, there are two things here that I find confusing. This piece of memory has a particular memory address associated with it (which, let's say, is also 32 bits), and if I store this int in a variable, then it also has a name associated with it.

e.g. int myInt = 5;

My question is - how and where are the memory address and name of the variable represented at bit level? When compiling the code, does the compiler basically say: "Ok myInt refers to address 0xffffff" and essentially substitute in the memory address in the machine code? Even if this was the case, I'm still confused as to how the memory address itself is represented...

I hope where my confusion lies is clear enough!

Daniel Graef asked Jan 31 '12 16:01



1 Answer

Theoretically, the answer is "it's implementation dependent". Each C compiler decides the best approach to take -- as long as it exhibits the correct behavior, it doesn't really matter how you got there.

Practically speaking, you probably won't see a reference to variable names in a compiled program. That's because the processor doesn't need to know or care about the variable names you've chosen in order to execute your program. It operates on a much lower level: it simply needs to understand which instructions to execute and in which order. Therefore, including this information is a superfluous extravagance that would simply inflate the size of your program.

There are three basic cases, which I'll enumerate below. I'll also describe what a typical compiler would probably do in each case, although remember that this is implementation-dependent: compilers are allowed to do whatever they want as long as the resulting program exhibits the behavior the standard requires.

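  • For local variables, as part of the compilation process, the compiler replaces each use of the variable's name (e.g. "foo", declared with "int foo;") with a reference to the storage allocated for that variable. Conceptually that is something like "memory address 0x482c"; in practice it is usually a register or an offset from the stack pointer, since the absolute address of a local is only known at runtime. Assignments or references to this value can then just refer to that location.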

  • For non-local variables, the compiler performs additional resolution (typically with help from the linker) to determine where the variable lives, but it's usually still not storing the name of the variable or function directly in the executable code.

  • For things that aren't variables at all (structs, methods, etc.), the compiler performs different resolution steps and optimizations based on decisions it's made and parameters that have been set. But here, too, it doesn't store the name of the variable or function; it's just replacing it with the appropriate references.
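To make the "a name is just a location" idea concrete, here is a minimal C sketch. The names my_global, read_through_address and global_address are made up for illustration, and the code assumes a platform that provides uintptr_t (an optional type in the C standard, though mainstream compilers have it). It shows that an address is itself just an unsigned integer of pointer width, and that reading through that number gives back the variable's value.

```c
#include <assert.h>
#include <stdint.h>

int my_global = 5;   /* static storage: one fixed location for the whole run */

/* Reads a local variable back through its address, treated as a plain number. */
int read_through_address(void) {
    int my_int = 5;                          /* local: typically a stack slot or a register */
    int *p = &my_int;                        /* the address of my_int */
    uintptr_t bits = (uintptr_t)(void *)p;   /* the same address as an unsigned integer */
    int *q = (int *)(void *)bits;            /* convert the number back into a pointer */
    assert(q == p);                          /* the round trip preserves the address */
    return *q;                               /* reads my_int through the address */
}

/* Returns the numeric address of the global; it does not change within one run. */
uintptr_t global_address(void) {
    return (uintptr_t)(void *)&my_global;
}
```

Nothing in the generated instructions needs the strings "my_int" or "my_global"; the actual numeric addresses depend on the compiler, the operating system, and often the individual run.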

There are cases where it's useful to include variable names in the compiler's output, like when you're debugging. In those cases, a debugging symbol table is created. This is a way to map addresses to names that are meaningful for humans, so that when something crashes, you can see "Stacktrace: function foo() ..." instead of "Stacktrace: Memory address 0x4572 ...".
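As a rough illustration of what such a table does, here is a hand-rolled sketch in C. It is not the format real debuggers use (those are defined by formats such as DWARF and are generated by the compiler); the struct symbol type, the symbol_table array and the name_for_address function are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

static int foo = 1;
static int bar = 2;

/* One entry of a toy symbol table: an address and a human-readable name. */
struct symbol {
    const void *address;
    const char *name;
};

/* The mapping a debugger needs; the program itself never consults it. */
static const struct symbol symbol_table[] = {
    { &foo, "foo" },
    { &bar, "bar" },
};

/* Looks up the name recorded for an address, or "<unknown>" if none is. */
const char *name_for_address(const void *address) {
    assert(address != NULL);
    size_t count = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t i = 0; i < count; i++) {
        if (symbol_table[i].address == address) {
            return symbol_table[i].name;
        }
    }
    return "<unknown>";
}
```

The point of the sketch is that the names live in a separate lookup structure next to the program, which is why stripping debug information does not change what the program does.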

In languages with metaprogramming capabilities, like C#, Ruby, or Java, the names often become part of the metadata that's stored along with the object, rather than just replacing each reference with a lookup on a memory address. That metadata comes in handy when you want to do certain neat tricks that are nigh-impossible in C at runtime, like asking "which objects in this collection have a field named foo?".

John Feminella answered Sep 21 '22 16:09