From The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie:
The & operator only applies to objects in memory: variables and array elements. It cannot be applied to expressions, constants, or register variables.
Where are expressions and constants stored if not in memory? What does that quote mean?
E.g.: &(2 + 3)
Why can't we take its address? Where is it stored?
Will the answer be the same for C++, since C is its parent?
This linked question explains that such expressions are rvalue objects and all rvalue objects do not have addresses.
My question is where are these expressions stored such that their addresses can't be retrieved?
Constants are stored in various places. They may be stored in a designated section of memory that is marked read-only for the program. On general-purpose systems this is not ROM, which is physically read-only memory.
A local 'const' variable is typically stored on the stack, like any other automatic variable. 'const' is a type qualifier in C, not a storage directive. The compiler reports an error when it comes across a statement that directly modifies a 'const' variable.
In a typical memory layout of a C program, 'const' variables with static storage duration are placed in a read-only data section or in the initialized data segment in RAM. On some microcontrollers, 'const' variables are stored in flash memory instead.
Global variables are stored in the data section. Unlike the stack, the data region does not grow or shrink — storage space for globals persists for the entire run of the program. Finally, the heap portion of memory is the part of a program's address space associated with dynamic memory allocation.
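As a rough illustration of the storage regions mentioned above, the following sketch prints the address of one object of each kind. The names global_counter, global_limit and show_storage are made up for this example, and the addresses printed depend entirely on the platform, the toolchain and the particular run, so no specific output should be expected.

```c
#include <stdio.h>
#include <stdlib.h>

int global_counter = 42;        /* static storage: usually the data segment */
const int global_limit = 100;   /* static storage: often a read-only section */

/* Prints where each kind of object happens to live on this run and
   returns the sum of the four values (or -1 if malloc fails). */
int show_storage(void)
{
    int local = 7;                           /* automatic storage: usually the stack */
    int *dynamic = malloc(sizeof *dynamic);  /* allocated storage: the heap */
    if (dynamic == NULL)
        return -1;
    *dynamic = 9;

    printf("global  %p\n", (void *)&global_counter);
    printf("const   %p\n", (void *)&global_limit);
    printf("local   %p\n", (void *)&local);
    printf("heap    %p\n", (void *)dynamic);

    int sum = global_counter + global_limit + local + *dynamic;
    free(dynamic);
    return sum;
}
```

Calling show_storage() from main is enough to try it. All four things printed are objects, which is why & (or the pointer returned by malloc) gives a usable address for each of them.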
Consider the following function:
    unsigned sum_evens (unsigned number) {
        number &= ~1; // ~1 = 0xfffffffe (32-bit CPU)
        unsigned result = 0;
        while (number) {
            result += number;
            number -= 2;
        }
        return result;
    }
Now, let's play the compiler game and try to compile this by hand. I'm going to assume you're using x86 because that's what most desktop computers use. (x86 is the instruction set for Intel compatible CPUs.)
Let's go through a simple (unoptimized) version of what this routine could look like when compiled:
    sum_evens:
        and edi, 0xfffffffe ;edi is where the first argument goes
        xor eax, eax        ;set register eax to 0
        cmp edi, 0          ;compare number to 0
        jz .done            ;if edi = 0, jump to .done
    .loop:
        add eax, edi        ;eax = eax + edi
        sub edi, 2          ;edi = edi - 2
        jnz .loop           ;if edi != 0, go back to .loop
    .done:
        ret                 ;return (value in eax is returned to caller)
Now, as you can see, the constants in the code (0, 2, 1) actually show up as part of the CPU instructions! In fact, 1 doesn't show up at all; the compiler (in this case, just me) already calculates ~1 and uses the result in the code.
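The point that the compiler computes ~1 itself can be sketched in C. The names MASK_IS_FOLDED and clear_low_bit are hypothetical, and the sketch uses ~1u (unsigned) rather than the ~1 from the function above, so that the comparison holds for any width of unsigned int. An enumeration constant must be an integer constant expression, so the comparison is necessarily evaluated at compile time.

```c
#include <limits.h>

/* Evaluated entirely by the compiler: neither 1 nor ~1u needs to exist
   as an object anywhere in the running program. */
enum { MASK_IS_FOLDED = (~1u == UINT_MAX - 1u) };

/* Clears the lowest bit. A compiler would normally encode the
   precomputed mask directly in the instruction, as in the assembly above. */
unsigned clear_low_bit(unsigned number)
{
    return number & ~1u;
}
```

Since the mask only ever exists as a value the compiler worked out, there is nothing in the program for & to point at.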
While you can take the address of a CPU instruction, it often makes no sense to take the address of a part of it (on x86 you sometimes can, but on many other CPUs you simply cannot do this at all). Code addresses are also fundamentally different from data addresses, which is why you cannot portably treat a function pointer (a code address) as a regular pointer (a data address). On some CPU architectures, code addresses and data addresses are completely incompatible, although this is not the case for x86 in the way most modern OSes use it.
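To make the code-address versus data-address distinction concrete, here is a small sketch that keeps a function pointer in its own pointer type. The typedef unary_fn and the helper apply are made up for this example. ISO C does not guarantee that a function pointer can be converted to void * and back, although many platforms allow it as an extension, so the sketch never attempts that conversion.

```c
/* Same routine as in the answer above, written with ~1u. */
unsigned sum_evens(unsigned number)
{
    number &= ~1u;
    unsigned result = 0;
    while (number) {
        result += number;
        number -= 2;
    }
    return result;
}

/* A pointer to code: it can be called, but it is not a pointer to data. */
typedef unsigned (*unary_fn)(unsigned);

/* Calls whatever function fn points at. */
unsigned apply(unary_fn fn, unsigned value)
{
    return fn(value);
}
```

For instance, apply(sum_evens, 10u) calls the routine through its code address and returns 10 + 8 + 6 + 4 + 2 = 30.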
Do notice that while (number) is equivalent to while (number != 0). That 0 doesn't show up in the compiled code at all! It's implied by the jnz instruction (jump if not zero). This is another reason why you cannot take the address of that 0: it doesn't have one, it's literally nowhere.
I hope this makes it clearer for you.
where are these expressions stored such that their addresses can't be retrieved?
Your question is not well-formed.
It's like asking why people can discuss ownership of nouns but not verbs. Nouns refer to things that may (potentially) be owned, and verbs refer to actions that are performed. You can't own an action or perform a thing.
Expressions are not stored in the first place; they are evaluated. They may be evaluated by the compiler at compile time, or by the processor at run time.
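A short sketch of the difference between a value and an object, using the 2 + 3 from the question. The function name address_of_result is made up for this example. The commented-out line is the one a C compiler rejects. The compound literal on a later line is a C99 feature that explicitly creates an unnamed object, which is why its address can be taken; it is not part of standard C++.

```c
/* int *bad = &(2 + 3);   does not compile: 2 + 3 is a value, not an object */

int address_of_result(void)
{
    int sum = 2 + 3;          /* the value is stored in an object named sum */
    int *p = &sum;            /* sum is an lvalue, so & is allowed */

    int *q = &(int){2 + 3};   /* C99 compound literal: an unnamed object */

    return *p + *q;           /* 5 + 5 */
}
```

In both working cases the address belongs to an object that holds the result, never to the expression itself.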
Consider the statement
int a = 0;
This does two things: first, it declares an integer variable a. This is defined to be something whose address you can take. It's up to the compiler to do whatever makes sense on a given platform, to allow you to take the address of a.
Secondly, it sets that variable's value to zero. This does not mean an integer with value zero exists somewhere in your compiled program. It might commonly be implemented as
xor eax,eax
which is to say, XOR (exclusive-or) the eax register with itself. This always results in zero, whatever was there before. However, there is no fixed object of value 0 in the compiled code to match the integer literal 0 you wrote in the source.
As an aside, when I say that a above is something whose address you can take, it's worth pointing out that it may not really have an address unless you take it. For example, the eax register used in that example doesn't have an address. If the compiler can prove the program is still correct, a can live its whole life in that register and never exist in main memory. Conversely, if you use the expression &a somewhere, the compiler will take care to create some addressable space to store a's value in.
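The register case from the quoted book passage can be sketched the same way. The function name read_through_pointer and the variable fast are made up for this example. Whether a really ends up in a register or in memory is the compiler's decision, so the comments describe the language rules rather than any particular generated code.

```c
int read_through_pointer(void)
{
    register int fast = 10;   /* only a hint, but & may not be applied to it */
    /* int *bad = &fast;         rejected by a C compiler */

    int a = 0;                /* could live purely in a register if &a were never used */
    int *pa = &a;             /* taking the address means a must be addressable */
    *pa = fast + 5;           /* write to a through the pointer */

    return a;                 /* 10 + 5 */
}
```

The function returns 15: the write through pa is visible in a, which is only possible because a is an object with an address.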
Note for comparison that I can easily choose a different language where I can take the address of an expression.
It'll probably be interpreted, because compilation usually discards these structures once the machine-executable output replaces them. For example, Python has runtime introspection and code objects.
Or I can start from LISP and extend it to provide some kind of addressof operation on S-expressions.
The key thing they both have in common is that they are not C, which as a matter of design and definition does not provide those mechanisms.