
Where is const data stored?

Tags:

c++

For example:

In the file demo.c,

#include <stdio.h>
int a = 5;
int main(){
  int b=5;
  int c=a;
  printf("%d", b+c);
  return 0;
}

For int a = 5, does the compiler translate this into something like "store 0x5 at a virtual memory address, for example 0x0000000f, in the const area", so that int c = a is translated to something like movl 0x0000000f, %eax?

Then, for int b = 5, the number 5 is not put into the const area but is translated directly into an immediate operand in the assembly instruction, like mov $0x5, %ebx.

Gab是好人 asked Oct 30 '22 05:10


2 Answers

It depends. Your program has several constants:

int a = 5;

This is a "static" initialization: the value is part of the program image and is already in place when the program's text and data are loaded, before main() runs. The value is stored in the memory reserved for a, which lives in a read-write data "program section". If something later changes a, the value 5 is lost.

int b=5;

This is a local variable whose scope is limited to main(). Its storage could well be a CPU register or a location on the stack. On most architectures, the generated instruction carries the value 5 as "immediate data"; for example, on x86:

mov   eax, 5

The ability of instructions to hold arbitrary constants is limited. Most CPU instructions support small constants, but "large" constants are often not directly supported. In that case the compiler stores the constant in memory and loads it from there. For example,

       .psect  rodata
k1     dd      3141592653
       .psect  code
       mov     eax, k1

The ARM family has a powerful design for loading most constants directly: any 8-bit constant rotated right by an even number of bit positions can be encoded as an immediate operand. See this page 2-25.

One not-as-obvious but totally different item is in the statement:

printf("%d", b+c);

The string "%d" is, by modern C semantics, a constant array of three char ('%', 'd', and the terminating null). Most modern implementations store it in read-only memory, so an attempt to change it triggers a segmentation fault (SEGFAULT): a memory-protection error detected by the hardware and reported by the operating system, which usually aborts the program immediately.

       .psect  rodata
s1     db      '%', 'd', 0
       .psect  code
       mov     eax, s1
       push    eax

wallyk answered Nov 17 '22 00:11


In OP's program, a is an initialized global. I expect it to be placed in the initialized part of the data segment. See https://en.wikipedia.org/wiki/File:Program_memory_layout.pdf and http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.gif (from "more info on Memory layout of an executable program (process)"). The location of a is decided by the compiler and linker together.

On the other hand, b and c are automatic (stack) variables, so they are expected to be in the stack segment.

That said, the compiler/linker is free to perform any optimization as long as the observable behavior is unchanged (see "What exactly is the "as-if" rule?"). For example, if a is never referenced, it may be optimized out completely.

Arun answered Nov 16 '22 23:11