Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

help understanding differences between #define, const and enum in C and C++ on assembly level

Tags:

c

x86

gcc

assembly

recently, i am looking into assembly codes for #define, const and enum:

C codes(#define):

3   #define pi 3  
4   int main(void)
5   {
6      int a,r=1;             
7      a=2*pi*r;
8      return 0;
9   }

assembly codes(for line 6 and 7 in c codes) generated by GCC:

6   mov $0x1, -0x4(%ebp)
7   mov -0x4(%ebp), %edx
7   mov %edx, %eax
7   add %eax, %eax
7   add %edx, %eax
7   add %eax, %eax
7   mov %eax, -0x8(%ebp)

C codes(enum):

2   int main(void)
3   {
4      int a,r=1;
5      enum{pi=3};
6      a=2*pi*r;
7      return 0;
8   }

assembly codes(for line 4 and 6 in c codes) generated by GCC:

6   mov $0x1, -0x4(%ebp)
7   mov -0x4(%ebp), %edx
7   mov %edx, %eax
7   add %eax, %eax
7   add %edx, %eax
7   add %eax, %eax
7   mov %eax, -0x8(%ebp)

C codes(const):

4   int main(void)
5   {
6      int a,r=1;  
7      const int pi=3;           
8      a=2*pi*r;
9      return 0;
10  }

assembly codes(for line 7 and 8 in c codes) generated by GCC:

6   movl $0x3, -0x8(%ebp)
7   movl $0x3, -0x4(%ebp)
8   mov  -0x4(%ebp), %eax
8   add  %eax, %eax
8   imul -0x8(%ebp), %eax
8   mov  %eax, 0xc(%ebp)

i found that use #define and enum, the assembly codes are the same. The compiler use 3 add instructions to perform multiplication. However, when use const, imul instruction is used. Anyone knows the reason behind that?

like image 207
martin Avatar asked Mar 08 '10 10:03

martin


1 Answers

The difference is that with #define or enum the value 3 doesn't need to exist as an explicit value in the code and so the compiler has decided to use two add instructions rather than allocating space for the constant 3. The add reg,reg instruction is 2 bytes per instruction, so thats 6 bytes of instructions and 0 bytes for constants to multiply by 3, that's smaller code than imul plus a 4 byte constant. Plus the way the add instructions are used, it works out to a pretty literal translation of *2 *3, so this may not be a size optimization, it may be the default compiler output whenever you multiply by 2 or by 3. (add is usually a faster instruction than multiply).

#define and enum don't declare an instance, they only provide a way to give a symbolic name to the value 3, so the compiler has the option of making smaller code.

  mov $0x1, -0x4(%ebp)    ; r=1
  mov -0x4(%ebp), %edx    ; edx = r
  mov %edx, %eax          ; eax = edx
  add %eax, %eax          ; *2
  add %edx, %eax          ; 
  add %eax, %eax          ; *3
  mov %eax, -0x8(%ebp)    ; a = eax

But when you declare const int pi = 3, you tell the compiler to allocate space for an integer value and initialize it with 3. That uses 4 bytes, but the constant is now available to use as an operand for the imul instruction.

 movl $0x3, -0x8(%ebp)     ; pi = 3
 movl $0x3, -0x4(%ebp)     ; r = 3? (typo?)
 mov  -0x4(%ebp), %eax     ; eax = r
 add  %eax, %eax           ; *2
 imul -0x8(%ebp), %eax     ; *pi
 mov  %eax, 0xc(%ebp)      ; a = eax

By the way, this is clearly not optimized code. Because the value a is never used, so if optimization were turned on, the compiler would just execute

xor eax, eax  ; return 0

In all 3 cases.

Addendum:

I tried this with MSVC and in debug mode I get the same output for all 3 cases, MSVC always uses imul by a literal 6. Even in case 3 when it creates the const int = 3 it doesn't actually reference it in the imul.

I don't think this test really tells you anything about const vs define vs enum because this is non-optimized code.

like image 178
John Knoeller Avatar answered Sep 25 '22 23:09

John Knoeller