I guessed it but was still surprised to see that the output of these two programs, written in C and C++, when compiled were very different. That makes me think that the concept of objects still exist at even the lowest level. Does this add overhead? If so is it currently an impossible optimization to convert object oriented code to procedural style or just very hard to do?
#include <stdio.h>
int main(void) {
printf("Hello World!\n");
return 0;
}
#include <iostream>
int main() {
std::cout << "Hello World!" << std::endl;
return 0;
}
Compiled like this:
gcc helloworld.cpp -o hwcpp.S -S -O2
gcc helloworld.c -o hwc.S -S -O2
Produced this code:
.file "helloworld.c"
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Hello World!\n"
.text
.p2align 4,,15
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
subl $16, %esp
movl $.LC0, 4(%esp)
movl $1, (%esp)
call __printf_chk
xorl %eax, %eax
leave
ret
.size main, .-main
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
.file "helloworld.cpp"
.text
.p2align 4,,15
.type _GLOBAL__I_main, @function
_GLOBAL__I_main:
.LFB1007:
.cfi_startproc
.cfi_personality 0x0,__gxx_personality_v0
pushl %ebp
.cfi_def_cfa_offset 8
movl %esp, %ebp
.cfi_offset 5, -8
.cfi_def_cfa_register 5
subl $24, %esp
movl $_ZStL8__ioinit, (%esp)
call _ZNSt8ios_base4InitC1Ev
movl $__dso_handle, 8(%esp)
movl $_ZStL8__ioinit, 4(%esp)
movl $_ZNSt8ios_base4InitD1Ev, (%esp)
call __cxa_atexit
leave
ret
.cfi_endproc
.LFE1007:
.size _GLOBAL__I_main, .-_GLOBAL__I_main
.section .ctors,"aw",@progbits
.align 4
.long _GLOBAL__I_main
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Hello World!"
.text
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB997:
.cfi_startproc
.cfi_personality 0x0,__gxx_personality_v0
pushl %ebp
.cfi_def_cfa_offset 8
movl %esp, %ebp
.cfi_offset 5, -8
.cfi_def_cfa_register 5
andl $-16, %esp
pushl %ebx
subl $28, %esp
movl $12, 8(%esp)
movl $.LC0, 4(%esp)
movl $_ZSt4cout, (%esp)
.cfi_escape 0x10,0x3,0x7,0x55,0x9,0xf0,0x1a,0x9,0xfc,0x22
call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_i
movl _ZSt4cout, %eax
movl -12(%eax), %eax
movl _ZSt4cout+124(%eax), %ebx
testl %ebx, %ebx
je .L9
cmpb $0, 28(%ebx)
je .L5
movzbl 39(%ebx), %eax
.L6:
movsbl %al,%eax
movl %eax, 4(%esp)
movl $_ZSt4cout, (%esp)
call _ZNSo3putEc
movl %eax, (%esp)
call _ZNSo5flushEv
addl $28, %esp
xorl %eax, %eax
popl %ebx
movl %ebp, %esp
popl %ebp
ret
.p2align 4,,7
.p2align 3
.L5:
movl %ebx, (%esp)
call _ZNKSt5ctypeIcE13_M_widen_initEv
movl (%ebx), %eax
movl $10, 4(%esp)
movl %ebx, (%esp)
call *24(%eax)
jmp .L6
.L9:
call _ZSt16__throw_bad_castv
.cfi_endproc
.LFE997:
.size main, .-main
.local _ZStL8__ioinit
.comm _ZStL8__ioinit,1,1
.weakref _ZL20__gthrw_pthread_oncePiPFvvE,pthread_once
.weakref _ZL27__gthrw_pthread_getspecificj,pthread_getspecific
.weakref _ZL27__gthrw_pthread_setspecificjPKv,pthread_setspecific
.weakref _ZL22__gthrw_pthread_createPmPK14pthread_attr_tPFPvS3_ES3_,pthread_create
.weakref _ZL20__gthrw_pthread_joinmPPv,pthread_join
.weakref _ZL21__gthrw_pthread_equalmm,pthread_equal
.weakref _ZL20__gthrw_pthread_selfv,pthread_self
.weakref _ZL22__gthrw_pthread_detachm,pthread_detach
.weakref _ZL22__gthrw_pthread_cancelm,pthread_cancel
.weakref _ZL19__gthrw_sched_yieldv,sched_yield
.weakref _ZL26__gthrw_pthread_mutex_lockP15pthread_mutex_t,pthread_mutex_lock
.weakref _ZL29__gthrw_pthread_mutex_trylockP15pthread_mutex_t,pthread_mutex_trylock
.weakref _ZL31__gthrw_pthread_mutex_timedlockP15pthread_mutex_tPK8timespec,pthread_mutex_timedlock
.weakref _ZL28__gthrw_pthread_mutex_unlockP15pthread_mutex_t,pthread_mutex_unlock
.weakref _ZL26__gthrw_pthread_mutex_initP15pthread_mutex_tPK19pthread_mutexattr_t,pthread_mutex_init
.weakref _ZL29__gthrw_pthread_mutex_destroyP15pthread_mutex_t,pthread_mutex_destroy
.weakref _ZL30__gthrw_pthread_cond_broadcastP14pthread_cond_t,pthread_cond_broadcast
.weakref _ZL27__gthrw_pthread_cond_signalP14pthread_cond_t,pthread_cond_signal
.weakref _ZL25__gthrw_pthread_cond_waitP14pthread_cond_tP15pthread_mutex_t,pthread_cond_wait
.weakref _ZL30__gthrw_pthread_cond_timedwaitP14pthread_cond_tP15pthread_mutex_tPK8timespec,pthread_cond_timedwait
.weakref _ZL28__gthrw_pthread_cond_destroyP14pthread_cond_t,pthread_cond_destroy
.weakref _ZL26__gthrw_pthread_key_createPjPFvPvE,pthread_key_create
.weakref _ZL26__gthrw_pthread_key_deletej,pthread_key_delete
.weakref _ZL30__gthrw_pthread_mutexattr_initP19pthread_mutexattr_t,pthread_mutexattr_init
.weakref _ZL33__gthrw_pthread_mutexattr_settypeP19pthread_mutexattr_ti,pthread_mutexattr_settype
.weakref _ZL33__gthrw_pthread_mutexattr_destroyP19pthread_mutexattr_t,pthread_mutexattr_destroy
.ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
.section .note.GNU-stack,"",@progbits
Linking is the final stage of compilation. It takes one or more object files or libraries as input and combines them to produce a single (usually executable) file.
There are over 50 compilers for C like ICC by Intel to GNU GCC by GNU Project. The focus of having multiple compilers is to optimize the compiled C code for specific hardware and software environments.
A compiler takes the program code (source code) and converts the source code to a machine language module (called an object file). Another specialized program, called a linker, combines this object file with other previously compiled object files (in particular run-time modules) to create an executable file.
Compiled languages are converted directly into machine code that the processor can execute. As a result, they tend to be faster and more efficient to execute than interpreted languages.
Different compilers produce different code. An early version of gcc versus the current version of gcc likely produce different code.
More importantly, iostream
handles a lot of things stdio
doesn't, so there's obviously going to be some substantial overhead. I understand that, in theory, these could be compiled down to indentical code, but what they're doing is not technically identical.
Your issue here isn't about objects or optimization: it's that printf
and cout
are fundamentally very different beasts. For a more fair comparison, replace your cout
statement in the C++ code with printf
. Optimization is a moot point when you're outputting to stdout, as the bottleneck will certainly be the terminal's buffer.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With