
Why is a C++ Hello World binary larger than the equivalent C binary?

Tags: c++, c, size

In his FAQ, Bjarne Stroustrup says that when compiled with gcc -O2, the file sizes of a hello world written in C and one written in C++ are identical.

Reference: http://www.stroustrup.com/bs_faq.html#Hello-world

I decided to try this. Here is the C version:

#include <stdio.h>

int main(int argc, char* argv[])
{
    printf("Hello world!\n");
    return 0;
}

And here is the C++ version:

#include <iostream>

int main(int argc, char* argv[])
{
    std::cout << "Hello world!\n"; 
    return 0;
}

Here I compile, and the sizes are different:

r00t@wutdo:~/hello$ ls
hello.c  hello.cpp
r00t@wutdo:~/hello$ gcc -O2 hello.c -o c.out
r00t@wutdo:~/hello$ g++ -O2 hello.cpp -o cpp.out
r00t@wutdo:~/hello$ ls -l
total 32
-rwxr-xr-x 1 r00t r00t 8559 Sep  1 18:00 c.out
-rwxr-xr-x 1 r00t r00t 8938 Sep  1 18:01 cpp.out
-rw-r--r-- 1 r00t r00t   95 Sep  1 17:59 hello.c
-rw-r--r-- 1 r00t r00t  117 Sep  1 17:59 hello.cpp
r00t@wutdo:~/hello$ size c.out cpp.out
   text    data     bss     dec     hex filename
   1191     560       8    1759     6df c.out
   1865     608     280    2753     ac1 cpp.out

I replaced std::endl with \n and it made the binary smaller. I figured something this simple would be inlined, and am disappointed it's not.

Also, wow, the optimized builds have hundreds of lines of assembly output? I can write hello world with about 5 assembly instructions using sys_write, so what's up with all the extra stuff? Why does C put so much extra on the stack for setup? I mean, roughly 50 bytes of assembly vs. 8 KB of C, why?

Jack asked Oct 21 '25 05:10


2 Answers

You're looking at a mix of information that's easily misinterpreted. The 8559-byte and 8938-byte file sizes are largely meaningless, since they're mostly headers with symbol names and other miscellaneous information for at least minimal debugging purposes. The somewhat meaningful numbers are the size(1) output you added later:

r00t@wutdo:~/hello$ size c.out cpp.out
   text    data     bss     dec     hex filename
   1191     560       8    1759     6df c.out
   1865     608     280    2753     ac1 cpp.out

You could get a more detailed breakdown by using the -A option to size, but in short, the differences here are fairly trivial.

What's more interesting is that Bjarne Stroustrup never mentioned whether he was talking about static or dynamic linking. In your case, both programs are dynamically linked, so the size differences have nothing to do with the actual size cost of stdio or iostream; you're just measuring the cost of the calling code, or (more likely, based on the other comments/answer) the base overhead of exception-handling support for C++.

Now, there is a common claim that a statically linked C++ iostream-based hello world can be even smaller than a printf-based one, since the compiler can see exactly which overloaded versions of operator<< are used and optimize out unneeded code (such as expensive floating-point printing), whereas printf's use of format strings makes this difficult in the common case and impossible in general. However, I've never seen a C++ implementation where a statically linked iostream-based hello program could come anywhere close to being as small as, much less smaller than, a printf-based one in C.
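To make the overload-versus-format-string point concrete, here is a minimal sketch. The function names format_with_stream and format_with_printf are made up for illustration. The sketch only shows where the choice of formatting code gets made (compile time vs. run time); it does not measure or guarantee any size difference in a real static link.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Stream version: operator<<(std::ostream&, const char*) is picked by
// overload resolution at compile time, so this call site only names the
// string-output routine.
std::string format_with_stream(const char* text) {
    std::ostringstream out;
    out << text;
    return out.str();
}

// printf-family version: the format string is interpreted at run time, so
// the library's formatter must be ready for any conversion (%d, %f, ...),
// even though only %s may ever be passed in practice.
std::string format_with_printf(const char* format, const char* text) {
    char buffer[64];  // illustrative fixed size; longer output is truncated
    std::snprintf(buffer, sizeof buffer, format, text);
    return std::string(buffer);
}
```

Both calls, format_with_stream("Hello world!\n") and format_with_printf("%s", "Hello world!\n"), yield the same string; the difference is only in what the linker can know about which formatting code is reachable.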

R.. GitHub STOP HELPING ICE answered Oct 23 '25 21:10


I think he's treating the half kilobyte as a rounding error. Both are "9 kilobytes" and that's what you'll see in a typical file browser. They aren't exactly the same because, under the hood, the C and C++ libraries are quite different. If you're already familiar with your disassembler, you can see the details of the difference for yourself.

The "extra stuff" is there for importing symbols from the standard library's shared library, and for handling C++ exceptions. Strangely enough, much of the GCC-compiled C executable is taken up by C++ exception handling tables. I've not figured out how to strip them using GCC.

endl is inlined, but it contains calls to print the \n character and flush the stream, which are not inlined. The difference in size is due to importing those from the standard library.

In truth, individual kilobytes seldom matter on any system with dynamically-loaded libraries. Self-contained code such as on an embedded system would need to include the standard library functionality it uses, and the C++ standard library tends to be heavier than its C counterpart — <iostream> vs. <stdio.h> in particular.
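For comparison with the sys_write idea from the question, here is a sketch that bypasses both stdio and iostream by calling the POSIX write() wrapper directly. It assumes a POSIX system, and write_hello is a made-up name. Note that a program built this way with gcc or g++ still links the usual startup code and dynamic-linking metadata, so it will not get anywhere near the 50 bytes of a hand-written assembly version.

```cpp
#include <unistd.h>  // write(), POSIX only

// Writes the greeting straight to a file descriptor, with no stdio or
// iostream buffering involved.
// Returns the number of bytes written, or -1 on error.
ssize_t write_hello(int fd) {
    static const char message[] = "Hello world!\n";
    return write(fd, message, sizeof message - 1);  // skip the trailing '\0'
}
```

Calling write_hello(STDOUT_FILENO) from main prints the greeting; a robust program would also handle short writes and errors, which this sketch ignores.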

Potatoswatter answered Oct 23 '25 21:10


