Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

why my C++ output executable is so big?

Tags:

c++

boost

I have a rather simple C++ project, which uses boost::regex library. The output I'm getting is 3.5Mb in size. As I understand I'm statically linking all boost .CPP files, including all functions/methods. Maybe it's possible somehow to instruct my linker to use only necessary elements from boost, not all of them? Thanks.

$ c++ —version
i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5659)

This is what size says:

$ size a.out
__TEXT  __DATA  __OBJC  others  dec hex
1556480 69632   0   4296504912  4298131024  100304650

I tried strip:

$ ls -al
...  3946688 May 21 13:20 a.out
$ strip a.out
$ ls -al
...  3847248 May 21 13:20 a.out

ps. This is how my code is organized (maybe this is the main cause of the problem):

// file MyClass.h
class MyClass {
  void f();
};
#include "MyClassImpl.h"

// file MyClassImpl.h
void MyClass::f() {
  // implementation...
}

// file main.cpp
#include "MyClass.h"
int main(int ac, char** av) {
  MyClass c;
  c.f();
}

What do you think?

like image 255
yegor256 Avatar asked May 15 '10 06:05

yegor256


1 Answers

Did you compile with debugging symbols enabled? That could account for a large portion of the size. Also how are you determining the size of the binary? Assuming you're on a UNIX-like platform are you using a straight "ls -l" or the "size" command. The two could give greatly different results if the binary contains debugging symbols. For example, here are the results I get when building the Boost.Regex "credit_card_example.cpp" example.

$ g++ -g -O3 foo.cpp -lboost_regex-mt

$ ls -l a.out 
-rwxr-xr-x 1 void void 483801 2010-05-20 10:36 a.out

$ size a.out
   text    data     bss     dec     hex filename
  73330     492     336   74158   121ae a.out

Similar results occur when just generating the object file:

$ g++ -c -g -O3 foo.cpp

$ ls -l foo.o 
-rw-r--r-- 1 void void 622476 2010-05-20 10:40 foo.o

$ size foo.o
   text    data     bss     dec     hex filename
  49119       4      40   49163    c00b foo.o

EDIT: Added some static linking results ...

Here's the binary size when statically linking. It's closer to what you're getting:

$ g++ -static -g -O3 foo.cpp -lboost_regex-mt -lpthread

$ ls -l a.out 
-rwxr-xr-x 1 void void 2019905 2010-05-20 11:16 a.out

$ size a.out 
   text    data     bss     dec     hex filename
1204517    5184   41976 1251677  13195d a.out

It's also possible that much of the large size is coming from other libraries the Boost.Regex library depends on. On my Ubuntu box, the dependencies for the Boost.Regex shared library are the following:

$ ldd /usr/lib/libboost_regex-mt.so.1.38.0 
        linux-gate.so.1 =>  (0x0053f000)
        libicudata.so.40 => /usr/lib/libicudata.so.40 (0xb6a38000)
        libicui18n.so.40 => /usr/lib/libicui18n.so.40 (0x009e0000)
        libicuuc.so.40 => /usr/lib/libicuuc.so.40 (0x00672000)
        librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0x001e2000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x001eb000)
        libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0x00110000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x009be000)
        libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0x00153000)
        libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x002dd000)
        /lib/ld-linux.so.2 (0x00e56000)

The ICU libraries can get quite large. Besides debugging symbols, perhaps they are the primary contributors to the size of your binary. Furthermore, in the statically linked case, it looks like the Boost.Regex library itself is comprised of large object files:

$ size --totals /usr/lib/libboost_regex-mt.a | sort -n
      0       0       0       0       0 regex_debug.o (ex /usr/lib/libboost_regex-mt.a)
      0       0       0       0       0 usinstances.o (ex /usr/lib/libboost_regex-mt.a)
      0       0       0       0       0 w32_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
   text    data     bss     dec     hex filename
    435       0       0     435     1b3 regex_raw_buffer.o (ex /usr/lib/libboost_regex-mt.a)
    480       0       0     480     1e0 static_mutex.o (ex /usr/lib/libboost_regex-mt.a)
   1543       0      36    1579     62b cpp_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
   3171     632       0    3803     edb regex_traits_defaults.o (ex /usr/lib/libboost_regex-mt.a)
   5339       8      13    5360    14f0 c_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
   5650       8      16    5674    162a wc_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
   9075       4      32    9111    2397 regex.o (ex /usr/lib/libboost_regex-mt.a)
  17052       8       4   17064    42a8 fileiter.o (ex /usr/lib/libboost_regex-mt.a)
  61265       0       0   61265    ef51 wide_posix_api.o (ex /usr/lib/libboost_regex-mt.a)
  61787       0       0   61787    f15b posix_api.o (ex /usr/lib/libboost_regex-mt.a)
  80811       8       0   80819   13bb3 icu.o (ex /usr/lib/libboost_regex-mt.a)
 116489       8     112  116609   1c781 instances.o (ex /usr/lib/libboost_regex-mt.a)
 117874       8     112  117994   1ccea winstances.o (ex /usr/lib/libboost_regex-mt.a)
 131104       0       0  131104   20020 cregex.o (ex /usr/lib/libboost_regex-mt.a)
 612075     684     325  613084   95adc (TOTALS)

You could get up to ~600K coming from Boost.Regex alone if some or all of those object files get linked into your binary.

like image 143
Void Avatar answered Sep 29 '22 04:09

Void