
Analyzing an ELF binary to minimize its size

Tags: gcc, linker, v8, arm, elf

I'm cross-compiling a V8 project to an embedded ARM target using the GCC arm-gnueabi cross compiler. I got the V8 library itself cross-compiled successfully, and as a smoke test I wanted to link it to Google's hello world example and run it on the ARM board.

The libraries themselves clock in at a bit over 1.2 MB:

v8 % find out/arm.release/obj.target/ -name '*.a' -exec du -h {} + 
1.2M    out/arm.release/obj.target/tools/gyp/libv8_base.a
12K     out/arm.release/obj.target/tools/gyp/libv8_libbase.a
4.0K    out/arm.release/obj.target/tools/gyp/libv8_libplatform.a
4.0K    out/arm.release/obj.target/tools/gyp/libv8_snapshot.a
4.0K    out/arm.release/obj.target/tools/gyp/libv8_nosnapshot.a
4.0K    out/arm.release/obj.target/third_party/icu/libicudata.a
164K    out/arm.release/obj.target/third_party/icu/libicuuc.a
336K    out/arm.release/obj.target/third_party/icu/libicui18n.

Yet when I build and link with

arm-linux-gnueabi-g++ -pthread -Iv8/include hi.cpp -Os -o hi_v8 -Wl,--start-group v8/out/arm.release/obj.target/{tools/gyp/libv8_{base,libbase,snapshot},third_party/icu/libicu{uc,i18n,data}}.a -Wl,--end-group

I get an executable that's 20 MB. Stripping it only gets me down to 17 MB.

What is being linked in that balloons the file size so much? How can I avoid it? What tools can I use to diagnose the problem? This size may be problematic on the platform I am targeting.

I've taken a look at readelf --sections, but it just tells me how large the .text section is overall, which isn't particularly helpful. I also took a look at the suggestions here and tried using nm, but it's too specific: I just get a bunch of name-mangled symbols like _ZN2v88internal11FLAG_log_gcE.

Matt Kline asked Aug 26 '14 19:08


1 Answer

First, if you haven't already, use size -A hi_v8 to determine what section or sections are bigger than you expect. It's not always the text section.
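As a rough illustration of that first step, here is a minimal Python sketch that ranks sections by size from text captured from size -A. The function name rank_sections and the sample figures are made up for illustration, and the parser assumes the default decimal output (no -x or --radix option).

```python
from __future__ import annotations


def rank_sections(size_output: str) -> list[tuple[str, int]]:
    """Return (section, size_in_bytes) pairs from `size -A` text, largest first."""
    sections = []
    for line in size_output.splitlines():
        fields = line.split()
        # Data rows look like "<section> <size> <addr>". This skips the file
        # name line, the column header, and the trailing "Total" row.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        sections.append((fields[0], int(fields[1])))
    return sorted(sections, key=lambda item: item[1], reverse=True)


# Invented sample in the shape `size -A` prints; these are not real measurements.
SAMPLE = """\
hi_v8  :
section          size     addr
.text         1200000    32768
.rodata        300000  1232768
.data           40000  1540000
.bss            90000  1580000
Total         1630000
"""

if __name__ == "__main__":
    for name, size in rank_sections(SAMPLE):
        print(f"{name:<10} {size:>10}")
```

To run it on real data, redirect the output of size -A hi_v8 to a file and pass the file's contents to rank_sections instead of SAMPLE.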

Next, add -Wl,-Map,hi_v8.map to the g++ command line. This will generate a linker map in the file hi_v8.map. The contents of the file will be very verbose, but it will show the contribution of each object file to each section in the executable.
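Because the map is so verbose, it can help to total the bytes per object file with a small script. The following is only a sketch, assuming the usual GNU ld map layout in which each input-section row ends with an address, a size, and the contributing file. The function name contributions, the regular expression, and the sample map are illustrative, not taken from a real build.

```python
import re
from collections import defaultdict

# An input-section row in a GNU ld map: optional section name, an address,
# a size, then the contributing file, for example
#   " .text  0x00010000  0x1a4 libv8_base.a(api.o)"
# Long section names are printed on their own line, so the name is optional.
ROW = re.compile(r"^\s+(?:\S+\s+)?0x([0-9a-fA-F]+)\s+0x([0-9a-fA-F]+)\s+(\S.*)$")


def contributions(map_text):
    """Sum input-section sizes per object file in the memory map part."""
    totals = defaultdict(int)
    in_memory_map = False
    for line in map_text.splitlines():
        # Rows before this heading include discarded sections; ignore them.
        if line.startswith("Linker script and memory map"):
            in_memory_map = True
            continue
        if not in_memory_map:
            continue
        match = ROW.match(line)
        if match:
            totals[match.group(3).strip()] += int(match.group(2), 16)
    return dict(totals)


# Invented sample that mimics the map layout; not output from a real link.
SAMPLE = """\
Discarded input sections

 .text          0x00000000       0x40 hi.o

Linker script and memory map

LOAD hi.o
LOAD libv8_base.a

.text           0x00010000      0x350
 .text          0x00010000       0x50 hi.o
 .text._ZN2v88internal7Isolate4InitEv
                0x00010050      0x200 libv8_base.a(isolate.o)
                0x00010050                v8::internal::Isolate::Init()
 .text          0x00010250      0x100 libv8_base.a(api.o)

.data           0x00020000       0x10
 .data          0x00020000       0x10 libv8_base.a(api.o)
"""

if __name__ == "__main__":
    ranked = sorted(contributions(SAMPLE).items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ranked:
        print(f"{size:>10}  {name}")
```

Note that in an unstripped build the totals also include debug sections, which add to the file size but are not loaded on the target.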

The linker map will have a number of sections. The first section, "Archive member included because of file (symbol)", is helpful for figuring out what caused an object to be linked into the executable, but not so much for finding what's inflating the size of the executable. Once you've figured that out, and it turns out to be an errant library, you can come back to this section. The "Allocating common symbols", "Discarded input sections", and "Memory Configuration" sections are probably not going to be very helpful.

The "Linker script and memory map" section is where you want to focus your attention. It's essentially a trace of the linker script used to produce the executable. First check the LOAD statements at the start; they show every file that the executable was linked with. Check whether there are any files you didn't expect to see. Note, however, that libraries are mentioned here even if none of their object files were linked into the executable.

Now you'll have to wade through the trace of every linked object file and its symbols being added to every section of the executable. Since it's just a "Hello World" program, it shouldn't be too bad. Skip to the section that you've identified as being the problem. Now scan through the list of objects and see if you can find either a place where a large number of not obviously necessary object files are being linked in, or a place where the address suddenly jumps by a large amount. The latter should be relatively easy to spot, but the former can be hard to identify. It might help to generate a linker map for a statically linked "Hello World" program on your native platform to see the sort of library routines it links in.
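Spotting an address jump by eye gets tedious in a long map, so here is a hedged sketch of one way to automate it. It reads "jump" as the distance from one input-section row to the next row in the same output section, falling back to the row's own size for the last row. The function name big_jumps, the default threshold, and the sample (including libbig.a and blob.o) are all hypothetical, and the parsing again assumes the usual GNU ld map layout.

```python
import re

# Input-section row: optional section name, address, size, contributing file.
ROW = re.compile(r"^\s+(?:\S+\s+)?0x([0-9a-fA-F]+)\s+0x([0-9a-fA-F]+)\s+(\S.*)$")


def big_jumps(map_text, threshold=0x10000):
    """Return (address, jump, file) for rows followed by a large address jump."""
    rows = []   # (address, size, file) rows of the current output section
    found = []

    def flush():
        # Compare each row with the next one in the same output section.
        for i, (addr, size, name) in enumerate(rows):
            next_addr = rows[i + 1][0] if i + 1 < len(rows) else addr + size
            jump = next_addr - addr
            if jump >= threshold:
                found.append((addr, jump, name))
        rows.clear()

    in_memory_map = False
    for line in map_text.splitlines():
        if line.startswith("Linker script and memory map"):
            in_memory_map = True
            continue
        if not in_memory_map:
            continue
        if not line[:1].isspace():
            # A blank or column-0 line is treated as the start of a new
            # output section (an assumption about the layout).
            flush()
            continue
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1), 16),
                         int(match.group(2), 16),
                         match.group(3).strip()))
    flush()
    return found


# Invented sample; libbig.a(blob.o) is a hypothetical oversized object.
SAMPLE = """\
Linker script and memory map

.text           0x00010000   0x200150
 .text          0x00010000       0x50 hi.o
 .text          0x00010050   0x200000 libbig.a(blob.o)
                0x00010050                blob_start
 .text          0x00210050      0x100 libv8_base.a(api.o)

.data           0x00220000       0x10
 .data          0x00220000       0x10 libv8_base.a(api.o)
"""

if __name__ == "__main__":
    for addr, jump, name in big_jumps(SAMPLE):
        print(f"0x{addr:08x}  +0x{jump:x}  {name}")
```

A jump larger than the row's own size points at padding or alignment rather than the object itself, so it is worth checking the reported rows against the map by hand.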

My guess is that your problem will show up as either a large jump in the address, or some file that obviously shouldn't be linked in. So don't be put off by the verbosity of the map file. The C++ symbols should also be demangled, so it won't be as bad as nm. (Though you can get nm to also demangle the names with the -C option.)
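If you do go back to nm, a short script can rank its output by symbol size. This sketch assumes the text came from nm -C -S --size-sort hi_v8, where -S (print sizes) and --size-sort are GNU nm options that go beyond the -C suggestion above, and where each line has the form "address size type name" in hexadecimal. The function name largest_symbols and the sample lines are invented for illustration.

```python
def largest_symbols(nm_output, count=10):
    """Return the `count` largest (size, type, name) entries from nm text."""
    symbols = []
    for line in nm_output.splitlines():
        # Split into at most four fields so that demangled names containing
        # spaces, such as "operator new(unsigned int)", stay in one piece.
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue
        _address, size_hex, kind, name = parts
        try:
            size = int(size_hex, 16)
        except ValueError:
            continue  # not an "address size type name" line
        symbols.append((size, kind, name))
    symbols.sort(key=lambda entry: entry[0], reverse=True)
    return symbols[:count]


# Invented sample in the shape of `nm -C -S` output; sizes are not real.
SAMPLE = """\
00020000 00000001 B v8::internal::FLAG_log_gc
00010050 00000200 T v8::internal::Isolate::Init()
00010250 00000100 T v8::V8::Initialize()
000100a0 00000030 T operator new(unsigned int)
"""

if __name__ == "__main__":
    for size, kind, name in largest_symbols(SAMPLE, count=3):
        print(f"{size:>8}  {kind}  {name}")
```

Symbols give a finer-grained view than the map's per-object rows, which is useful once the map has narrowed the problem down to one or two object files.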

Ross Ridge answered Sep 27 '22 18:09