My executables are linked with many static libraries, typically between 50 and 100 archives on Linux. Occasionally there are dependency cycles in these archives. The order that these libraries appear on the link command line is significant, see here. Attempting to manually order this many libraries is time-consuming at minimum, especially when there are cycles present.
Question: is there a utility or technique that will analyze a code base and produce a correct link command line ordering?
On Windows, the order of static libraries does matter if a symbol is defined in more than one library. On Linux, it matters when one static library references another.
Order does matter when you use several options of the same kind; for example, if you specify -L more than once, the directories are searched in the order specified. Many options have long names starting with -f or with -W--for example, -fforce-mem, -fstrength-reduce, -Wformat and so on.
Because of these advantages gcc compiles programs to use shared libraries by default on most systems, if they are available. Whenever a static library 'lib NAME . a' would be used for linking with the option -l NAME the compiler first checks for an alternative shared library with the same name and a '. so' extension.
You've answered your own question: yes, the order of objects and libraries on the link line does matter.
You want a topological sort.
The tsort
program will do that, but you'll need to do more work to use it [be prepared to write a perl/python script]. Also, there is another way as well. And, I will get to the "howto" below as I've done this sort of thing before.
The short answer: Use
--start-group
liblist--end-group
and be done with it.
For a few reasons:
An ld group is smart. It doesn't just loop on the files. It makes an initial pass through the group, but remembers the symbols. So, on subsequent passes it uses the cached symbol table information, so it's very fast.
For complex interactions, you may not be able to get rid of all the cycles with a toposort, so you'll still need a group even if liblist has been topo sorted.
Just how much time are we talking about? And, how much time do you think will be saved? How will you measure things to prove you really need this.
Go for the gold
Instead of using ld
, consider using ld.gold
. It has been rewritten from scratch to not use libbfd [which is slow] and operates on ELF files directly. The primary motivation for creating it was simplicity and speed.
How to topologically sort a library list
If we do info coreutils
, the tsort section will give an example of how to toposort a symbol table.
But, before we can get to that, we'll need to get the symbols. For a .a
file, nm
can provide the list: nm -go <liblist>
.
The output will look like:
libbfd.a:
libbfd.a:archive.o:0000000000000790 T _bfd_add_bfd_to_archive_cache
libbfd.a:archive.o: U bfd_alloc
libbfd.a:archive.o:0000000000000c20 T _bfd_append_relative_path
libbfd.a:archive.o: U bfd_assert
libbfd.a:archive.o: U bfd_bread
libbfd.a:archive.o:00000000000021b0 T _bfd_bsd44_write_ar_hdr
libbfd.a:archive.o: U strcpy
libbfd.a:archive.o: U strlen
libbfd.a:archive.o: U strncmp
libbfd.a:archive.o: U strncpy
libbfd.a:archive.o: U strtol
libbfd.a:archive.o: U xstrdup
libbfd.a:bfd.o: U __asprintf_chk
libbfd.a:bfd.o:00000000000002b0 T _bfd_abort
libbfd.a:bfd.o:0000000000000e40 T bfd_alt_mach_code
libbfd.a:bfd.o: U bfd_arch_bits_per_address
libbfd.a:bfd.o:0000000000000260 T bfd_assert
libbfd.a:bfd.o:0000000000000000 D _bfd_assert_handler
libbfd.a:bfd.o:0000000000000450 T bfd_canonicalize_reloc
libbfd.a:bfd.o: U bfd_coff_get_comdat_section
libbfd.a:bfd.o:0000000000000510 T _bfd_default_error_handler
libbfd.a:bfd.o:0000000000000fd0 T bfd_demangle
libbfd.a:bfd.o: U memcpy
libbfd.a:bfd.o: U strchr
libbfd.a:bfd.o: U strlen
libbfd.a:opncls.o:0000000000000a50 T bfd_openr
libbfd.a:opncls.o:0000000000001100 T bfd_openr_iovec
libbfd.a:opncls.o:0000000000000b10 T bfd_openstreamr
libbfd.a:opncls.o:0000000000000bb0 T bfd_openw
libbfd.a:opncls.o:0000000000001240 T bfd_release
libbfd.a:opncls.o: U bfd_set_section_contents
libbfd.a:opncls.o: U bfd_set_section_size
libbfd.a:opncls.o:0000000000000000 B bfd_use_reserved_id
libbfd.a:opncls.o:00000000000010d0 T bfd_zalloc
libbfd.a:opncls.o:00000000000011d0 T bfd_zalloc2
libglib-2.0.a:
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000100 T g_allocator_free
libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000000f0 T g_allocator_new
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000150 T g_blow_chunks
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000160 T g_list_push_allocator
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000060 T g_mem_chunk_alloc
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000090 T g_mem_chunk_alloc0
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000110 T g_mem_chunk_clean
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000120 T g_mem_chunk_reset
libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000001b0 T g_node_pop_allocator
libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000001a0 T g_node_push_allocator
libglib-2.0.a:libglib_2_0_la-gallocator.o: U g_return_if_fail_warning
libglib-2.0.a:libglib_2_0_la-gallocator.o: U g_slice_alloc
libglib-2.0.a:libglib_2_0_la-gallocator.o: U g_slice_alloc0
libglib-2.0.a:libglib_2_0_la-gallocator.o: U g_slice_free1
libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000190 T g_slist_pop_allocator
libglib-2.0.a:libglib_2_0_la-gslice.o: U g_private_get
libglib-2.0.a:libglib_2_0_la-gslice.o: U g_private_set
libglib-2.0.a:libglib_2_0_la-gslice.o: U g_return_if_fail_warning
libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000010d0 T g_slice_alloc
libglib-2.0.a:libglib_2_0_la-gslice.o:0000000000001770 T g_slice_alloc0
libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000017a0 T g_slice_copy
libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000017e0 T g_slice_free1
libglib-2.0.a:libglib_2_0_la-gslice.o:0000000000001ae0 T g_slice_free_chain_with_offset
So, the syntax will be:
<libname.a>:<objname.o>:<address> [TDB] <symbol>
<libname.a>:<objname.o>: U <symbol>
and we'll need to extract libname.a, symbol type (e.g. T, D, B, U), and the symbol.
We create a list of files. In each file struct, we remember all symbols and their types. Any type that is not U
[undefined symbol] will define the symbol.
Note that as we build the symbol table, a library may have multiple U's [in various .o's] that refer to a symbol defined by another .o within it. So, we only record the symbol once and if we see a non-U type, we "promote" it (e.g. if we saw U foo
and later saw T foo
we change the type of foo to T
[likewise for D and B].
Now we traverse the file list (e.g. curfile
). For each symbol in the file's symbol table, if it's of type U
[undefined], we scan all files looking for a non-U symbol definition. If we find one (in symfile
(e.g.)), we can output a dependency line for tsort: <curfile> <symfile>
. We repeat this for all files and symbols.
Note that this is bit wasteful because we could output many file dependency lines that are identical because the above will generate a line for each symbol. So, we should keep track of the lines output and only output a dependency line for unique file pairs. Also, note, it is possible to have both foo bar
and bar foo
. That is, actually, a cycle. While we just want one copy of foo bar
and/or bar foo
, they should not exclude one another.
Okay, so now feed the output of the above to tsort
and it will give us the topologically sorted version of liblist that we want.
As should be obvious, the script parsing can take some time, so the tsort output should be cached in a file, and rebuilt in a makefile, based upon a dependency list of liblist
Convert some .a files to .o files
If a given library uses all [or most] of the its .o files, instead of doing ar rv libname.a ...
, consider doing ld -r libname.o ...
.
This is similar in approach to creating a shared library .so file, but the "big" .o can still be statically linked.
Now, you have a single .o that will link faster than the .a because the intra-library links have already been resolved. Also, it will help with dependency cycles a bit.
A slight extension to the topo script could tell you which libraries are good candidates for this.
Even if the normal build makefiles can't be changed, the "final" top level could take a .a, either extract it into .o's, or use an ld force load option with -r to get the "big" .o
The BSD world have a wrapper around nm
to do the job of producing a tsort
-appropriate input called lorder
. It's really just a shell script, and you can see the source code in FreeBSD. It does exactly what the existing answer says, but in the form of two files for symbol and reference. (Also with some poor handling of whitespace in input filenames, but I digress.)
lorder
is ancient. It's been there since AT&T Unix version 7 (1979).
As for the suggestion to use the more modern ld.gold
for speed, also consider using LLVM's ld.lld
. LLD is easier to use since it does not require a good link order at all. And it's faster still.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With