How do linkers decide what parts of libraries to include?

Tags: c, linker

Assume library A has a() and b(). If I link my program B with A and call a(), does b() get included in the binary? Does the compiler check whether any function in the program calls b() (perhaps a() calls b(), or another lib calls b())? If so, how does the compiler get this information? If not, isn't this a big waste of final binary size if I'm linking to a big library but only using a minor feature?

asked Apr 03 '09 20:04

Oliver Zheng

4 Answers

Take a look at link-time optimization. This is necessarily vendor dependent, and it also depends on how you build your binaries. MS compilers (2005 onwards at least) provide something called Function-Level Linking -- which is another way of stripping symbols you don't need. This post explains how the same can be achieved with GCC (it is old and GCC has surely moved on, but the content is still relevant to your question).

Also take a look at the LLVM implementation (and the examples section).

I suggest you also take a look at Linkers and Loaders by John Levine -- an excellent read.

answered Dec 26 '22 20:12

dirkgently

It depends.

If the library is a shared object or DLL, then everything in the library is loaded, but at run time. The cost in extra memory is (hopefully) offset by sharing the library (really, the code pages) between all the processes in memory that use that library. This is a big win for something like libc.so, less so for myreallyobscurelibrary.so. But you probably aren't asking about shared objects, really.

Static libraries are simply a collection of individual object files, each the result of a separate compilation (or assembly), and possibly not even written in the same source language. Each object file has a number of exported symbols, and almost always a number of imported symbols.

The linker's job is to create a finished executable that has no remaining undefined imported symbols. (I'm lying, of course, if dynamic linking is allowed, but bear with me.) To do that, it starts with the modules named explicitly on the link command line (and possibly implicitly in its configuration) and assumes that any module named explicitly must be part of the finished executable. It then attempts to find definitions for all of the undefined symbols.

Usually, the named object modules expect to get symbols from some library such as libc.a.

In your example, you have a single module that calls the function a(), which will result in the linker looking for a module that exports a().

You say that the library named A (on unix, probably libA.a) offers a() and b(), but you don't specify how. You implied that a() and b() do not call each other, which I will assume.

If libA.a was built from a.o and b.o where each defines the corresponding single function, then the linker will include a.o and ignore b.o.

However, if libA.a included ab.o that defined both a() and b() then it will include ab.o in the link, satisfying the need for a(), and including the unused function b().
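The two layouts above can be sketched as a toy model of member selection. This is only an illustration, not how any real linker is implemented: the `member` struct, `pull_in`, `functions_linked`, and both `demo_` functions are hypothetical names invented here, and symbols are plain strings rather than real object-file symbol tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* One archive member (an object file) in a toy model of a static library. */
struct member {
    const char *name;
    const char *defines[MAX_SYMS];  /* exported symbols, NULL-padded */
    const char *needs[MAX_SYMS];    /* imported symbols, NULL-padded */
    int included;                   /* set once the member is pulled in */
};

static int defines_symbol(const struct member *m, const char *sym)
{
    for (size_t i = 0; i < MAX_SYMS && m->defines[i]; i++)
        if (strcmp(m->defines[i], sym) == 0)
            return 1;
    return 0;
}

/* Pull in the whole member that defines sym, then whatever that member needs. */
static void pull_in(struct member *lib, size_t n, const char *sym)
{
    assert(lib != NULL && sym != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!defines_symbol(&lib[i], sym))
            continue;
        if (!lib[i].included) {
            lib[i].included = 1;    /* mark first so reference cycles terminate */
            for (size_t j = 0; j < MAX_SYMS && lib[i].needs[j]; j++)
                pull_in(lib, n, lib[i].needs[j]);
        }
        return;
    }
}

/* Count the functions that end up in the output, whether used or not. */
static int functions_linked(const struct member *lib, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; lib[i].included && j < MAX_SYMS && lib[i].defines[j]; j++)
            count++;
    return count;
}

/* libA.a built from a.o and b.o; the program references only a(). */
int demo_split_objects(void)
{
    struct member lib[] = {
        { "a.o", { "a" }, { NULL }, 0 },
        { "b.o", { "b" }, { NULL }, 0 },
    };
    pull_in(lib, 2, "a");
    return functions_linked(lib, 2);
}

/* libA.a built from a single ab.o; the program references only a(). */
int demo_combined_object(void)
{
    struct member lib[] = {
        { "ab.o", { "a", "b" }, { NULL }, 0 },
    };
    pull_in(lib, 1, "a");
    return functions_linked(lib, 1);
}
```

In this model, demo_split_objects() returns 1 (only a() is linked) and demo_combined_object() returns 2 (b() comes along because it shares an object file with a()). The unit of inclusion is the object file, not the function.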

As others have mentioned, there are linkers that are capable of splitting individual functions out of modules, and including only those that are actually used. In many cases, that is a safe thing to do. But it is usually safest to assume that your linker does not do that unless you have specific documentation.
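As a rough sketch of what that looks like in practice, here is the ab.o case with build commands in the comments. It assumes a GCC and GNU binutils style toolchain, and `program_entry` is a hypothetical stand-in for the caller in program B. Whether b() is actually discarded depends on your toolchain and options, so check the result rather than trusting the comments.

```c
#include <assert.h>

/*
 * ab.c: both functions live in one translation unit, like the ab.o case.
 *
 * Assumed build with GCC and GNU ld (adjust for your toolchain):
 *   cc -ffunction-sections -fdata-sections -c ab.c
 *   cc main.o ab.o -Wl,--gc-sections -o prog
 *   nm prog | grep ' b$'    (no output suggests b() was discarded)
 *
 * With MSVC, the comparable switches are /Gy for the compiler and
 * /OPT:REF for the linker.
 */

int a(void) { return 1; }

/* Never referenced by the program; a candidate for removal at link time. */
int b(void) { return 2; }

/* Hypothetical stand-in for the call site in program B. */
int program_entry(void)
{
    int result = a();
    assert(result == 1);
    return result;
}
```

With -ffunction-sections each function gets its own section, which gives the linker something smaller than a whole object file to keep or drop. Without it, a() and b() share one section and travel together.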

Something else to be aware of is that most linkers make as few passes as they can through the files and libraries that are named on the command line, and build up their symbol table as they go. As a practical matter, this means that it is good practice to always specify libraries after all of the object modules on the link command line.
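A minimal model of that single left-to-right pass, to show why the order matters. It is a simplification under stated assumptions: each input defines at most one symbol and references at most one, every archive is reduced to a single member, and all names (`input`, `unresolved_after_one_pass`, the `demo_` functions) are hypothetical. Some linkers do not behave this way at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kind { OBJECT, ARCHIVE_MEMBER };

/* One item on the link command line, in command-line order. */
struct input {
    enum kind kind;
    const char *defines;  /* symbol it exports, or NULL */
    const char *needs;    /* symbol it imports, or NULL */
};

#define MAX_TRACKED 16

static int find(const char **list, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], sym) == 0)
            return (int)i;
    return -1;
}

/* Returns how many symbols are still undefined after one pass. */
int unresolved_after_one_pass(const struct input *in, size_t n)
{
    const char *undefined[MAX_TRACKED];
    const char *defined[MAX_TRACKED];
    size_t nu = 0, nd = 0;

    assert(n <= MAX_TRACKED);
    for (size_t i = 0; i < n; i++) {
        int want = in[i].defines ? find(undefined, nu, in[i].defines) : -1;

        /* An archive member is used only if something already needs it. */
        if (in[i].kind == ARCHIVE_MEMBER && want < 0)
            continue;

        if (in[i].defines) {
            if (want >= 0)
                undefined[want] = undefined[--nu];  /* reference satisfied */
            defined[nd++] = in[i].defines;
        }
        if (in[i].needs && find(defined, nd, in[i].needs) < 0
                        && find(undefined, nu, in[i].needs) < 0)
            undefined[nu++] = in[i].needs;
    }
    return (int)nu;
}

/* Models: cc main.o libA.a */
int demo_objects_first(void)
{
    const struct input cmd[] = {
        { OBJECT, "main", "a" },
        { ARCHIVE_MEMBER, "a", NULL },
    };
    return unresolved_after_one_pass(cmd, 2);
}

/* Models: cc libA.a main.o */
int demo_library_first(void)
{
    const struct input cmd[] = {
        { ARCHIVE_MEMBER, "a", NULL },
        { OBJECT, "main", "a" },
    };
    return unresolved_after_one_pass(cmd, 2);
}
```

In this model, demo_objects_first() returns 0 and demo_library_first() returns 1. With the library first, nothing needs a() yet when the archive is scanned, so the member is skipped and never revisited.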

answered Dec 26 '22 19:12

RBerteig

It depends on the linker.

For example, Microsoft Visual C++ has an "Enable Function-Level Linking" option (the /Gy compiler switch) that you can turn on manually.

(I assume they have a reason for not just enabling it all the time...maybe linking is slower or something)

answered Dec 26 '22 19:12

Jimmy J

Usually (static) libraries are composed of objects created from source files. What linkers usually do is include an object if a function provided by that object is referenced. If your source file contains only one function, then only that function will be brought in by the linker. There are more sophisticated linkers out there, but most C-based linkers still work as outlined. There are tools available that split C sources containing multiple functions into artificially smaller source files to make static linking more fine-grained.

If you are using shared libraries, then using more or less of them does not affect your compiled size. However, your runtime size will include them.

answered Dec 26 '22 19:12

lothar
