How is clang able to steer C/C++ code optimization?

I was told that clang is a driver that works like gcc to do the preprocessing, compilation and linking. During compilation and linking, as far as I know, it's actually llvm that does the optimization ("-O1", "-O2", "-O3", "-Os", "-flto").

But I just cannot understand how llvm is involved.

It seems that compiling source code doesn't even need a static library such as libLLVMCore.a. Instead, on Debian the clang package depends on another package called libllvm-3.4 (the clang version is 3.4), which contains libLLVM-3.4.so(.1). Does clang use this shared library for optimization?

I've looked through the clang source code for a while and found that include/clang/Driver/Options.td contains the related options, but unfortunately I failed to find the source files that include that file, so I still don't understand the mechanism.

I hope someone might give me some hints.


asked Nov 03 '14 13:11

Hongxu Chen

1 Answer

(TL;DontWannaRead - skip to the end of this answer)

To answer your question properly you first need to understand the difference between a compiler's front-end and back-end (especially the first one).

Clang is a compiler front-end (http://en.wikipedia.org/wiki/Clang) for the C, C++, Objective-C and Objective-C++ languages.

Clang's duty is to translate C++ source code (or C, Objective-C, etc.) into LLVM IR, a lower-level intermediate representation of what that code should do (it also has a human-readable textual form). To do this, Clang employs a number of sub-modules whose descriptions you can find in any decent compiler construction book: a lexer, a parser, a semantic analyzer (Sema), etc.
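To make the lexer/parser/emitter split concrete, here is a minimal toy sketch. It is not how Clang is implemented, and the output is a made-up pseudo-IR rather than real LLVM IR; the names lex, ToyFrontEnd and lower_to_pseudo_ir are hypothetical and exist only for this illustration. It handles nothing beyond identifiers, numbers, '+' and '*'.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy lexer: splits the input into identifier/number tokens and
// single-character operator tokens, skipping whitespace.
std::vector<std::string> lex(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalnum(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}

// Toy parser and emitter for the grammar
//   expr := term ('+' term)*      term := atom ('*' atom)*
// Each binary operation becomes one pseudo-IR instruction.
struct ToyFrontEnd {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    int next_temp = 0;
    std::vector<std::string> ir;  // emitted pseudo-IR, one instruction per entry

    std::string emit(const std::string& op, const std::string& lhs, const std::string& rhs) {
        std::string dest = "%t" + std::to_string(next_temp++);
        ir.push_back(dest + " = " + op + " " + lhs + ", " + rhs);
        return dest;
    }
    // Throws std::out_of_range if the input ends unexpectedly.
    std::string atom() { return tokens.at(pos++); }
    std::string term() {
        std::string lhs = atom();
        while (pos < tokens.size() && tokens[pos] == "*") {
            ++pos;
            std::string rhs = atom();
            lhs = emit("mul", lhs, rhs);
        }
        return lhs;
    }
    std::string expr() {
        std::string lhs = term();
        while (pos < tokens.size() && tokens[pos] == "+") {
            ++pos;
            std::string rhs = term();
            lhs = emit("add", lhs, rhs);
        }
        return lhs;
    }
};

// Runs the whole toy front end: source text in, pseudo-IR lines out.
std::vector<std::string> lower_to_pseudo_ir(const std::string& src) {
    ToyFrontEnd front_end;
    front_end.tokens = lex(src);
    std::string result = front_end.expr();
    front_end.ir.push_back("ret " + result);
    return front_end.ir;
}
```

Calling lower_to_pseudo_ir("num * 2") yields two lines, a mul into a temporary followed by a ret of that temporary. A real front end additionally performs type checking and much more in its semantic analysis stage.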

LLVM is a set of libraries whose primary task is the following: suppose we have the LLVM IR representation of the following C++ function

int double_this_number(int num) {
    int result = 0;
    result = num;
    result = result * 2;
    return result;
}

the core LLVM passes should then optimize that IR, for instance by removing the redundant stores to result and reducing the body to a single arithmetic operation on num.
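As a rough source-level picture of that optimization (the real transformation happens on LLVM IR, not on C++ source), the sketch below puts the original function next to a simplified equivalent. The name double_this_number_simplified is hypothetical, and the exact instructions an optimizer picks depend on the LLVM version and the target.

```cpp
#include <cassert>
#include <initializer_list>

// The function from the answer, written the long way round.
int double_this_number(int num) {
    int result = 0;
    result = num;
    result = result * 2;
    return result;
}

// Hypothetical name: roughly what is left once the dead store of 0 and
// the temporary variable have been optimized away.
int double_this_number_simplified(int num) {
    return num * 2;
}

// Asserts that both versions agree on a handful of sample inputs.
void check_versions_agree() {
    for (int n : {-7, 0, 1, 21, 1000}) {
        assert(double_this_number(n) == double_this_number_simplified(n));
    }
}
```

The two functions are observably identical, which is exactly the property an optimization pass has to preserve while it rewrites the IR.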

What to do with the optimized LLVM IR code is entirely up to you: you can translate it to x86_64 executable code or modify it and then spit it out as ARM executable code or GPU executable code. It depends on the goal of your project.

The term "back-end" is often confusing, since many papers describe the LLVM libraries as a "middle end" in a compiler chain and reserve "back end" for the final module that does the code generation (LLVM IR to executable code, or to something else that no longer needs processing by the compiler). Other sources refer to LLVM as a back end to Clang. Either way, the role of these libraries is clear and they offer a powerful mechanism: whatever language you're compiling (C++, C, Objective-C, Python, etc.), if you have a front-end that translates it to LLVM IR, you can use the same set of LLVM libraries to optimize it and, as long as you have a back-end for your target architecture, you can generate optimized executable code.
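The reuse described above can be sketched with a toy IR: one shared optimization step and two interchangeable emitters that consume its output. Everything here (ToyInstr, ToyModule, optimize, emit_assembly_like, emit_c_like) is hypothetical and only mimics the shape of the idea; neither emitter produces real machine code, and the IR is not LLVM IR.

```cpp
#include <string>
#include <vector>

// Toy IR: one instruction computes dest = lhs <op> rhs.
// Only the ops "mul", "shl" and "add" exist in this toy.
struct ToyInstr {
    std::string op;
    std::string dest;
    std::string lhs;
    std::string rhs;
};
using ToyModule = std::vector<ToyInstr>;

// Shared "middle end": rewrites a multiplication by 2 as a left shift by 1.
// It knows nothing about the source language or the output format.
ToyModule optimize(ToyModule module) {
    for (ToyInstr& instr : module) {
        if (instr.op == "mul" && instr.rhs == "2") {
            instr.op = "shl";
            instr.rhs = "1";
        }
    }
    return module;
}

// Toy back end A: three-operand, assembly-like text.
std::string emit_assembly_like(const ToyModule& module) {
    std::string out;
    for (const ToyInstr& instr : module) {
        out += instr.op + " " + instr.dest + ", " + instr.lhs + ", " + instr.rhs + "\n";
    }
    return out;
}

// Toy back end B: C-like statements built from the same IR.
std::string emit_c_like(const ToyModule& module) {
    std::string out;
    for (const ToyInstr& instr : module) {
        std::string symbol = "+";  // "add" is the fallback in this toy
        if (instr.op == "mul") {
            symbol = "*";
        } else if (instr.op == "shl") {
            symbol = "<<";
        }
        out += instr.dest + " = " + instr.lhs + " " + symbol + " " + instr.rhs + ";\n";
    }
    return out;
}
```

Any producer of ToyModule benefits from optimize, and any emitter can consume its result, which is the same decoupling that lets different front-ends and targets share the LLVM optimizers.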

Recalling that LLVM is a set of libraries (not just optimization passes but also data structures, utility modules, diagnostic modules, etc..), Clang also leverages many LLVM libraries during its front-ending process. You can't really tear every LLVM module away from Clang since the latter is built on the former set.

As for why Clang is said to be a "compilation driver": Clang interprets the command-line parameters (the option descriptions and many declarations are generated by TableGen, so it may take more than a simple grep to trace them through the sources), decides which jobs and phases are to be executed, sets up the CodeGenOptions according to the requested optimization and transformation levels, and invokes the appropriate modules and tools. On the module side, BackendUtil.cpp in clangCodeGen is what populates a module pass manager with the optimizations to apply; on the tool side, an example is the system linker (e.g. ld). Clang steers the compilation process from beginning to end.
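The driver's role can be sketched as a toy that turns flags into an options record and then plans which phases to run. This is not Clang's code: ToyCodeGenOptions, parse_flags and plan_phases are hypothetical names, and the mapping (the last -O flag wins, -Os and -Oz behave like level 2 plus a size setting) reflects my understanding of Clang's behaviour and should be treated as an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical, much-reduced stand-in for the options a driver fills in.
struct ToyCodeGenOptions {
    unsigned optimization_level = 0;  // 0..3
    unsigned size_level = 0;          // 0 = none, 1 = -Os, 2 = -Oz
    bool link = true;                 // false when -c is given
};

// Interprets a few flags; a later -O flag overrides an earlier one.
ToyCodeGenOptions parse_flags(const std::vector<std::string>& args) {
    ToyCodeGenOptions opts;
    for (const std::string& arg : args) {
        if (arg == "-c") {
            opts.link = false;
        } else if (arg.size() == 3 && arg.compare(0, 2, "-O") == 0) {
            char level = arg[2];
            if (level >= '0' && level <= '3') {
                opts.optimization_level = static_cast<unsigned>(level - '0');
                opts.size_level = 0;
            } else if (level == 's') {
                opts.optimization_level = 2;  // assumption: -Os is level 2 + size 1
                opts.size_level = 1;
            } else if (level == 'z') {
                opts.optimization_level = 2;  // assumption: -Oz is level 2 + size 2
                opts.size_level = 2;
            }
        }
    }
    return opts;
}

// Decides which phases to run. Simplification: the optimizer phase is
// skipped entirely at level 0.
std::vector<std::string> plan_phases(const ToyCodeGenOptions& opts) {
    std::vector<std::string> phases;
    phases.push_back("front end: source -> IR");
    if (opts.optimization_level > 0) {
        phases.push_back("optimizer: level " + std::to_string(opts.optimization_level));
    }
    phases.push_back("code generation: IR -> object code");
    if (opts.link) {
        phases.push_back("link");
    }
    return phases;
}
```

For example, parse_flags({"-O1", "-c", "-Os"}) ends up with level 2, size level 1 and no link step, so plan_phases returns three phases. In the real driver the optimization level goes on to determine which passes are added to the pass manager.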

Finally, I would suggest reading the Clang and LLVM documentation. It is quite thorough, and it should be the first place to look for answers to most of your questions.

answered Sep 22 '22 17:09

Marco A.


Tags: c++, gcc, compilation, llvm, clang
