LLVM code generator: is Machine code representation machine-agnostic?

Please note: this question is not about LLVM IR but about LLVM's MIR, an internal intermediate representation that sits at a lower level than LLVM IR.

The LLVM documentation on the machine code description classes says (emphasis mine):

At the high-level, LLVM code is translated to a machine specific representation formed out of MachineFunction, MachineBasicBlock, and MachineInstr instances...

However, the same paragraph goes on to say:

This representation is completely target agnostic, representing instructions in their most abstract form...

My question is: how should I understand this paragraph?

I have a hard time reconciling the claim that this intermediate representation is machine specific and target agnostic at the same time. I thought "machine" and "target", in LLVM's context, mean the same thing - the instruction set architecture (e.g. x86_64, MIPS) used by the compiled executable.

Examples are welcome.

Leedehai asked Sep 30 '18 17:09

1 Answer

There are different ways to be platform-specific. For instance, each target could have its own differently named opcode for add, perhaps with different overflow semantics. Alternatively, all targets could share the same add, with the operands and flags specified by the same arguments on every target platform and with the same default values.

And there are many target-specific details such as the size or alignment of pointers that affect your code even if they don't affect any single instruction.

Machine IR represents instructions in their most abstract form. It doesn't try to hide that, on this target, pointers are 32 bits wide.
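One way to picture the distinction is the toy model below. It is only a sketch, not LLVM's actual API: the `MachineInstr` and `TargetInfo` classes, the two toy targets, and the opcode numbering are all made up for illustration. The point it tries to show is that the container (an opcode number plus operands) has the same shape for every target, while the meaning of the opcode number and facts such as pointer width come from the target.

```python
from dataclasses import dataclass


@dataclass
class MachineInstr:
    """Target-agnostic container: an opcode number plus a list of operands."""
    opcode: int
    operands: tuple


@dataclass
class TargetInfo:
    """Target-specific knowledge (hypothetical class, not part of LLVM)."""
    name: str
    pointer_bits: int
    opcode_names: dict  # opcode number -> mnemonic, meaningful only per target

    def render(self, mi: MachineInstr) -> str:
        # The same generic container is interpreted through this target's table.
        return f"{self.opcode_names[mi.opcode]} " + ", ".join(mi.operands)


# Two toy targets; the opcode numbers and tables are invented for this sketch.
TOY_X86 = TargetInfo("toy-x86", 64, {0: "ADD64rr", 1: "MOV64rm"})
TOY_MIPS = TargetInfo("toy-mips", 32, {0: "ADDu", 1: "LW"})

# One container shape, no target-specific fields in the class itself.
mi = MachineInstr(opcode=0, operands=("r1", "r2"))

x86_text = TOY_X86.render(mi)    # opcode 0 read through the toy-x86 table
mips_text = TOY_MIPS.render(mi)  # same object, read through the toy-mips table

print(x86_text)
print(mips_text)
print(TOY_MIPS.pointer_bits)
```

In this reading, "target agnostic" describes the data structure, and "machine specific" describes what a particular instance of it contains.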

arnt answered Jan 02 '23 12:01
