 

What is the difference between @code_native, @code_typed and @code_llvm in Julia?

Tags:

julia

While going through Julia, I wanted to have functionality similar to Python's dis module. Searching around the net, I found out that the Julia community has worked on this issue and provided these mappings (https://github.com/JuliaLang/julia/issues/218):

    finfer                              -> code_typed
    methods(function, types)            -> code_lowered
    disassemble(function, types, true)  -> code_native
    disassemble(function, types, false) -> code_llvm

I have tried these personally in the Julia REPL, but I find them quite hard to understand.

In Python, I can disassemble a function like this.

    >>> import dis
    >>> dis.dis(lambda x: 2*x)
      1           0 LOAD_CONST               1 (2)
                  3 LOAD_FAST                0 (x)
                  6 BINARY_MULTIPLY
                  7 RETURN_VALUE
    >>>

Can anyone who has worked with these help me understand them more? Thanks.

Rahul asked Apr 17 '17 14:04

People also ask

Is Julia language interpreted or compiled?

Julia, unlike Python, which is interpreted, is a compiled language, and it is primarily written in Julia itself. However, unlike traditional compiled languages such as C, which are compiled prior to execution, Julia is compiled at run-time.

Which compiler does Julia use?

Julia uses a just-in-time (JIT) compiler that is referred to as "just-ahead-of-time" (JAOT) in the Julia community, as Julia compiles all code (by default) to machine code before running it.

Does Julia use LLVM?

Julia dynamically links against LLVM by default.


1 Answer

The standard CPython implementation of Python parses source code and does some pre-processing and simplification of it – aka "lowering" – transforming it to a machine-friendly, easy-to-interpret format called "bytecode". This is what is displayed when you "disassemble" a Python function. This code is not executable by the hardware – it is "executable" by the CPython interpreter. CPython's bytecode format is fairly simple, partly because that's what interpreters tend to do well with – if the bytecode is too complex, it slows down the interpreter – and partly because the Python community tends to put a high premium on simplicity, sometimes at the cost of high performance.

Julia's implementation is not interpreted; it is just-in-time (JIT) compiled. This means that when you call a function, it is transformed to machine code which is executed directly by the native hardware. This process is quite a bit more complex than the parsing and lowering to bytecode that Python does, but in exchange for that complexity, Julia gets its hallmark speed. (The PyPy JIT for Python is also much more complex than CPython but also typically much faster – increased complexity is a fairly typical cost for speed.) The four levels of "disassembly" for Julia code give you access to the representation of a Julia method implementation for particular argument types at different stages of the transformation from source code to machine code. I'll use the following function, which computes the smallest Fibonacci number greater than or equal to its argument, as an example:

    function nextfib(n)
        a, b = one(n), one(n)
        while b < n
            a, b = b, a + b
        end
        return b
    end

    julia> nextfib(5)
    5

    julia> nextfib(6)
    8

    julia> nextfib(123)
    144
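Each of the four levels has a corresponding reflection macro in the REPL; the rest of this answer applies them to nextfib one at a time:

    @code_lowered nextfib(123)   # lowered form: after parsing, before inference
    @code_typed   nextfib(123)   # typed form: after type inference and inlining
    @code_llvm    nextfib(123)   # LLVM intermediate representation
    @code_native  nextfib(123)   # native machine code, disassembled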

Lowered code. The @code_lowered macro displays code in a format that is the closest to Python bytecode, but rather than being intended for execution by an interpreter, it's intended for further transformation by a compiler. This format is largely internal and not intended for human consumption. The code is transformed into "single static assignment" form in which "each variable is assigned exactly once, and every variable is defined before it is used". Loops and conditionals are transformed into gotos and labels using a single unless/goto construct (this is not exposed in user-level Julia). Here's our example code in lowered form (in Julia 0.6.0-pre.beta.134, which is just what I happen to have available):

    julia> @code_lowered nextfib(123)
    CodeInfo(:(begin
            nothing
            SSAValue(0) = (Main.one)(n)
            SSAValue(1) = (Main.one)(n)
            a = SSAValue(0)
            b = SSAValue(1) # line 3:
            7:
            unless b < n goto 16 # line 4:
            SSAValue(2) = b
            SSAValue(3) = a + b
            a = SSAValue(2)
            b = SSAValue(3)
            14:
            goto 7
            16:  # line 6:
            return b
        end))

You can see the SSAValue nodes and unless/goto constructs and label numbers. This is not that hard to read, but again, it's also not really meant to be easy for human consumption. Lowered code doesn't depend on the types of the arguments, except insofar as they determine which method body to call – as long as the same method is called, the same lowered code applies.
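As an aside, you don't need a method call to look at lowered code: newer Julia versions (1.0 and later – this postdates the 0.6 output above) expose lowering directly via Meta.lower, which works on any expression. A minimal sketch:

    # Lowering is purely syntactic, so it works on a bare expression even
    # though a, b and n are not defined anywhere:
    ex = quote
        while b < n
            a, b = b, a + b
        end
    end
    Meta.lower(Main, ex)   # returns the same SSA/goto form seen above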

Typed code. The @code_typed macro presents a method implementation for a particular set of argument types after type inference and inlining. This incarnation of the code is similar to the lowered form, but with expressions annotated with type information and some generic function calls replaced with their implementations. For example, here is the typed code for our example function:

    julia> @code_typed nextfib(123)
    CodeInfo(:(begin
            a = 1
            b = 1 # line 3:
            4:
            unless (Base.slt_int)(b, n)::Bool goto 13 # line 4:
            SSAValue(2) = b
            SSAValue(3) = (Base.add_int)(a, b)::Int64
            a = SSAValue(2)
            b = SSAValue(3)
            11:
            goto 4
            13:  # line 6:
            return b
        end))=>Int64

Calls to one(n) have been replaced with the literal Int64 value 1 (on my system the default integer type is Int64). The expression b < n has been replaced with its implementation in terms of the slt_int intrinsic ("signed integer less than") and the result of this has been annotated with return type Bool. The expression a + b has also been replaced with its implementation in terms of the add_int intrinsic, and its result type annotated as Int64. And the return type of the entire function body has been annotated as Int64.
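A close relative of @code_typed worth knowing is @code_warntype, which prints the same type-inferred view but highlights non-concrete inferred types – handy when hunting type-instability performance problems. A minimal sketch (the unstable function here is a made-up example, not from the question):

    # A deliberately type-unstable function: the return *type* depends on
    # the run-time value of n, so inference can only conclude a Union type.
    unstable(n) = n > 0 ? one(n) : 0.0

    # Prints typed code with problematic (non-concrete) types highlighted.
    @code_warntype unstable(5)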

Unlike lowered code, which depends only on argument types to determine which method body is called, the details of typed code depend on argument types:

    julia> @code_typed nextfib(Int128(123))
    CodeInfo(:(begin
            SSAValue(0) = (Base.sext_int)(Int128, 1)::Int128
            SSAValue(1) = (Base.sext_int)(Int128, 1)::Int128
            a = SSAValue(0)
            b = SSAValue(1) # line 3:
            6:
            unless (Base.slt_int)(b, n)::Bool goto 15 # line 4:
            SSAValue(2) = b
            SSAValue(3) = (Base.add_int)(a, b)::Int128
            a = SSAValue(2)
            b = SSAValue(3)
            13:
            goto 6
            15:  # line 6:
            return b
        end))=>Int128

This is the typed version of the nextfib function for an Int128 argument. The literal 1 must be sign extended to Int128 and the result types of operations are of type Int128 instead of Int64. The typed code can be quite different if the implementation of a type is considerably different. For example nextfib for BigInts is significantly more involved than for simple "bits types" like Int64 and Int128:

    julia> @code_typed nextfib(big(123))
    CodeInfo(:(begin
            $(Expr(:inbounds, false))
            # meta: location number.jl one 164
            # meta: location number.jl one 163
            # meta: location gmp.jl convert 111
            z@_5 = $(Expr(:invoke, MethodInstance for BigInt(), :(Base.GMP.BigInt))) # line 112:
            $(Expr(:foreigncall, (:__gmpz_set_si, :libgmp), Void, svec(Ptr{BigInt}, Int64), :(&z@_5), :(z@_5), 1, 0))
            # meta: pop location
            # meta: pop location
            # meta: pop location
            $(Expr(:inbounds, :pop))
            $(Expr(:inbounds, false))
            # meta: location number.jl one 164
            # meta: location number.jl one 163
            # meta: location gmp.jl convert 111
            z@_6 = $(Expr(:invoke, MethodInstance for BigInt(), :(Base.GMP.BigInt))) # line 112:
            $(Expr(:foreigncall, (:__gmpz_set_si, :libgmp), Void, svec(Ptr{BigInt}, Int64), :(&z@_6), :(z@_6), 1, 0))
            # meta: pop location
            # meta: pop location
            # meta: pop location
            $(Expr(:inbounds, :pop))
            a = z@_5
            b = z@_6 # line 3:
            26:
            $(Expr(:inbounds, false))
            # meta: location gmp.jl < 516
            SSAValue(10) = $(Expr(:foreigncall, (:__gmpz_cmp, :libgmp), Int32, svec(Ptr{BigInt}, Ptr{BigInt}), :(&b), :(b), :(&n), :(n)))
            # meta: pop location
            $(Expr(:inbounds, :pop))
            unless (Base.slt_int)((Base.sext_int)(Int64, SSAValue(10))::Int64, 0)::Bool goto 46 # line 4:
            SSAValue(2) = b
            $(Expr(:inbounds, false))
            # meta: location gmp.jl + 258
            z@_7 = $(Expr(:invoke, MethodInstance for BigInt(), :(Base.GMP.BigInt))) # line 259:
            $(Expr(:foreigncall, ("__gmpz_add", :libgmp), Void, svec(Ptr{BigInt}, Ptr{BigInt}, Ptr{BigInt}), :(&z@_7), :(z@_7), :(&a), :(a), :(&b), :(b)))
            # meta: pop location
            $(Expr(:inbounds, :pop))
            a = SSAValue(2)
            b = z@_7
            44:
            goto 26
            46:  # line 6:
            return b
        end))=>BigInt

This reflects the fact that operations on BigInts are pretty complicated and involve memory allocation and calls to the external GMP library (libgmp).
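A quick (and unscientific) way to see this cost in practice is @time; calling each method once first ensures we don't measure compilation. A rough sketch, not a proper benchmark:

    nextfib(123); nextfib(big(123))   # warm up: compile both specializations

    @time nextfib(123)        # Int64 path: a tight integer loop, no allocation
    @time nextfib(big(123))   # BigInt path: heap allocations plus libgmp calls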

LLVM IR. Julia uses the LLVM compiler framework to generate machine code. LLVM defines an assembly-like language which it uses as a shared intermediate representation (IR) between different compiler optimization passes and other tools in the framework. There are three isomorphic forms of LLVM IR:

  1. A binary representation that is compact and machine readable.
  2. A textual representation that is verbose and somewhat human readable (this form can also be captured programmatically – see the sketch after this list).
  3. An in-memory representation that is generated and consumed by LLVM libraries.
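For instance, the textual form (2) can be rendered into a String rather than printed at the REPL – a small sketch assuming a 1.x session, where the function form code_llvm(io, f, types) is available (in a script you would need using InteractiveUtils first):

    # Render the textual LLVM IR for nextfib(::Int64) into a String.
    ir = sprint(code_llvm, nextfib, (Int64,))
    println(ir)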

Julia uses LLVM's C++ API to construct LLVM IR in memory (form 3) and then call some LLVM optimization passes on that form. When you do @code_llvm you see the LLVM IR after generation and some high-level optimizations. Here's LLVM code for our ongoing example:

    julia> @code_llvm nextfib(123)

    define i64 @julia_nextfib_60009(i64) #0 !dbg !5 {
    top:
      br label %L4

    L4:                                               ; preds = %L4, %top
      %storemerge1 = phi i64 [ 1, %top ], [ %storemerge, %L4 ]
      %storemerge = phi i64 [ 1, %top ], [ %2, %L4 ]
      %1 = icmp slt i64 %storemerge, %0
      %2 = add i64 %storemerge, %storemerge1
      br i1 %1, label %L4, label %L13

    L13:                                              ; preds = %L4
      ret i64 %storemerge
    }

This is the textual form of the in-memory LLVM IR for the nextfib(123) method implementation. LLVM IR is not easy to read – it's not intended to be written or read by people most of the time – but it is thoroughly specified and documented. Once you get the hang of it, it's not hard to understand. This code jumps to the label L4 and initializes the "registers" %storemerge1 and %storemerge with the i64 (LLVM's name for Int64) value 1 (their values are derived differently when jumped to from different locations – that's what the phi instruction does). It then does an icmp slt, comparing %storemerge with register %0 – which holds the argument untouched for the entire method execution – and saves the result of the comparison into the register %1. It does an add i64 on %storemerge and %storemerge1 and saves the result into register %2. If %1 is true, it branches back to L4, otherwise it branches to L13. When the code loops back to L4, the register %storemerge1 gets the previous value of %storemerge and %storemerge gets the previous value of %2.
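If you want to see what the Julia frontend emits before LLVM's optimization passes run, more recent Julia versions accept flags on the macro (these exist on 1.1 and later, not on the 0.6 used for the output above – an assumption about your Julia version):

    # IR straight out of codegen, before LLVM optimizations (Julia 1.1+):
    @code_llvm optimize=false nextfib(123)

    # The same IR without the !dbg metadata and line-number comments:
    @code_llvm debuginfo=:none nextfib(123)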

Native code. Since Julia executes native code, the last form a method implementation takes is what the machine actually executes. This is just binary code in memory, which is rather hard to read, so long ago people invented various forms of "assembly language", which represent instructions and registers with names and have some amount of simple syntax to help express what instructions do. In general, assembly language remains in close one-to-one correspondence with machine code; in particular, one can always "disassemble" machine code into assembly code. Here's our example:

    julia> @code_native nextfib(123)
        .section    __TEXT,__text,regular,pure_instructions
    Filename: REPL[1]
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $1, %ecx
        movl    $1, %edx
        nop
    L16:
        movq    %rdx, %rax
    Source line: 4
        movq    %rcx, %rdx
        addq    %rax, %rdx
        movq    %rax, %rcx
    Source line: 3
        cmpq    %rdi, %rax
        jl  L16
    Source line: 6
        popq    %rbp
        retq
        nopw    %cs:(%rax,%rax)

This is on an Intel Core i7, which is in the x86_64 CPU family. It only uses standard integer instructions, so beyond that it doesn't matter what the architecture is, but you can get different results for some code depending on the specific architecture of your machine, since JIT code can be different on different systems. The pushq and movq instructions at the beginning are a standard function preamble, saving the frame pointer to the stack; correspondingly, popq restores it and retq returns from the function; nopw is a multi-byte no-op, included just to pad the length of the function. So the meat of the code is just this:

        movl    $1, %ecx
        movl    $1, %edx
        nop
    L16:
        movq    %rdx, %rax
    Source line: 4
        movq    %rcx, %rdx
        addq    %rax, %rdx
        movq    %rax, %rcx
    Source line: 3
        cmpq    %rdi, %rax
        jl  L16

The movl instructions at the top initialize registers with 1 values. The movq instructions move values between registers and the addq instruction adds registers. The cmpq instruction compares two registers and jl either jumps back to L16 or continues to return from the function. This handful of integer machine instructions in a tight loop is exactly what executes when your Julia function call runs, presented in slightly more pleasant human-readable form. It's easy to see why it runs fast.
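If you want to put a number on that, here's a rough sketch using the third-party BenchmarkTools package (an assumption – it is not part of Base and must be installed first):

    # Requires: ] add BenchmarkTools
    using BenchmarkTools

    @btime nextfib(123)   # times the handful of loop iterations shown above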

If you're interested in JIT compilation in general as compared to interpreted implementations, Eli Bendersky has a great pair of blog posts where he goes from a simple interpreter implementation of a language to a (simple) optimizing JIT for the same language:

  1. http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-1-an-interpreter/
  2. http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-2-an-x64-jit.html
StefanKarpinski answered Oct 13 '22 13:10