I have been looking at LLVM lately, and I find it to be quite an interesting architecture. However, looking through the tutorial and the reference material, I can't see any examples of how I might implement a string data type.
There is a lot of documentation about integers, reals, and other number types, and even arrays, functions and structures, but AFAIK nothing about strings. Would I have to add a new data type to the backend? Is there a way to use built-in data types? Any insight would be appreciated.
StringRef - Represent a constant reference to a string, i.e. More... #include "llvm/ADT/StringRef.h"
LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes. LLVM.
Tradeoff #4: LLVM and poor LLVM IR generationrustc uses LLVM to generate code. LLVM can generate very fast code, but it comes at a cost. LLVM is a very big system. In fact, LLVM code makes up the majority of the Rust codebase.
How A LLVM Compiler Works. On the front end, the LLVM compiler infrastructure uses clang — a compiler for programming languages C, C++ and CUDA — to turn source code into an interim format. Then the LLVM clang code generator on the back end turns the interim format into final machine code.
What is a string? An array of characters.
What is a character? An integer.
So while I'm no LLVM expert by any means, I would guess that if, eg, you wanted to represent some 8-bit character set, you'd use an array of i8 (8-bit integers), or a pointer to i8. And indeed, if we have a simple hello world C program:
#include <stdio.h> int main() { puts("Hello, world!"); return 0; }
And we compile it using llvm-gcc and dump the generated LLVM assembly:
$ llvm-gcc -S -emit-llvm hello.c $ cat hello.s ; ModuleID = 'hello.c' target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" target triple = "x86_64-linux-gnu" @.str = internal constant [14 x i8] c"Hello, world!\00" ; <[14 x i8]*> [#uses=1] define i32 @main() { entry: %retval = alloca i32 ; <i32*> [#uses=2] %tmp = alloca i32 ; <i32*> [#uses=2] %"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0] %tmp1 = getelementptr [14 x i8]* @.str, i32 0, i64 0 ; <i8*> [#uses=1] %tmp2 = call i32 @puts( i8* %tmp1 ) nounwind ; <i32> [#uses=0] store i32 0, i32* %tmp, align 4 %tmp3 = load i32* %tmp, align 4 ; <i32> [#uses=1] store i32 %tmp3, i32* %retval, align 4 br label %return return: ; preds = %entry %retval4 = load i32* %retval ; <i32> [#uses=1] ret i32 %retval4 } declare i32 @puts(i8*)
Notice the reference to the puts function declared at the end of the file. In C, puts is
int puts(const char *s)
In LLVM, it is
i32 @puts(i8*)
The correspondence should be clear.
As an aside, the generated LLVM is very verbose here because I compiled without optimizations. If you turn those on, the unnecessary instructions disappear:
$ llvm-gcc -O2 -S -emit-llvm hello.c $ cat hello.s ; ModuleID = 'hello.c' target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" target triple = "x86_64-linux-gnu" @.str = internal constant [14 x i8] c"Hello, world!\00" ; <[14 x i8]*> [#uses=1] define i32 @main() nounwind { entry: %tmp2 = tail call i32 @puts( i8* getelementptr ([14 x i8]* @.str, i32 0, i64 0) ) nounwind ; <i32> [#uses=0] ret i32 0 } declare i32 @puts(i8*)
[To follow up on other answers which explain what strings are, here is some implementation help]
Using the C interface, the calls you'll want are something like:
LLVMValueRef llvmGenLocalStringVar(const char* data, int len) { LLVMValueRef glob = LLVMAddGlobal(mod, LLVMArrayType(LLVMInt8Type(), len), "string"); // set as internal linkage and constant LLVMSetLinkage(glob, LLVMInternalLinkage); LLVMSetGlobalConstant(glob, TRUE); // Initialize with string: LLVMSetInitializer(glob, LLVMConstString(data, len, TRUE)); return glob; }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With