
Towards understanding CodeGen* in the Haskell LLVM bindings

Background: I am writing a toy Lisp interpreter/compiler in Haskell for my own amusement/edification. I am trying to add the ability to compile to LLVM bitcode.

Context: I have been reading the documentation for LLVM.Core and a code example (here), attempting to understand the means of combination and means of abstraction (as described in Abelson and Sussman's Structure and Interpretation of Computer Programs) used in the Haskell LLVM bindings. There are a lot of small pieces, and I am not clear on how they are intended to work together. It seems like there is a level of abstraction above the basic LLVM machine instructions that is obvious to someone with lots of experience with LLVM, but not documented for those, like me, who are just getting their feet wet.

Question: What are CodeGenModule and CodeGenFunction and how are they used to build up Functions and Modules?

John F. Miller asked Jun 15 '11 18:06


1 Answer

The Module and Function types are just thin wrappers around pointers to the corresponding C++ objects (that is, Module* and Value*):

    -- LLVM.Core.Util
    newtype Module = Module {
          fromModule :: FFI.ModuleRef
        }
        deriving (Show, Typeable)

    type Function a = Value (Ptr a)

    newtype Value a = Value { unValue :: FFI.ValueRef }
        deriving (Show, Typeable)

    -- LLVM.FFI.Core
    data Module
        deriving (Typeable)
    type ModuleRef = Ptr Module

    data Value
        deriving (Typeable)
    type ValueRef = Ptr Value

The CodeGenModule and CodeGenFunction types are parts of the EDSL built on top of the LLVM.FFI.* modules. They use Function, Module and the functions from LLVM.FFI.* internally and allow you to write LLVM IR in Haskell concisely using do-notation (example taken from Lennart Augustsson's blog):

    mFib :: CodeGenModule (Function (Word32 -> IO Word32))
    mFib = do
        fib <- newFunction ExternalLinkage
        defineFunction fib $ \ arg -> do
            -- Create the two basic blocks.
            recurse <- newBasicBlock
            exit <- newBasicBlock

            [...]
            ret r
        return fib
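To make the two-level structure more concrete, here is a toy model of the idea, using only base. It is a sketch, not the library's actual implementation: Builder, ToyFunction, ToyModule, toyAdd, toyRet, toyDefine and toyModule are all hypothetical names invented for this illustration, and instructions are represented as plain strings. The point is only that a function-level builder accumulates instructions, and a module-level builder accumulates finished functions, which is roughly the relationship between CodeGenFunction and CodeGenModule.

```haskell
-- Toy model only: none of these names come from the llvm package.

-- A minimal hand-rolled state monad that accumulates a list of items.
newtype Builder w a = Builder { runBuilder :: [w] -> (a, [w]) }

instance Functor (Builder w) where
  fmap f (Builder g) = Builder $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (Builder w) where
  pure a = Builder $ \s -> (a, s)
  Builder f <*> Builder g = Builder $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (Builder w) where
  Builder g >>= k = Builder $ \s ->
    let (a, s') = g s in runBuilder (k a) s'

-- Append one item to the accumulated list.
emit :: w -> Builder w ()
emit w = Builder $ \s -> ((), s ++ [w])

-- Number of items emitted so far (used to invent fresh register names).
emitted :: Builder w Int
emitted = Builder $ \s -> (length s, s)

-- Function-level builder: accumulates instructions (here, just strings).
type ToyFunction = Builder String

-- Module-level builder: accumulates (function name, instructions) pairs.
type ToyModule = Builder (String, [String])

-- Hypothetical instruction helper: emits an add and returns the result name.
toyAdd :: String -> String -> ToyFunction String
toyAdd x y = do
  n <- emitted
  let r = "%t" ++ show n
  emit (r ++ " = add i32 " ++ x ++ ", " ++ y)
  return r

-- Hypothetical instruction helper: emits a return.
toyRet :: String -> ToyFunction ()
toyRet x = emit ("ret i32 " ++ x)

-- Loose analogue of defineFunction: run a function body and record the
-- resulting instructions in the enclosing module.
toyDefine :: String -> ToyFunction () -> ToyModule String
toyDefine name body = do
  let ((), instrs) = runBuilder body []
  emit (name, instrs)
  return name

-- Loose analogue of defineModule: run the module-level builder.
toyModule :: ToyModule a -> (a, [(String, [String])])
toyModule m = runBuilder m []

-- A module containing a single function, written with do-notation.
example :: ToyModule String
example = toyDefine "@addTwice" $ do
  t <- toyAdd "%x" "%y"
  r <- toyAdd t "%y"
  toyRet r
```

Evaluating toyModule example in GHCi yields the function name together with the list of recorded definitions. The real bindings differ in important ways (they call into LLVM through the FFI and use typed values rather than strings), so treat this only as a mental model for how the two layers nest.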

You can think of a CodeGenModule value as a description of an LLVM module, much like an AST representing a parsed LLVM assembly file (.ll). Given a CodeGenModule, you can, for example, write it to a .bc file:

    -- newModule :: IO Module
    mod <- newModule
    -- defineModule :: Module -> CodeGenModule a -> IO a
    defineModule mod $ do [...]

    -- writeBitcodeToFile :: FilePath -> Module -> IO ()
    writeBitcodeToFile "mymodule.bc" mod

    -- Alternatively, just use this function from LLVM.Util.File:
    writeCodeGenModule :: FilePath -> CodeGenModule a -> IO ()
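Putting the pieces together, a complete program might look like the sketch below. It assumes the old llvm package from Hackage (the one providing LLVM.Core and LLVM.Util.File) is installed, and it follows the arithmetic example from Lennart Augustsson's blog. The output file name addmul.bc is an arbitrary choice made here, and the exact API may vary between versions of the bindings.

```haskell
module Main (main) where

import Data.Int (Int32)
import LLVM.Core
import LLVM.Util.File (writeCodeGenModule)

-- Module-level action (CodeGenModule) that defines one function.
-- The lambda body runs at the function level (CodeGenFunction) and
-- emits the instructions for (x + y) * z.
mAddMul :: CodeGenModule (Function (Int32 -> Int32 -> Int32 -> IO Int32))
mAddMul =
  createFunction ExternalLinkage $ \x y z -> do
    t <- add x y
    r <- mul t z
    ret r

main :: IO ()
main =
  -- "addmul.bc" is a file name chosen for this sketch.
  writeCodeGenModule "addmul.bc" mAddMul
```

If this builds against your version of the bindings, running llvm-dis on the resulting file should show the generated IR, which is a convenient way to check what the EDSL actually produced.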

I also recommend acquainting yourself with the core classes of LLVM, since they show through in the Haskell API.

Mikhail Glushenkov answered Oct 12 '22 13:10