Possible Duplicate:
Small Haskell program compiled with GHC into huge binary
Recently I noticed how large Haskell executables are. Everything below was compiled with GHC 7.4.1 and -O2 on Linux.
Hello World (main = putStrLn "Hello World!") is over 800 KiB. Running strip over it reduces the file size to 500 KiB; even adding -dynamic to the compilation doesn't help much, leaving me with a stripped executable of around 400 KiB.
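Sizes like the ones above can be checked from Haskell itself. The following is only a minimal sketch, not something from the original post: the helper names toKiB and fileSize are made up for illustration, and it relies solely on withBinaryFile and hFileSize from System.IO in base.

```haskell
-- Hypothetical helper (not part of the original post): print the size of
-- each file named on the command line, in bytes and in KiB.
import System.Environment (getArgs)
import System.IO (IOMode (ReadMode), hFileSize, withBinaryFile)

-- Convert a byte count to KiB (1 KiB = 1024 bytes).
toKiB :: Integer -> Double
toKiB bytes = fromIntegral bytes / 1024

-- Open the file read-only and ask the handle for its size in bytes.
fileSize :: FilePath -> IO Integer
fileSize path = withBinaryFile path ReadMode hFileSize

main :: IO ()
main = do
  paths <- getArgs
  mapM_ report paths
  where
    -- Print one line per file: path, raw bytes, and the KiB figure.
    report path = do
      bytes <- fileSize path
      putStrLn (path ++ ": " ++ show bytes ++ " bytes ("
                ++ show (toKiB bytes) ++ " KiB)")
```

Assuming the sketch is saved as Size.hs and the Hello World binary is called hello (both names are placeholders), running it on hello before and after strip should show the kind of drop described above.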
Compiling a very primitive example involving Parsec yields a 1.7 MiB file.
-- File: test.hs
import qualified Text.ParserCombinators.Parsec as P
import Data.Either (either)
-- Parses a string of type "x y" to the tuple (x,y).
testParser :: P.Parser (Char, Char)
testParser = do
  a <- P.anyChar
  P.char ' '
  b <- P.anyChar
  return (a, b)
-- Parse, print result.
str = "1 2"
main = print $ either (error . show) id . P.parse testParser "" $ str
-- Output: ('1','2')
Parsec may be a larger library, but I'm only using a tiny subset of it, and indeed the optimized core code generated by the above is dramatically smaller than the executable:
$ ghc -O2 -ddump-simpl -fforce-recomp test.hs | wc -c
49190 (bytes)
Therefore, it's not the case that a huge amount of Parsec is actually found in the program, which was my initial assumption.
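To put the two figures side by side, here is a rough back-of-the-envelope calculation. It is only a sketch using the numbers quoted above; the names are invented for illustration, and since the Core dump is text rather than machine code, the result is an order-of-magnitude comparison at best.

```haskell
-- Rough arithmetic on the figures quoted in the question; nothing is measured here.

-- Size of the textual Core dump reported by wc -c.
coreDumpBytes :: Double
coreDumpBytes = 49190

-- Size of the Parsec executable: 1.7 MiB expressed in bytes.
executableBytes :: Double
executableBytes = 1.7 * 1024 * 1024

-- How many times larger the executable is than the Core dump.
ratio :: Double
ratio = executableBytes / coreDumpBytes

main :: IO ()
main = putStrLn ("executable / core dump ~ " ++ show ratio)
```

The ratio comes out at roughly 36, so even this verbose textual dump accounts for only a few percent of the executable's size.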
Why are the executables of such an enormous size? Is there something I can do about it (except dynamic linking)?
Since Haskell executables are statically linked by default, all the transitive dependencies are included in the output. This 103-megabyte binary is big enough that transferring it over the network takes a non-trivial amount of time whenever it has to be downloaded or uploaded.
GHC itself takes 270 MB, and with all the libraries and utilities that come together it takes over 500 MB.
Regarding what Haskell is compiled to: As the documentation you quoted says, GHC compiles Haskell to native code. It can do so by either directly emitting native code or by first emitting LLVM code and then letting LLVM compile that to native code. Either way the end result of running GHC is a native executable.
The compiler (itself written in Haskell) translates Haskell to C, assembly, LLVM bitcode and other formats. The strategy it uses is best described in the paper "Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine".
To effectively reduce the size of executables produced by the Glasgow Haskell Compiler, focus on the -dynamic option passed to ghc: library code is then no longer bundled into the final executable, which links against shared (dynamic) libraries instead. This requires shared versions of GHC's libraries to be installed on the system! With this option, the simple hello world example has a final size of 9 KiB and the Parsec test about 28 KiB (both 64-bit Linux executables), which I find quite small and acceptable for such a high-level language implementation.