Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Clarification on optimizing Single-constructor datatypes in GHC

Tags:

haskell

ghc

I was reading about how to optimize my Haskell code and came across a note about Single-constructor datatypes in GHC.

Excerpt:

GHC loves single-constructor datatypes, such as tuples. A single-constructor datatype can be unpacked when it is passed to a strict function. For example, given this function:

f (x,y) = ...

GHC's strictness analyser will detect that f is strict in its argument, and compile the function like this:

f z = case z of (x,y) -> f' x y
f' x y = ...

where f is called the wrapper, and f' is called the worker. The wrapper is inlined everywhere, so for example if you had a call to f like this:

... f (3,4) ...

this will end up being compiled to

... f' 3 4 ...

and the tuple has been completely optimised away.

Does this mean I should go through my program and wrap up all function arguments into one tuple? I don't really see how this is an optimization when the tuple gets unwrapped anyway.

Is this an alternative to the INLINE pragma? Should I use both? Only one? Is one better?

like image 791
paperduck Avatar asked Jun 27 '19 07:06

paperduck


1 Answers

I don't really see how this is an optimization when the tuple gets unwrapped anyway.

That is the optimisation: that the tuple gets unwrapped. IOW, the program that eventually runs never contains a tuple at all, it just contains a two-argument function call.

One might also put this in more pessimistic terms: ab initio, tuples are inherently bad for performance. That's because a tuple argument requires three pointer indirections: a thunk for the entire tuple, a thunk for the fst element, and a thunk for the snd element. So in principle, for performance it's a very bad idea to wrap your data into tuples. (Better put it in data structs with strict fields.) That is, of course, unless you really do need laziness at all of these spots.

However, and that's what the quote is all about, in practice it's generally still fine to use tuples in GHC, because it can usually optimise the indirection away if it can prove that it isn't actually needed.

like image 76
leftaroundabout Avatar answered Nov 18 '22 18:11

leftaroundabout