Does the WHNF reduction in Haskell happen at Compile time?

AFAIU the object file generated by the Haskell compiler should be machine code. So does that object file contain a representation of the original AST and reduce it at run time, or does this reduction happen at compile time, with only the final WHNF values converted to the corresponding machine code?

I understand that the compilation time in the latter case would be a function of the time complexity of the program itself, which I think is unlikely.

Can someone give a clear explanation of what happens at run time and what happens at compile time in the case of Haskell (GHC)?

asked Mar 08 '23 by abhishek
1 Answer

A compiler could do its job by deferring all reduction to runtime. That is, the resulting executable could have a (large) data section, where the whole program AST is encoded, and a (small) text/code section with a generic WHNF reducer which operates on that AST.

Note that the above approach would work in any language. E.g. a Python compiler could also generate an executable file comprising the AST data and a generic reducer. The reducer would follow the so-called small-step semantics of the language, which is a very well known notion in computer science (more specifically, in programming language theory).

However, the performance of such an approach would be quite poor.
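To make the "AST plus generic reducer" idea concrete, here is a toy sketch of a tiny expression language and a naive WHNF reducer. This is purely my own illustration (capture-avoiding substitution is glossed over by assuming unique variable names); it is not how GHC represents or runs programs:

-- A toy, untyped expression AST and a naive WHNF reducer.
data Expr
  = Var String
  | Lam String Expr
  | App Expr Expr
  | Lit Int
  | Add Expr Expr
  deriving Show

-- Substitution; we assume variable names are unique to keep the sketch short.
subst :: String -> Expr -> Expr -> Expr
subst x e (Var y)   = if x == y then e else Var y
subst x e (Lam y b) = if x == y then Lam y b else Lam y (subst x e b)
subst x e (App f a) = App (subst x e f) (subst x e a)
subst _ _ (Lit n)   = Lit n
subst x e (Add a b) = Add (subst x e a) (subst x e b)

-- Reduce only as far as weak head normal form: stop at lambdas and
-- literals, and never reduce under a binder.
whnf :: Expr -> Expr
whnf (App f a) =
  case whnf f of
    Lam x b -> whnf (subst x a b)   -- call by name: argument passed unevaluated
    f'      -> App f' a
whnf (Add a b) =
  case (whnf a, whnf b) of
    (Lit m, Lit n) -> Lit (m + n)
    (a', b')       -> Add a' b'
whnf e = e

For instance, whnf (App (Lam "x" (Add (Var "x") (Lit 1))) (Lit 2)) reduces to Lit 3, while whnf (Lam "x" (Add (Var "x") (Lit 1))) stops immediately, since a lambda is already in WHNF.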

Researchers in programming languages worked on finding better approaches, resulting in the definition of abstract machines. Essentially, an abstract machine is an algorithm for running a high-level program in a lower-level setting. Usually, it exploits a few data structures (e.g. stacks) to make the process more efficient. In the case of functional languages such as Haskell, well-known abstract machines include the following (a toy sketch of one such machine follows the list):

  • Categorical Abstract Machine (Caml originally used this, I think)
  • Krivine Machine
  • SECD
  • Spineless Tagless G-machine (GHC uses this one)
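As a flavor of what such a machine looks like, here is a miniature Krivine machine: call-by-name evaluation to WHNF of lambda terms with de Bruijn indices. Again, this is only my own toy sketch, not the machine GHC uses:

-- Terms use de Bruijn indices: Ix 0 is the innermost bound variable.
data Term = Ix Int | Lam' Term | App' Term Term
  deriving Show

type Env     = [Closure]
data Closure = Closure Term Env
  deriving Show
type Stack   = [Closure]

-- One small step of the machine; Nothing means WHNF has been reached.
step :: (Term, Env, Stack) -> Maybe (Term, Env, Stack)
step (App' t u, e, s)            = Just (t, e, Closure u e : s)   -- push the argument
step (Lam' t, e, c : s)          = Just (t, c : e, s)             -- bind the argument
step (Ix 0, Closure t e' : _, s) = Just (t, e', s)                -- look up a variable
step (Ix n, _ : e, s)            = Just (Ix (n - 1), e, s)        -- walk the environment
step _                           = Nothing

-- Run the machine to completion from an empty environment and stack.
run :: Term -> (Term, Env, Stack)
run t = go (t, [], [])
  where
    go st = maybe st go (step st)

For example, run (App' (Lam' (Ix 0)) (Lam' (Ix 0))) evaluates (\x -> x) (\y -> y) and stops at the identity lambda, which is already in WHNF.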

The problem itself is far from trivial. There has been, and I would say there still is, research on making WHNF reduction more efficient.

Each Haskell definition, after GHC compilation, becomes a sequence of assembly instructions, which manipulate the state of the STG machine. There is no AST around, only code which manipulates data / closures / etc.
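If you want to see what GHC actually produces for a definition, you can ask it to dump its intermediate representations. A small module and the relevant command line (the exact flag names may vary slightly between GHC versions) could look like this:

-- Double.hs: a small module whose compiled form we can inspect.
module Double where

double :: Int -> Int
double x = x + x

-- Compiling with something like:
--
--   ghc -O2 -ddump-simpl -ddump-stg-final -ddump-asm Double.hs
--
-- dumps the Core, the STG code, and the final assembly for `double`.
-- None of these outputs contain the source AST; they are progressively
-- lower-level code that manipulates closures and machine registers.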

Such advanced techniques, coupled with heavy optimizations, are very important for performance. A negative consequence, though, is that it becomes hard to predict the performance of the compiled code from the original source, since one needs to take into account how the abstract machine works (which is non-trivial) and the optimizations (which are quite complex nowadays). To a lesser extent, this is also the case for heavily optimizing C or C++ compilers, where it becomes harder to know whether a given optimization was triggered or not.

Ultimately, an experienced programmer (in Haskell, C, C++, or anything else) will come to understand the basic optimizations performed by their compiler, and the basic mechanisms of the abstract machine being used. However, this is not something which is easy to master, I think.

In the question, it is mentioned that WHNF reduction could be performed at compile time. This is only partially true, since the values of variables originating from IO actions cannot be known until runtime, so any reduction involving those values can only happen at runtime. Further, performing reduction at compile time can also make performance worse! E.g.

let x = complexComputation in x + x
-- vs
complexComputation + complexComputation

The latter is the result of reducing the former, but it duplicates work! Indeed, most abstract machines use a lazy reduction approach that causes x to be computed only once in such cases.
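A small experiment, using Debug.Trace purely to observe when evaluation happens, shows this sharing in action:

import Debug.Trace (trace)

main :: IO ()
main = do
  -- `trace` prints its message the first time the thunk is forced,
  -- which lets us observe how often the computation really runs.
  let x = trace "evaluating x" (sum [1 .. 1000000 :: Int])
  print (x + x)   -- "evaluating x" appears only once: x is a shared thunk,
                  -- forced on its first use and reused afterwards

Running it prints "evaluating x" a single time before the result, confirming that the let-bound thunk is evaluated once and its result reused.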

answered Apr 30 '23 by chi