
Haskell: How "cache" friendly is Lazy Eval / call by need

I have been studying Haskell in my spare time for a couple of months now. I'm wondering how Haskell performs on current stock hardware with regard to the memory subsystem (L1, L2, L3 caches). Can someone point me to a report or study on how cache-friendly Haskell is, given its lazy evaluation/call-by-need? Is there a way to measure how many data-cache and instruction-cache misses occur, and to see whether they are due to the lazy-evaluation nature of the language?

Thanks.

asked Dec 24 '15 by user3169543


1 Answer

This is primarily a question of runtime value representation. Most types in Haskell are "lifted", i.e., they may contain bottom. In practice, this means that even an Int is represented by a pointer, either to a machine integer or to a computation (which could diverge, i.e., be bottom).
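A minimal sketch of what "lifted" buys you (and costs you): because an Int slot may hold an unevaluated computation, it can even hold bottom without crashing, as long as nothing demands it. The names here (`lazyPair`) are just illustrative:

```haskell
-- Lifted Int: each component of the pair is a pointer, either to an
-- evaluated machine integer or to an unevaluated thunk (possibly bottom).
lazyPair :: (Int, Int)
lazyPair = (1 + 2, undefined)

main :: IO ()
main = print (fst lazyPair)
```

This prints 3: the `undefined` in the second component is a thunk that is never forced. Forcing `snd lazyPair` instead would throw. The price of this flexibility is that even "plain" Ints live behind pointers.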

Here is a quick primer on boxed vs. lifted (vs. primitive) types. E.g., Array# is not lifted, as it cannot be bottom, but it is boxed, as it may contain lifted values.
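To see the difference in practice, here is a small sketch using `Data.Array.Unboxed` from the `array` package that ships with GHC (the name `squares` is just for illustration):

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))

-- A UArray Int Int stores machine integers contiguously: the array itself
-- is a boxed heap object, but its elements are unboxed -- there are no
-- per-element pointers (and no per-element thunks) to chase.
squares :: UArray Int Int
squares = listArray (0, 9) [ i * i | i <- [0 .. 9] ]

main :: IO ()
main = print (squares ! 3)
```

This prints 9. Compared to `[Int]` or a boxed `Array Int Int`, every element access is a direct indexed load, which is exactly the memory layout caches are good at.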

So what does that have to do with non-strict evaluation? An unevaluated computation is called a "thunk". Having a linked list of pointers to Ints is far worse for cache locality than having a compact array of machine integers, and so is chasing pointers to thunks.
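The classic illustration of thunk build-up is `foldl` vs. `foldl'`. This is only a sketch of the locality argument, not a benchmark:

```haskell
import Data.List (foldl')

-- foldl leaves the accumulator as a growing chain of (+) thunks on the
-- heap -- pointer chasing when it is finally forced. foldl' forces the
-- accumulator at every step, so it stays a single evaluated Int.
lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 1000000])
```

Both compute the same sum, but without optimization `lazySum` allocates a million-node thunk chain before collapsing it, while `strictSum` keeps one hot accumulator (which GHC can typically unbox to a register).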

That is why there has been much research and engineering effort on demand analysis: if the value of a computation will be needed anyway, it can be evaluated strictly.

AFAIK, "boxity" analysis is part of demand analysis. In any case, GHC will try to get rid of pointers where possible.

But unfortunately I don't have any empirical data or studies on this.

EDIT: Some more reading on strictness/boxity analysis:

GHC -fstrictness

Demand analyzer in GHC

SO Answer by Richard Eisenberg, giving some more clarification on "levity" (lifted/unlifted), bottom and undefined.

Edit2:

Found a paper from 2002: Nethercote, Mycroft; The cache behaviour of large lazy functional programs on stock hardware

answered Oct 11 '22 by Benjamin Maurer