How do laziness and parallelism coexist in Haskell?

People argue that Haskell has an advantage in parallelism because of its immutable data structures. But Haskell is also lazy, which means data actually is mutated under the hood: a thunk is overwritten with its evaluated result.

So it seems laziness could undermine the advantage of immutability. Am I wrong, or does Haskell have countermeasures for this problem? Or is this simply how Haskell works by design?

asked Aug 12 '19 by damhiya

People also ask

Why is Haskell a lazy language?

Haskell is a lazy language. This means that the evaluation of expressions is delayed until their values are actually needed. The opposite is eager evaluation, which is what most programming languages implement, like C, Java, and Python.
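As a minimal illustration (not part of the original answer), an infinite list is only computed as far as it is demanded:

```haskell
-- Laziness lets us define an infinite list; only the demanded
-- prefix is ever computed.
naturals :: [Integer]
naturals = [0 ..]

main :: IO ()
main = print (take 5 naturals)  -- prints [0,1,2,3,4]
```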

Does Haskell support lazy processing?

Haskell uses a special form of evaluation called lazy evaluation. In lazy evaluation, no code is evaluated until it's needed. In the case of longList, none of the values in the list were needed for the computation.

How does lazy evaluation work in Haskell?

Lazy evaluation is a method to evaluate a Haskell program. It means that expressions are not evaluated when they are bound to variables, but their evaluation is deferred until their results are needed by other computations.

Does Haskell support lazy evaluation?

Haskell is a lazy language, meaning that it employs lazy evaluation. Before explaining lazy evaluation, let's first explain strict evaluation, which most readers will likely be more familiar with. Strict evaluation means that arguments to functions are evaluated prior to being passed to the functions.
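Putting these snippets together, a small sketch (using Debug.Trace purely to make evaluation visible) shows both the deferral and the one-time update the question asks about: the thunk is evaluated on first demand, and the result is shared afterwards.

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
  -- Binding x builds a thunk; nothing is evaluated yet.
  let x = trace "evaluating x" (2 + 2 :: Int)
  putStrLn "x is bound but not yet evaluated"
  print x  -- first demand: the trace fires (on stderr), then 4 is printed
  print x  -- the thunk has been overwritten with 4; no trace this time
```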


1 Answer

Yes, GHC’s RTS uses thunks to implement non-strict evaluation, and they use mutation under the hood, so they require some synchronisation. However, this is simplified due to the fact that most heap objects are immutable and functions are referentially transparent.

In a multithreaded program, evaluation of a thunk proceeds as follows:

  • The thunk is atomically replaced with a BLACKHOLE object

  • If the same thread attempts to force the thunk after it’s been updated to a BLACKHOLE, this represents an infinite loop, and the RTS throws an exception (<<loop>>)

  • If a different thread attempts to force the thunk while it’s a BLACKHOLE, it blocks until the original thread has finished evaluating the thunk and updated it with a value

  • When evaluation is complete, the original thread atomically replaces the thunk with its result, e.g. using a compare-and-swap (CAS) instruction

So there is a potential race here: if two threads attempt to force the same thunk at the same time, they may both begin evaluating it. In that case, they will do some redundant work—however, one thread will succeed in overwriting the BLACKHOLE with the result, and the other thread will simply discard the result that it calculated, because its CAS will fail.
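A hedged sketch of this behaviour (compile with -threaded to get real parallelism; the names here are illustrative, not from the answer): two threads force the same shared thunk, and regardless of which one "wins" the update, both observe the same value.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  let x = sum [1 .. 100000 :: Integer]  -- one shared, unevaluated thunk
  m1 <- newEmptyMVar
  m2 <- newEmptyMVar
  _ <- forkIO (putMVar m1 $! x)  -- both threads demand the same thunk
  _ <- forkIO (putMVar m2 $! x)
  r1 <- takeMVar m1
  r2 <- takeMVar m2
  -- Even if both threads began evaluating, the observed result is identical.
  print (r1 == r2 && r1 == 5000050000)
```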

Safe code cannot detect this, because it can’t obtain the address of an object or determine the state of a thunk. And in practice, this type of collision is rare for a couple of reasons:

  • Concurrent code generally partitions workloads across threads in a manner suited to the particular problem, so there is low risk of overlap

  • Evaluation of thunks is generally fairly “shallow” before you reach weak head normal form, so the probability of a “collision” is low
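For instance (a sketch, again with trace used only to make evaluation observable), forcing a list to weak head normal form evaluates just the outermost constructor, not its elements:

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
  let xs = map (\n -> trace ("forcing " ++ show n) (n * 2)) [1, 2, 3 :: Int]
  -- seq forces xs only to weak head normal form: the outermost (:) cell.
  xs `seq` putStrLn "forced to WHNF; no element evaluated yet"
  print (head xs)  -- only now is the first element actually computed
```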

So thunks ultimately provide a good performance tradeoff when implementing non-strict evaluation, even in a concurrent context.

answered Sep 23 '22 by Jon Purdy