Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Lazy vs Strict implementations of data structures

There's a list of data structures having lazy and strict implementations:

  • Data.Map.Lazy and Data.Map.Strict
  • Data.IntMap.Lazy and Data.IntMap.Strict
  • Data.HashMap.Lazy and Data.HashMap.Strict
  • Data.ByteString.Lazy and Data.ByteString.Strict
  • Data.Text.Lazy and Data.Text

What are the strengths and weaknesses of those implementations and what are the rules to follow when choosing a specific one?

like image 623
Nikita Volkov Avatar asked Apr 30 '13 09:04

Nikita Volkov


People also ask

When to use lazy evaluation?

"Lazy" evaluation is performing operations when and as they are needed. It is useful when it is a feature of a programming language or library because it is generally harder to implement lazy evaluation on your own than simply to precalculate everything up front.

Is Haskell a lazy language?

Haskell is a lazy language. This means that the evaluation of expressions is delayed until their values are actually needed. The opposite is eager evaluation, which is what most programming languages implement, like C, Java, and Python.

Is Haskell lazy by default?

Haskell is often described as a lazy language. However, the language specification simply states that Haskell is non-strict, which is not quite the same thing as lazy.

Is Haskell strict?

Haskell is a non-strict language, and most implementations use a strategy called laziness to run your program.


2 Answers

  • Data.XYMap.Lazy and Data.XYMap.Strict

for XY in {"", "Int", "Hash"}: The *.Strict variant forces the evaluation of the mapped-to values to WHNF before they are placed in the map.

The big advantage of that is more predictable space and time behaviour, since it is much harder to build up huge thunks, especially for types of the form (ConstructorN UnboxedValueTypeN), that is impossible.

The disadvantage - I remember there were examples brought up when it was discussed whether the strict or lazy variants should become the default, but I don't remember anything in particular.

Ah, just remembered a use case: One can tie a knot with the Lazy variants, that is of course impossible with the Strict versions! So if you do such things: Lazy.

I use the Strict versions by default. Until I need to tie knots or encounter another use case where I consider the Lazy variants superior, I don't know when I would use them.

  • Data.(ByteString/Text).Lazy and Data.(ByteString/Text).Strict

The strict versions use one monolithic chunk of storage for the payload, that means you have fast random access, not only sequentially, but also backwards from the end, or jumping to and fro.

The lazy versions are basically head-strict lists of strict chunks, their strength is that sequential consumption of them can often be done in constant small memory, that is great if you need to sequentially process large files.

For small(ish) data, definitely use the Strict variants, for huge data the Lazy variants if the data is processed (more or less) sequentially.

like image 80
Daniel Fischer Avatar answered Sep 20 '22 17:09

Daniel Fischer


What are the strengths and weaknesses of those implementations and what are the rules to follow when choosing a specific one?

The strictness or laziness of the type lead to different complexity of particular operations, or modes of use.

There are no hard or fast rules - instead, you might like to think of them as entirely different data types.

If you insist on some guidelines:

  • lazy structures for data larger than memory
  • lazy structures for infrequently used data or when you use a small part of a large structure

And then:

  • strict structures if you do lots of updates
  • strict structures for small, atomic data
like image 38
Don Stewart Avatar answered Sep 18 '22 17:09

Don Stewart