Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why systematically mark record fields strict in Haskell?

Tags:

haskell

I have read several times that it is good practice to systematically mark a record's fields strict in Haskell, e.g. instead of

data Foo = Foo { bar :: Bar, quux :: Quux }

do

data Foo = Foo { bar :: !Bar, quux :: !Quux}

For example here is a quote from Haskell Programming from First Principles, p.1134:

A good rule to follow is lazy in the spine, strict in the leaves!

I (think I) understand what strictness is and what is the difference between the two. What I don't understand is why do the latter systematically?

like image 521
insitu Avatar asked May 24 '18 06:05

insitu


1 Answers

A common rule of thumb is to make data structures strict when:

  • You expect to strictly traverse the whole thing and retain it, so there’s no point in the overhead of laziness; or

  • The fields are “small”—less than or equal to the size of a pointer—and you’d like the compiler to unbox them to avoid unnecessary indirections. (-funbox-small-strict-fields is on by default since GHC 7.7.)

This tends to be roughly the same as “lazy in the spine, strict in the leaves” in practice, because data structures often benefit from being lazy but their contents often don’t.

Doing this “systematically” is a naïve way to help avoid space leaks—for example, if you do repeated modification of a data structure (such as an accumulator) without forcing the value in the meantime:

ghci> data Lazy = Lazy { lazyField :: Int } deriving (Show)

ghci> data Strict = Strict { strictField :: !Int } deriving (Show)

ghci> modifyLazy (Lazy field) = Lazy { lazyField = field + 1 }

ghci> modifyStrict (Strict field) = Strict { strictField = field + 1 }

ghci> lazy = iterate modifyLazy (Lazy 0) !! 1000000

ghci> strict = iterate modifyStrict (Strict 0) !! 1000000

ghci> :set +s

ghci> lazy
Lazy {lazyField = 1000000}
(0.76 secs, 251,792,080 bytes)

ghci> strict
Strict {strictField = 1000000}
(0.52 secs, 178,173,544 bytes)

The lazy version builds up a chain of thunks before forcing it; the strict version keeps the Int field fully evaluated at each step. A similar thing happens with modifyIORef (lazy) and modifyIORef' (strict).

It’s mostly pointless to add a strictness annotation on a field with a type that’s non-strict, for example, field :: ![Int] only ensures that the first (:) or [] constructor is forced; it doesn’t make the whole list strict. If you need that, you probably want a strict sequence type like Data.Vector. And it’s generally inadvisable to make a polymorphic field strict, as in data Foo a = Foo … !a …, because someone who uses the type might expect to be able to rely on laziness there—for example, in an algorithm that uses cycles (“tying the knot”)—and it’s annoying to have to wrap a type in an extra constructor like data Lazy a = Lazy a to regain laziness. Laziness also plays better with equational reasoning—although in practice we often use “fast and loose” equational reasoning that ignores strictness and non-total functions.

Ultimately, the only way to decide whether something should be lazy or strict is to consider the semantics you need for your particular application, and add judicious strictness (bangs on fields, BangPatterns, seq, strict functions like modifyIORef') after profiling if you encounter performance problems or space leaks.

like image 105
Jon Purdy Avatar answered Nov 15 '22 08:11

Jon Purdy