Reasoning laziness

Tags:

lazy-evaluation

I have the following snippet:

import qualified Data.Vector as V
import qualified Data.ByteString.Lazy as BL
import System.Environment
import Data.Word
import qualified Data.List.Stream as S

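-- Count how often each byte value (0-255) occurs in the input.
histogram :: [Word8] -> V.Vector Int
histogram c = V.accum (+) (V.replicate 256 0) $ S.zip (map fromIntegral c) (S.repeat 1)

-- Read the file lazily and print its byte histogram.
mkHistogram :: FilePath -> IO ()
mkHistogram file = do
  hist <- (histogram . BL.unpack) `fmap` BL.readFile file
  print hist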

I see it like this: nothing is done until printing. When printing, the thunks are unwound by first unpacking, then mapping fromIntegral over one Word8 at a time. Each of these Word8s is zipped with 1, again one value at a time. These tuples are then consumed by the accumulator function, which updates the array one tuple/Word8 at a time. Then we move on to the next thunk and repeat until there is no content left.

This would allow histograms to be built in constant memory, but that is not what happens: the program crashes with a stack overflow. When I profile it, it does run to the end, but it uses a lot of memory (300-500 MB for a 2.5 MB file). Memory grows linearly until the very end, when it is finally released, forming a "nice" triangular graph.

Where did my reasoning go wrong and what steps should I take to make this run in constant memory?


asked Mar 11 '11 15:03

Masse

1 Answer

I believe the problem is that Data.Vector is not strict in its elements. So although your reasoning is right, when accumulating the histogram your thunks look like:

<1+(1+(1+0)) (1+(1+0)) 0 0 (1+(1+(1+(1+0)))) ... >

Rather than

<3 2 0 0 4 ...>

And only when you print are those sums computed. I don't see a strict accum function in the docs (a shame), and there isn't any place to hook in a seq. One way out of this predicament may be to use Data.Vector.Unboxed instead, since unboxed types are unlifted, i.e. strict. Maybe you could request a strict accum function with your example as a use case.
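The Data.Vector.Unboxed suggestion could be sketched roughly as follows. This is a minimal sketch, not a measured fix: the names histogramU and mkHistogramU are made up for illustration, the stream-fusion zip is replaced by a plain list comprehension, and whether memory use really stays flat on your file is something to confirm with the profiler.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Vector.Unboxed as U
import Data.Word (Word8)

-- Same shape as the histogram in the question, but on an unboxed vector.
-- An unboxed vector stores raw Ints, so each updated count has to be
-- evaluated when it is written; no chain of (1+(1+...)) thunks can form.
-- (histogramU is a hypothetical name, not part of any library.)
histogramU :: [Word8] -> U.Vector Int
histogramU bytes =
  U.accum (+) (U.replicate 256 0) [(fromIntegral b, 1) | b <- bytes]

-- Hypothetical driver mirroring mkHistogram from the question.
mkHistogramU :: FilePath -> IO ()
mkHistogramU file = do
  hist <- (histogramU . BL.unpack) `fmap` BL.readFile file
  print hist
```

The only real change is the vector type: U.accum has the same argument order as V.accum, so the rest of the pipeline (lazy readFile, unpack, pairing each byte with 1) is left as it was.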


answered Oct 17 '22 06:10

luqui
