I'm learning Haskell currently (being a programmer by trade, but this is my first attempt at a functional language).
I want to write a function that scans a list and returns both the minimum and maximum element of that list. Sort of what the Prelude functions minimum
and maximum
do, but both at the same time. I've come up with the following code:
import Data.List -- Declaration of rand minMax :: [Int] -> Maybe (Int, Int) minMax [] = Nothing minMax (x:xs) = Just (foldl' f (x, x) xs) where f (a, b) c = (if c < a then c else a, if c > b then c else b)
rand
is a function that generates an infinite list of numbers. The thing is that when I append the following main
function:
main = print $ minMax $ take 1000000 $ rand 7666532
compile and run all this with profiling, it shows me it uses over 200 MB of memory, so it's definitely not a constant-space function (which I'd like it to be).
The question is why and what should I change to fix it. As I understand foldl'
folds the list from left (same way it's generated) and is not lazy, so I don't see why the memory usage is so high. I'm pretty sure it's the minMax
function that is incorrect, as simply printing the said list, using
main = print $ take 1000000 $ rand 7666532
gives me 1MB usage, something that I understand and expect.
Constant space means that the amount of space that your algorithm uses is independent of the input parameters. Say you are given an array of size n. If the amount of space your algorithm uses scales with n, then it's not constant.
'Constant extra space' usually means the solution containing several variables, the amount of them is not depend on what the input is.
O(1) Complexity: We consider constant space complexity when the program doesn't contain any loop, recursive function, or call to any other functions.
of 0 vote. a space complexity of O(1) means that the space required by the algorithm to process data is constant; it does not grow with the size of the data on which the algorithm is operating.
Note that foldl'
forces the accumulator to weak head normal form. Since the accumulator is a tuple it does not force the evaluation of the two elements of the tuple.
If you explicitly force the two elements you get a constant-space function:
f (a, b) c = a `seq` b `seq` (if c < a then c else a, if c > b then c else b)
In your original program you are building a tuple of the kind: (<thunk>, <thunk>)
and every time f
is applied you simply build a tuple with bigger and bigger thunks. When finally this is consumed by print
the call to show
forces the full evaluation of the tuple and all the comparisons are made at that point.
Using seq
you instead force f
to evaluate the comparison at that moment, and thus the thunks contained in the accumulator are evaluated before performing the comparison. Hence the result is that the thunks stored in the accumulator have constant size.
What foldl'
does is simply avoid building the thunk: f (f (f ...) y) x
.
An alternative solution, as suggested by Jubobs, to avoid explicitly using seq
is to use a data type that has strict fields:
data Pair a b = Pair !a !b deriving Show
And so the code would become:
minMax :: [Int] -> Maybe (Pair Int Int) minMax [] = Nothing minMax (x:xs) = Just (foldl' f (Pair x x) xs) where f (Pair a b) c = Pair (if c < a then c else a) (if c > b then c else b)
This avoids thunks altogether.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With