As in this code:
import Data.Char
groupsOf _ [] = []
groupsOf n xs =
take n xs : groupsOf n ( tail xs )
problem_8 x = maximum . map product . groupsOf 5 $ x
main = do t <- readFile "p8.log"
let digits = map digitToInt $concat $ lines t
print $ problem_8 digits
I realize that in Haskell many programs construct and seem to store some intermediate results, such as the groupsOf
list in the above code. The above code is the reference answer for Euler project problem 8, taken from Haskell website. The original question ask for the largest product of 5 consecutive digits in a very long chain of digits such as 45343231635723421312443535767983456
.
So the code will calculate all the products and store it as a list. In other languages I think they will only keep a temporary largest value and discard anything smaller.
So does groupsOf
really stores all the intermediate results? what if the problem scales up? Will it allocate too much memory?
No, not because of groupsOf
. That one will only need to keep one group in memory at a time. However, maximum
might build up a large thunk since it's too lazy, so make sure to either compile with -O
or -O2
so that strictness analysis is performed, or use foldl1' max
instead1.
1foldl1'
is found in Data.List
I realize that in Haskell many programs construct and seem to store some intermediate results, such as the groupsOf list in the above code.
"seem to" is a good thing to say here, because in all honesty this is where Haskell can really shine. It can, in fact, take up much less memory, without complicated micromanagement on your part, thanks to laziness.
One nifty thing about lazy IO (e.g. readFile
) is that it will read only as much of the file as necessary, and the file can be garbage-collected as you consume its contents. (Well, roughly so. It depends on how you set the buffering.)
Here's a rough sketch of how your program will be executed:
t <- readFile "p8.log"
Create a thunk for the t :: String
. Probably check that the file exists, and open it in read mode.
let digits = map digitToInt $concat $ lines t
Create a thunk for digits :: [Int]
print $ problem_8 digits
Once we attempt to execute this, all of the work will actually get done. We need to fully evaluate problem_8 digits :: Int
in order to print it. So we create a thunk for problem_8 digits
and force it.
maximum . map product . groupsOf 5 $ digits
At this point, maximum
needs the first two elements of map product . groupsOf 5 $ digits
in order to see which of the two is greater. Therefore map product
needs to see the first two elements of groupsOf 5 digits
to pass along.
Now, groupsOf 5
will need the first 5 elements of digits
in order to produce its first element. At this point, the first line of the file will probably be read, and go through the defined transformations. groupsOf
can construct its first element, and probably its second element (assuming there are more than 6 characters on that line). groupsOf 5 digits
passes its first two elements up the chain, we map product
onto those two things, and then maximum
checks which of the two is larger. We keep the result, the larger of the two.
At this point (in fact somewhat earlier than this point) we can entirely discard the intermediate results. The first two elements of groupOf 5 digits
are now entirely unnecessary; we will never need to inspect them again, and the garbage collector can collect them anytime. The first two characters of that file will not be used anymore either, and can be discarded.
Now, this is extremely handwavey and possibly slightly inaccurate. But do you get the idea? Haskell is lazy (well technically Haskell is "non-strict", which is generally implemented via laziness), which makes its execution style much different than any strict language. This allows us to use what appears to be tons of intermediate representations and data structures, without actually taking up tons of memory. Good compilers, such as GHC, can optimize memory usage like you wouldn't believe.
I'm not sure what you mean by "intermediate results" here, exactly... But your code looks fine to me. Specifically, your implementation of groupsOf
is tail recursive modulo cons, so I don't think you'll have to worry about stack overflow, if that's what you're concerned about.
I might be misunderstanding your question, though. Could you clarify?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With