Over on Code Review, I answered a question about a naive Haskell fizzbuzz solution by suggesting an implementation that iterates forward, avoiding the quadratic cost of the increasing number of primes and discarding modulo division (almost) entirely. Here's the code:
fizz :: Int -> String
fizz = const "fizz"
buzz :: Int -> String
buzz = const "buzz"
fizzbuzz :: Int -> String
fizzbuzz = const "fizzbuzz"
fizzbuzzFuncs = cycle [show, show, fizz, show, buzz, fizz, show, show, fizz, buzz, show, fizz, show, show, fizzbuzz]
toFizzBuzz :: Int -> Int -> [String]
toFizzBuzz start count =
let offsetFuncs = drop (mod (start - 1) 15) fizzbuzzFuncs
in take count $ zipWith ($) offsetFuncs [start..]
As a further prompt, I suggested rewriting it using Data.List.unfoldr
. The unfoldr
version is an obvious, simple modification to this code so I'm not going to type it here unless people seeking to answer my question insist that is important (no spoilers for the OP over on Code Review). But I do have a question about the relative efficiency of the unfoldr
solution compared to the zipWith
one. While I am no longer a Haskell neophyte, I am no expert on Haskell internals.
An unfoldr
solution does not require the [start..]
infinite list, since it can simply unfold from start
. My thoughts are
zipWith
solution does not memoize each successive element of [start..]
as it is asked for. Each element is used and discarded because no reference to the head of [start..] is kept. So there is no more memory consumed there than with unfoldr
.unfoldr
and recent patches to make it always inlined are conducted at a level which I have not yet reached.So I think the two are equivalent in memory consumption but have no idea about the relative performance. Hoping more informed Haskellers can direct me towards an understanding of this.
unfoldr
seems a natural thing to use to generate sequences, even if other solutions are more expressive. I just know I need to understand more about it's actual performance. (For some reason I find foldr
much easier to comprehend on that level)
Note: unfoldr
's use of Maybe
was the first potential performance issue that occurred to me, before I even started investigating the issue (and the only bit of the optimisation/inlining discussions that I fully understood). So I was able to stop worrying about Maybe
right away (given a recent version of Haskell).
As the one responsible for the recent changes in the implementations of zipWith
and unfoldr
, I figured I should probably take a stab at this. I can't really compare them so easily, because they're very different functions, but I can try to explain some of their properties and the significance of the changes.
unfoldr
The old version of unfoldr
(before base-4.8
/GHC 7.10) was recursive at the top level (it called itself directly). GHC never inlines recursive functions, so unfoldr
was never inlined. As a result, GHC could not see how it interacted with the function it was passed. The most troubling effect of this was that the function passed in, of type (b -> Maybe (a, b))
, would actually produce Maybe (a, b)
values, allocating memory to hold the Just
and (,)
constructors. By restructuring unfoldr
as a "worker" and a "wrapper", the new code allows GHC to inline it and (in many cases) fuse it with the function passed in, so the extra constructors are stripped away by compiler optimizations.
For example, under GHC 7.10, the code
module Blob where
import Data.List
bloob :: Int -> [Int]
bloob k = unfoldr go 0 where
go n | n == k = Nothing
| otherwise = Just (n * 2, n+1)
compiled with ghc -O2 -ddump-simpl -dsuppress-all -dno-suppress-type-signatures
leads to the core
$wbloob :: Int# -> [Int]
$wbloob =
\ (ww_sYv :: Int#) ->
letrec {
$wgo_sYr :: Int# -> [Int]
$wgo_sYr =
\ (ww1_sYp :: Int#) ->
case tagToEnum# (==# ww1_sYp ww_sYv) of _ {
False -> : (I# (*# ww1_sYp 2)) ($wgo_sYr (+# ww1_sYp 1));
True -> []
}; } in
$wgo_sYr 0
bloob :: Int -> [Int]
bloob =
\ (w_sYs :: Int) ->
case w_sYs of _ { I# ww1_sYv -> $wbloob ww1_sYv }
The other change to unfoldr
was rewriting it to participate in "fold/build" fusion, an optimization framework used in GHC's list libraries. The idea of both "fold/build" fusion and the newer, differently balanced, "stream fusion" (used in the vector
library) is that if a list is produced by a "good producer", transformed by "good transformers", and consumed by a "good consumer", then the list conses never actually need to be allocated at all. The old unfoldr
was not a good producer, so if you produced a list with unfoldr
and consumed it with, say, foldr
, the pieces of the list would be allocated (and immediately become garbage) as computation proceeded. Now, unfoldr
is a good producer, so you can write a loop using, say, unfoldr
, filter
, and foldr
, and not (necessarily) allocate any memory at all.
For example, given the above definition of bloob
, and a stern {-# INLINE bloob #-}
(this stuff is a bit fragile; good producers sometimes need to be inlined explicitly to be good), the code
hooby :: Int -> Int
hooby = sum . bloob
compiles to the GHC core
$whooby :: Int# -> Int#
$whooby =
\ (ww_s1oP :: Int#) ->
letrec {
$wgo_s1oL :: Int# -> Int# -> Int#
$wgo_s1oL =
\ (ww1_s1oC :: Int#) (ww2_s1oG :: Int#) ->
case tagToEnum# (==# ww1_s1oC ww_s1oP) of _ {
False -> $wgo_s1oL (+# ww1_s1oC 1) (+# ww2_s1oG (*# ww1_s1oC 2));
True -> ww2_s1oG
}; } in
$wgo_s1oL 0 0
hooby :: Int -> Int
hooby =
\ (w_s1oM :: Int) ->
case w_s1oM of _ { I# ww1_s1oP ->
case $whooby ww1_s1oP of ww2_s1oT { __DEFAULT -> I# ww2_s1oT }
}
which has no lists, no Maybe
s, and no pairs; the only allocation it performs is the Int
used to store the final result (the application of I#
to ww2_s1oT
). The entire computation can reasonably be expected to be performed in machine registers.
zipWith
zipWith
has a bit of a weird story. It fits into the fold/build framework a bit awkwardly (I believe it works quite a bit better with stream fusion). It is possible to make zipWith
fuse with either its first or its second list argument, and for many years, the list library tried to make it fuse with either, if either was a good producer. Unfortunately, making it fuse with its second list argument can make a program less defined under certain circumstances. That is, a program using zipWith
could work just fine when compiled without optimization, but produce an error when compiled with optimization. This is not a great situation. Therefore, as of base-4.8
, zipWith
no longer attempts to fuse with its second list argument. If you want it to fuse with a good producer, that good producer had better be in the first list argument.
Specifically, the reference implementation of zipWith
leads to the expectation that, say, zipWith (+) [1,2,3] (1 : 2 : 3 : undefined)
will give [2,4,6]
, because it stops as soon as it hits the end of the first list. With the previous zipWith
implementation, if the second list looked like that but was produced by a good producer, and if zipWith
happened to fuse with it rather than the first list, then it would go boom.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With