Well, it turns out that I have this function defined in my program code:
st_zipOp :: (a -> a -> a) -> Stream a -> Stream a -> Stream a
st_zipOp f xs ys = St.foldr (\x r -> st_map (f x) r) xs ys
It does what it appears to do: it combines two values of type Stream a (a list-like type) by repeatedly applying the inner operator f :: a -> a -> a; for each element x of the second stream, (f x) is mapped over the accumulated result. The definition is pretty straightforward.
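Since the St module is not shown here, a minimal list-backed stand-in (the Stream newtype and the stFoldr/stMap helpers below are illustrative assumptions, not the real library) makes the behaviour concrete:

```haskell
-- A minimal stand-in for the Stream type, just to illustrate what
-- st_zipOp computes; the real St module is not shown in the question.
newtype Stream a = Stream [a] deriving (Show, Eq)

stFoldr :: (a -> b -> b) -> b -> Stream a -> b
stFoldr f z (Stream xs) = foldr f z xs

stMap :: (a -> b) -> Stream a -> Stream b
stMap f (Stream xs) = Stream (map f xs)

stZipOp :: (a -> a -> a) -> Stream a -> Stream a -> Stream a
stZipOp f xs ys = stFoldr (\x r -> stMap (f x) r) xs ys

main :: IO ()
main =
  -- Every element of the second stream is folded into each element of
  -- the first: 1+20+10 = 31, 2+20+10 = 32.
  print (stZipOp (+) (Stream [1, 2]) (Stream [10, 20]))
-- prints Stream [31,32]
```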
Once I had defined the function this way, I tried this other version:
st_zipOp :: (a -> a -> a) -> Stream a -> Stream a -> Stream a
st_zipOp = St.foldr . (st_map .)
As far as I know, this is exactly the same definition as above. It is just a point-free version of the previous definition.
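Spelling out the equivalence step by step (an equational sketch, checked below on the list analogue):

```haskell
--   st_zipOp f xs ys = St.foldr (\x r -> st_map (f x) r) xs ys
--   st_zipOp f       = St.foldr (\x r -> st_map (f x) r)  -- eta-reduce xs, ys
--                    = St.foldr (\x -> st_map (f x))      -- eta-reduce r
--                    = St.foldr (st_map . f)              -- \x -> g (f x) == g . f
--   st_zipOp         = \f -> St.foldr (st_map . f)
--                    = St.foldr . (st_map .)              -- \f -> g (h f) == g . h

-- The same derivation applies to lists, which we can test directly:
opEvery, opEvery' :: (a -> a -> a) -> [a] -> [a] -> [a]
opEvery f xs ys = foldr (\x r -> map (f x) r) xs ys
opEvery' = foldr . (map .)

main :: IO ()
main = print (opEvery (+) [1 .. 4] [1 .. 4] == opEvery' (+) [1 .. 4] [1 .. 4])
-- prints True
```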
However, I wanted to check whether there was any performance difference, and I found that the point-free version did indeed make the program run slightly worse (in both time and memory).
Why is this happening? If it is necessary, I can write a test program that reproduces this behavior.
I am compiling with -O2, if that makes a difference.
I wrote the following code to try to reproduce the behaviour described above. I used lists this time, and the change in performance was less noticeable, but still present. This is the code:
opEvery :: (a -> a -> a) -> [a] -> [a] -> [a]
opEvery f xs ys = foldr (\x r -> map (f x) r) xs ys
opEvery' :: (a -> a -> a) -> [a] -> [a] -> [a]
opEvery' = foldr . (map .)
main :: IO ()
main = print $ sum $ opEvery (+) [1..n] [1..n]
  where
    n :: Integer
    n = 5000
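As a sanity check on what this test program computes: the fold adds every element of the second list to each element of the first, so the grand total has a closed form we can verify (a small sketch, repeating opEvery so it stands alone):

```haskell
opEvery :: (a -> a -> a) -> [a] -> [a] -> [a]
opEvery f xs ys = foldr (\x r -> map (f x) r) xs ys

main :: IO ()
main = do
  let n = 100 :: Integer
      s = sum [1 .. n]
  -- Each element i of [1..n] becomes i + sum [1..n], so the total is
  -- sum [1..n] + n * sum [1..n] = (n + 1) * sum [1..n].
  print (sum (opEvery (+) [1 .. n] [1 .. n]) == (n + 1) * s)
-- prints True
```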
The profiling results using opEvery (explicit-arguments version):
total time = 2.91 secs (2906 ticks @ 1000 us, 1 processor)
total alloc = 1,300,813,124 bytes (excludes profiling overheads)
The profiling results using opEvery' (point-free version):
total time = 3.24 secs (3242 ticks @ 1000 us, 1 processor)
total alloc = 1,300,933,160 bytes (excludes profiling overheads)
However, I expected both versions to be equivalent (in all senses).
For the simple test case, both versions yield the same core when compiled with optimisations, but without profiling.
When compiling with profiling enabled (-prof -fprof-auto), the pointful version gets inlined, resulting in this core for the main loop:
Rec {
Main.main_go [Occ=LoopBreaker]
:: [GHC.Integer.Type.Integer] -> [GHC.Integer.Type.Integer]
[GblId, Arity=1, Str=DmdType S]
Main.main_go =
\ (ds_asR :: [GHC.Integer.Type.Integer]) ->
case ds_asR of _ {
[] -> xs_r1L8;
: y_asW ys_asX ->
let {
r_aeN [Dmd=Just S] :: [GHC.Integer.Type.Integer]
[LclId, Str=DmdType]
r_aeN = Main.main_go ys_asX } in
scctick<opEvery.\>
GHC.Base.map
@ GHC.Integer.Type.Integer
@ GHC.Integer.Type.Integer
(GHC.Integer.Type.plusInteger y_asW)
r_aeN
}
end Rec }
(you get something better without profiling).
When compiling the point-free version with profiling enabled, opEvery' is not inlined, and you get
Main.opEvery'
:: forall a_aeW.
(a_aeW -> a_aeW -> a_aeW) -> [a_aeW] -> [a_aeW] -> [a_aeW]
[GblId,
Str=DmdType,
Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
ConLike=False, WorkFree=False, Expandable=False,
Guidance=IF_ARGS [] 80 60}]
Main.opEvery' =
\ (@ a_c) ->
tick<opEvery'>
\ (x_ass :: a_c -> a_c -> a_c) ->
scc<opEvery'>
GHC.Base.foldr
@ a_c
@ [a_c]
(\ (x1_XsN :: a_c) -> GHC.Base.map @ a_c @ a_c (x_ass x1_XsN))
Main.main4 :: [GHC.Integer.Type.Integer]
[GblId,
Str=DmdType,
Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
ConLike=False, WorkFree=False, Expandable=False,
Guidance=IF_ARGS [] 40 0}]
Main.main4 =
scc<main>
Main.opEvery'
@ GHC.Integer.Type.Integer
GHC.Integer.Type.plusInteger
Main.main7
Main.main5
When you add an {-# INLINABLE opEvery' #-} pragma, it can be inlined even when compiling for profiling, giving
Rec {
Main.main_go [Occ=LoopBreaker]
:: [GHC.Integer.Type.Integer] -> [GHC.Integer.Type.Integer]
[GblId, Arity=1, Str=DmdType S]
Main.main_go =
\ (ds_asz :: [GHC.Integer.Type.Integer]) ->
case ds_asz of _ {
[] -> lvl_r1KU;
: y_asE ys_asF ->
GHC.Base.map
@ GHC.Integer.Type.Integer
@ GHC.Integer.Type.Integer
(GHC.Integer.Type.plusInteger y_asE)
(Main.main_go ys_asF)
}
end Rec }
which is even a bit faster than the pragma-less pointful version, since it doesn't need to tick the profiling counters.
It is likely that a similar effect occurred in the Stream case.
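Concretely, the fix for the point-free version is just to attach the pragma to the definition (a sketch, using the same definition as in the question):

```haskell
{-# INLINABLE opEvery' #-}  -- lets GHC inline it even under -prof -fprof-auto
opEvery' :: (a -> a -> a) -> [a] -> [a] -> [a]
opEvery' = foldr . (map .)

main :: IO ()
main = print (sum (opEvery' (+) [1 .. 100] [1 .. 100]))
-- prints 510050, i.e. (100 + 1) * sum [1 .. 100]
```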
The takeaway: with profiling enabled, GHC may fail to inline a point-free definition, so the explicit version (or an {-# INLINABLE #-} pragma) can perform better.