Difference between concatMap f xs and concat $ map f xs?

Question

Presumably they do exactly the same thing, concatMap f xs and concat $ map f xs. Why would I choose one over another?

I imagine it may be an optimization. If so, is this still the case with GHC 7.8?

J. Abrahamson · Accepted Answer

It is the case that concatMap f xs = concat (map f xs) as you suspect. Thus, for correctness purposes you should consider them interchangeable. We can examine their definitions to learn a little more, though.

concatMap               :: (a -> [b]) -> [a] -> [b]
concatMap f             =  foldr ((++) . f) []

concat :: [[a]] -> [a]
concat = foldr (++) []

In particular, this means that concat . map f expands to foldr (++) [] . map f. Now using a thing known as the "universal property of fold" we can see that foldr g z . map f = foldr (g . f) z for any (g, z, f) such as the choice ((++), f, []) we use above. This demonstrates that concatMap f = concat . map f like we want.[0]

So why are they defined differently? Because foldr ((++) . f) [] is always going to be faster than foldr (++) [] . map f since, in a really pathological case, the latter suggests two separate recursions. Due to laziness, it's unlikely that two recursions would ever be performed, though, so what gives?

The real reason is that there are more complex fusion laws available to the compiler such as those which combine two sequential foldrs or which define interactions between foldr and unfoldr. These are kind of finicky to use as they depend upon being able to look at the surface syntax of a fragment of code and detect possible simplifications. A lot of work goes into getting consistently firing fusion laws.

But one thing we can do is encourage people to use higher order combinators with optimization laws pre-applied. Since foldr (++) [] . map f is never going to be faster than foldr ((++) . f) [] we can take a shortcut and pre-apply the universal law simplification. This will improve the likelihood of fusion laws firing elsewhere to best optimize a list production pipeline.

[0] Why does this law work? Roughly, the universal law of foldr states that if you have any function q such that q [] = z and q (a:as) = f a (q as) then that q must be and is foldr f z. Since q = foldr g z . map f can be shown to have q [] = z and q (a:as) = g (f a) (q as) then it must be a fold like foldr (g . f) z like we want.

Difference between concatMap f xs and concat $ map f xs?

Tags:

haskell

Ana

1 Answers

J. Abrahamson

Recent Activity

Donate For Us

Difference between concatMap f xs and concat $ map f xs?

Tags:

haskell

Ana

1 Answers

J. Abrahamson

Related questions

Recent Activity

Donate For Us