What is the difference between the following two definitions?
cp [] = [[]]
cp (xs:xss) = [x:ys | x <- xs, ys <- cp xss]
----------------------------------------------
cp [] = [[]]
cp (xs:xss) = [x:ys | x <- xs, ys <- yss]
  where yss = cp xss
Sample output: cp [[1,2,3],[4,5]] => [[1,4],[1,5],[2,4],[2,5],[3,4],[3,5]]
According to Thinking Functionally With Haskell (p. 92), the second version is "a more efficient definition...[which] guarantees that cp xss is computed just once," though the author never explains why. I would have thought they were equivalent.
The two definitions are equivalent in the sense that they denote the same value, of course.
Operationally, they differ in their sharing behavior under call-by-need evaluation. jcast already explained why, but I want to add a shortcut that does not require explicitly desugaring the list comprehension. The rule is: any expression that is syntactically in a position where it could depend on a variable x will be recomputed each time x is bound to a value, even if the expression does not actually depend on x.
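To see where that "position" comes from, here is a sketch of how a comprehension with two generators desugars into nested concatMaps (demo and demoDesugared are names I've introduced for illustration; GHC's actual translation differs in detail but has the same scoping):

```haskell
-- A comprehension [e | x <- xs, y <- ys] desugars (roughly) into
-- nested concatMaps. Everything to the right of `x <- xs` ends up
-- under the lambda \x -> ..., i.e. in a position that *could*
-- depend on x.
demo :: [Int] -> [Int] -> [Int]
demo xs ys = [x * y | x <- xs, y <- ys]

demoDesugared :: [Int] -> [Int] -> [Int]
demoDesugared xs ys = concatMap (\x -> concatMap (\y -> [x * y]) ys) xs
```

In the desugared form it is visible that the second generator's source sits inside \x -> ..., which is exactly why call-by-need re-evaluates it for each binding of x.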
In your case, in the first definition, x is in scope in the position where cp xss appears, so cp xss will be re-evaluated for each element x of xs. In the second definition, cp xss appears outside the scope of x, so it is computed just once.
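Desugaring both definitions makes the scope difference explicit (a sketch; cp1 and cp2 are names I've introduced for the two versions):

```haskell
-- First definition, desugared: cp1 xss appears under \x -> ...,
-- so it is re-evaluated for each element x of xs.
cp1 :: [[a]] -> [[a]]
cp1 []       = [[]]
cp1 (xs:xss) = concatMap (\x -> map (x:) (cp1 xss)) xs

-- Second definition, desugared: yss is bound outside the lambda,
-- so under call-by-need a single evaluation of cp2 xss is shared
-- across every x.
cp2 :: [[a]] -> [[a]]
cp2 []       = [[]]
cp2 (xs:xss) = concatMap (\x -> map (x:) yss) xs
  where yss = cp2 xss
```

Both agree on the sample input: cp1 [[1,2,3],[4,5]] and cp2 [[1,2,3],[4,5]] each give [[1,4],[1,5],[2,4],[2,5],[3,4],[3,5]]; only the amount of recomputation differs.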
Then the usual disclaimers apply, namely:
The compiler is not required to adhere to the operational semantics of call-by-need evaluation, only to the denotational semantics. So it might compute things fewer times (floating out) or more times (floating in) than you would expect based on the above rule.
It's not true in general that more sharing is better. In this case, for example, it's probably not better, because the size of cp xss grows as quickly as the amount of work it took to compute it in the first place. In that situation, the cost of reading the value back from memory can exceed that of recomputing it (due to the cache hierarchy and the GC).