Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does "floated out" mean?

Tags:

On the Haskell wiki I read that this:

fib =     let fib' 0 = 0         fib' 1 = 1         fib' n = fib (n - 1) + fib (n - 2)     in  (map fib' [0 ..] !!) 

is more efficient than this:

 fib x =     let fib' 0 = 0         fib' 1 = 1         fib' n = fib (n - 1) + fib (n - 2)     in  map fib' [0 ..] !! x 

Because, "In the second case fib' is (re-)defined for every argument x, thus it cannot be floated out."

I don't understand what this means.

  1. What does "floated out" mean? How is it an optimization?
  2. Why is fib' being redefined for each invocation of fib?
  3. Is this an eta-expansion or not?
like image 665
Pubby Avatar asked Jan 08 '12 17:01

Pubby


People also ask

What does it mean if something is floated?

: to rest on the surface of or be suspended in a fluid. : to drift on or through or as if on or through a fluid. yellow leaves floated down. : wander.

What is the example of float?

float verb (MOVE ON LIQUID) to stay or move easily on or over the surface of a liquid, or to cause something to move in this way: [ I ] An empty bottle will float on water.

What does a float?

The term float refers to the regular shares a company has issued to the public that are available for investors to trade. This figure is derived by taking a company's outstanding shares and subtracting any restricted stock, which is stock that is under some sort of sales restriction.


2 Answers

It's not a very good explanation.

"Floated out" simply means that in:

\x -> let y = ... in z 

if ... does not mention x then it can be floated out of the lambda:

let y = ... in \x -> z 

which means it will only be computed once,1 which could save a lot of time if ... is expensive. However, GHC is conservative about performing optimisations like this, since they can introduce space leaks. (Though it does do so for the second definition if you give it a type signature, as Daniel Fischer points out in his answer.)

This isn't about automatic optimisation, though. The first snippet defines fib' outside of the lambda, whereas the second defines it inside (the lambda is implicit in fib x = ..., which is equivalent to fib = \x -> ...), which is what the quote is saying.

Even that's not really relevant, however; what's relevant is that in the first snippet, map fib' [0 ..] occurs outside the lambda, and so its result is shared among all applications of the lambda (in that code, the "lambda" arises from the partial application of (!!)). In the latter, it's inside the lambda, and so likely to be recomputed for every application of fib.

The end result is that the former implementation caches the values and so is far more efficient than the latter. Note that the first snippet's efficiency is dependent on the fact that fib' doesn't recurse directly, but instead through fib, and therefore benefits from the memoisation.

It's related to eta-expansion; the latter snippet is an eta-expansion of the first. But the statement you quoted doesn't explain what's going on at all.

1 Note that this is implementation-specific behaviour, and not part of Haskell's semantics. However, all reasonable implementations will behave in this manner.

like image 116
ehird Avatar answered Sep 19 '22 19:09

ehird


ehird's answer explains things very well, however there's one point

The end result is that the former implementation caches the values and so is far more efficient than the latter.

that is sometimes wrong.

If you compile the module containing either definition with optimisations (I checked only -O2, not -O1, and of course only GHC), there are several cases to consider:

  1. the first definition without type signature
  2. the second definition with type signature fib :: Int -> Integer
  3. the first definition with polymorphic type fib :: Num a => Int -> a
  4. the second definition without type signature
  5. the second definition with type signature fib :: Num a => Int -> a

In case 1, the monomorphism restriction produces the type fib :: Int -> Integer and the list map fib' [0 .. ] is shared across all invocations of fib. That means if you ever query fib (10^6), you have a list of the first million (+1) Fibonacci numbers in memory, and it will be collected only when the garbage collector can determine it's no longer used. That's often a memory leak.

In case 2, the result(ing core) is practically identical to case 1.

In case 4, the list isn't shared across different top-level invocations of fib (of course; the result can have many types, so there'd be many lists to share), but it's instantiated once per top-level invocation and reused for the calls from fib', so calculating fib n requires O(n) additions and O(n^2) steps through the list. That's not too bad. The list is collected when the calculation is complete, so no space leak.

Case 3 is practically identical to 4.

Case 5, however, is worse than the naive recursion. Since it's explicitly polymorphic and the list is bound inside the lambda, the list cannot be reused for the recursive calls, each recursive call creates a new list ...

like image 21
Daniel Fischer Avatar answered Sep 17 '22 19:09

Daniel Fischer