Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sharing vs. non-sharing fixed-point combinator

This is the usual definition of the fixed-point combinator in Haskell:

fix :: (a -> a) -> a
fix f = let x = f x in x

On https://wiki.haskell.org/Prime_numbers, they define a different fixed-point combinator:

_Y   :: (t -> t) -> t
_Y g = g (_Y g)                -- multistage, non-sharing,  g (g (g (g ...)))
    -- g (let x = g x in x)    -- two g stages, sharing

_Y is a non-sharing fixpoint combinator, here arranging for a recursive "telescoping" multistage primes production (a tower of producers).

What exactly does this mean? What is "sharing" vs. "non-sharing" in that context? How does _Y differ from fix?

like image 628
Joseph Sible-Reinstate Monica Avatar asked Dec 11 '18 00:12

Joseph Sible-Reinstate Monica


People also ask

What are combinators in lambda calculus?

A combinator is a lambda expression with no free variables. So, λx. x is a combinator but λx. y is not a combinator.

Which programming construct is made possible by the Y Combinator?

Basically, it's this: The Y combinator allows us to define recursive functions in computer languages that do not have built-in support for recursive functions, but that do support first-class functions. That's it.

What is Y Combinator in programming?

The Y combinator is a formula which lets you implement recursion in a situation where functions can't have names but can be passed around as arguments, used as return values, and defined within other functions. It works by passing the function to itself as an argument, so it can call itself.

Can you recursion in lambda calculus?

Recursive definitions can also be used to define minimalization. It turns out that every recursive definition in the lambda calculus can be ``solved'' by finding its fixed point.


2 Answers

"Sharing" means f x re-uses the x that it creates; but with _Y g = g . g . g . g . ..., each g calculates its output anew (cf. this and this).

In that context, the sharing version has much worse memory usage, leads to a space leak.1

The definition of _Y mirrors the usual lambda calculus definition's effect for the Y combinator, which emulates recursion by duplication, while true recursion refers to the same (hence, shared) entity.

In

    x      = f x
    (_Y g) = g (_Y g)

both xs refer to the same entity, but each of (_Y g)s refer to equivalent, but separate, entity. That's the intention of it, anyway.

Of course thanks to referential transparency there's no guarantee in Haskell the language for any of this. But GHC the compiler does behave this way.

_Y g is a common sub-expression and it could be "eliminated" by a compiler by giving it a name and reusing that named entity, subverting the whole purpose of it. That's why the GHC has the "no common sub-expressions elimination" -fno-cse flag which prevents this explicitly. It used to be that you had to use this flag to achieve the desired behaviour here, but not anymore. GHC won't be as aggressive at common sub-expressions elimination anymore, with the more recent (read: several years now) versions.

disclaimer: I'm the author of that part of the page you're referring to. Was hoping for the back-and-forth that's usual on wiki pages, but it never came, so my work didn't get reviewed like that. Either no-one bothered, or it is passable (lacking major errors). The wiki seems to be largely abandoned for many years now.


1 The g function involved,

(3:) . minus [5,7..] . foldr (\ (x:xs) ⟶ (x:) . union xs) [] 
                      . map (\ p ⟶ [p², p² + 2p..])

produces an increasing stream of all odd primes given an increasing stream of all odd primes. To produce a prime N in value, it consumes its input stream up to the first prime above sqrt(N) in value, at least. Thus the production points are given roughly by repeated squaring, and there are ~ log (log N) of such g functions in total in the chain (or "tower") of these primes producers, each immediately garbage collectible, the lowest one producing its primes given just the first odd prime, 3, known a priori.

And with the two-staged _Y2 g = g x where { x = g x } there would be only two of them in the chain, but only the top one would be immediately garbage collectible, as discussed at the referenced link above.

like image 187
Will Ness Avatar answered Sep 25 '22 14:09

Will Ness


_Y is translated to the following STG:

_Y f = let x = _Y f in f x

fix is translated identically to the Haskell source:

fix f = let x = f x in x

So fix f sets up a recursive thunk x and returns it, while _Y is a recursive function, and importantly it’s not tail-recursive. Forcing _Y f enters f, passing a new call to _Y f as an argument, so each recursive call sets up a new thunk; forcing the x returned by fix f enters f, passing x itself as an argument, so each recursive call is into the same thunk—this is what’s meant by “sharing”.

The sharing version usually has better memory usage, and also lets the GHC RTS detect some kinds of infinite loop. When a thunk is forced, before evaluation starts, it’s replaced with a “black hole”; if at any point during evaluation of a thunk a black hole is reached from the same thread, then we know we have an infinite loop and can throw an exception (which you may have seen displayed as Exception: <<loop>>).

like image 43
Jon Purdy Avatar answered Sep 23 '22 14:09

Jon Purdy