Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does sharing refer to in the implementation of a functional programming language

Sharing means that temporary data is stored if it is going to be used multiple times. That is, a function evaluates it's arguments only once. An example would be:

let x = sin x in x * x

What other features contribute to sharing and how would they interact with the need for practical programs to perform IO?

like image 549
user1850254 Avatar asked May 07 '13 23:05

user1850254


3 Answers

Sharing is about a kind of equality: pointer equality. In Haskell's value land (semantic interpretation) there is no such thing as sharing. Values can only be equal if they have instances of Eq and then "equality" is defined to be the binary relation (==).

Sharing thus escapes the semantic interpretation by referring to this underlying pointer equality which exists by virtue of implementation instead of semantics. This is a useful side effect sometimes, though. Unfortunately, since Haskell is defined by its semantic interpretation, use of sharing is undefined behavior. It's tied to a particular implementation. A few libraries use sharing and thus have behavior tied to GHC.

There is one built-in sharing mechanism, though. This is exposed by the StableName interface. We generate StableName a objects using makeStableName :: a -> IO (StableName a) and have an instance Eq (StableName a)—thus StableName introduces some kind of equality for any type.

StableName equality is almost pointer equality. To quote the Haddock documentation

If sn1 :: StableName and sn2 :: StableName and sn1 == sn2 then sn1 and sn2 were created by calls to makeStableName on the same object.

Note that this is just an if statement, not an if and only if. The fact that two things can be "pointer equivalent" but still have different stable names sometimes is (a) forced by the flexibility Haskell leaves to the GC and (b) a loophole that allows StableNames to exist in any Haskell implementation even if there's no such thing as "pointer equality" in the implementation whatsoever.

These StableNames still have no semantic meaning, but since they're introduced in the IO monad that's "OK". Instead, they might be said to implement an (ironically) unstable subset of the smallest (most specific) equality relation possible on any type.

like image 125
J. Abrahamson Avatar answered Oct 19 '22 19:10

J. Abrahamson


The clearest example of sharing in functional programming comes from Clean, which is based on graph rewriting. There, a computation refers to a DAG, so we can view the expression (sin x) * (sin x) as

    (*)
  /     \
sin x   sin x

Graph rewriting systems have an explicit notion of sharing, so we could express that computation as

   (*)
   / \
   \ /
  sin x

pointing the multiplication node to the same node, thereby sharing the computation of sin x. Term rewriting systems do not have so explicit a notion of sharing, but the optimization is still relevant. In GHC, one can sometimes express sharing with local variables, e.g. rewriting

f x = (sin x) * (sin x)

into

f x = sinx * sinx
  where sinx = sin x

But since the two are semantically equivalent, the compiler is free to implement both the same way, with or without sharing. By my understanding, the GHC will generally preserve sharing introduced with local variables and sometimes introduce it (adding sharing to the first), but with no formal expression of sharing in term rewriting systems either behaviour is implementation dependent (see tel's comment and answer).

Sharing touches IO because side-effecting values cannot be shared. If we consider an impure language, there is a difference between

(string-append (read-line)
               (read-line))

and

(let ((s (read-line)))
  (string-append s s))

The first executes the IO action twice, requesting two lines from the user and appending them; the second "shares" the IO action, executing it once and appending it to itself. In general, sharing a pure computation reduces execution time without changing the result of the program, while sharing a side-effecting value (one that may change over time, or interacts with the user) changes the result. For the compiler to automatically share computations, it needs to know that they are pure, and thus that reducing the number of evaluations does not matter. Clean does this with uniqueness types; an IO action has the type attribute UNQ, which tells the compiler that it should not be shared. Haskell does the same somewhat differently with the IO monad.

like image 40
isturdy Avatar answered Oct 19 '22 18:10

isturdy


Your example is not an example of sharing -- it just multiplies a value with itself (and then throws the original value away).

Sharing is the case where some part of a data structure occurs twice in a larger data structure, or in different data structures. For example:

p  = (1, 2)
pp = (p, p)  -- p is shared between the first and second component of pp

xs = [1, 2, 3]
ys = 0::1::xs
zs = 5::xs  -- ys and zs share the same tail

In memory, such sharing will result in a DAG structure.

like image 40
Andreas Rossberg Avatar answered Oct 19 '22 20:10

Andreas Rossberg