Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to derive FRP from Directed Acyclic Graphs?

I am currently researching for my next project. This is in a pre-planning phase so this question is just to get an overview on existing technology.

Setup

I have a directed acyclic graph (DAG) with multiple inputs and output, think artificial neuronal network for now:

Directed acyclic graph

The common way to process such a structure is to process the whole network on every (time-)step. I believe that is the method used by frp libraries such as netwire.

Now I am in the fortunate position that I have a stream of events that each presents the change in one of the input nodes. The idea is that I probably don't have to step each node in the network if I can statically know that a given change will only affect a part of it.

Example

In the image above 5, 7 and 3 are inputs, 11 and 8 are 'hidden' and 2, 9 and 10 are output nodes. A change at node 5 will only affect node 11 and in effect nodes 2, 9 and 10. I won't need to process nodes 7, 3 and 8.

The goal

Process this kind of network with as little latency as possible. The size of the graphs will probably reach up to 100k nodes, with a moderate amount of calculation being done per node.

The plan

I hope that someone will step up and advertise library X that just gets the job done.

Otherwise my current plan is to derive a calculation per input node from the graph description. Probably I will use the Par monad so that I won't have to deal with data dependencies myself and still get to benefit from multicore machines.

The questions

  • Is there any library out there that just does what I need?
  • Is my Par plan feasible? How much does this depend on the amount of processing needed in each node?
like image 867
fho Avatar asked Sep 11 '14 12:09

fho


1 Answers

Problems like this are usually encoded against either an Applicative or Arrow abstraction. I'll only discuss Applicative. The Applicative type class, found in Control.Applicative, allows values and functions to be provided via pure and functions to be applied to values with <*>.

class Functor f => Applicative f where
    -- | Lift a value.
    pure :: a -> f a

    -- | Sequential application.
    (<*>) :: f (a -> b) -> f a -> f b

Your example graph could be very simply encoded for an Applicative (replacing each node with addition) as

example1 :: (Applicative f, Num a) => f a -> f a -> f a -> f (a, a, a)
example1 five seven three = 
    let
        eleven = pure (+) <*> five   <*> seven
        eight  = pure (+) <*> seven  <*> three
        two    = pure id  <*> eleven
        nine   = pure (+) <*> eleven <*> eight
        ten    = pure (+) <*> eleven <*> three
    in
        pure (,,) <*> two <*> nine <*> ten

The same encoding can be created programmatically from the representation of a graph by traversing the graph such that each node will be visited after all of its dependencies.

There are three optimizations you may desire that can't be implemented for a network encoded using only Applicative. The general strategy is to encode the problem in terms of Applicative and a few additional classes as needed for optimization or extra features. You then provide one or more interpreters that implement the necessary classes. This lets you separate the problem from the implementation, allowing you to write your own interpreter or use an existing library like reactive, reactive-banana, or mvc-updates. I am not going to discuss how to write these interpreters or adapt the representation given here to a specific interpreter. I'm only going to discuss the common representation of the program that's needed in order for the underlying interpreter to be able to exploit these optimizations. All three of the libraries I linked can avoid recomputing values and can accommodate the following optimizations.

Observable Sharing

In the previous example, the intermediate node eleven is only defined once, but is used in three different places. An implementation of Applicative won't be able to see through the referential transparency to see that these three elevens are all the same. You can either assume that the implementation uses some IO magic to peek through referential transparency or define the network so that an implementation can see that a computation is being reused.

The following is the class of Applicative Functors where the result of a computation can be divided and reused in multiple computations. This class isn't defined in a library anywhere that I know of.

class Applicative f => Divisible f where
    (</>) :: f a -> (f a -> f b) -> f b

infixl 2 </>

Your example can then be very simply encoded for a Divisible Functor as

example2 :: (Divisible f, Num a) => f a -> f a -> f a -> f (a, a, a)
example2 five seven three = 
    pure (+) <*> five   <*> seven </> \eleven ->
    pure (+) <*> seven  <*> three </> \eight  ->
    pure id  <*> eleven           </> \two    ->
    pure (+) <*> eleven <*> eight </> \nine   ->
    pure (+) <*> eleven <*> three </> \ten    ->
    pure (,,) <*> two <*> nine <*> ten

Sums and Abelian Groups

A typical neuron computes a weighted sum of its inputs and applies a response function to that sum. For a neuron with a large in degree, summing all of its inputs is time consuming. An easy optimization for updating a sum is to subtract the old value and add the new value. This exploits three properties of addition:

Inverse - a * b * b⁻¹ = a Subtraction is the inverse of adding a number, this inverse allows us to remove the previously added number from the total

Commutativity - a * b = b * a Addition and subtraction reach the same result regardless of the order they are performed in. This lets us reach the same result when we subtract the old value and add the new value to the total even if the old value wasn't the most recently added value.

Associativity - a * (b * c) = (a * b) * c Addition and subtraction can be performed in arbitrary groupings and still reach the same result. This lets us reach the same result when we subtract the old value and add the new value to the total even it the old value was added somewhere in the middle of the additions.

Any structure with these properties as well as closure and an identity is an Abelian group. The following dictionary holds enough information for an underlying library to perform the same optimization

data Abelian a = Abelian {
    id  :: a,
    inv :: a -> a,
    op  :: a -> a -> a
}

This is the class of structures that can total Abelian groups

class Total f where 
    total :: Abelian a -> [f a] -> f a

Similar optimization is possible for the construction of maps.

Thresholding and Equality

Another typical operation in neural networks is to compare a value to a threshold and determine the response entirely based on whether or not the value passed the threshold. If an update to an input doesn't change which side of the threshold the value falls on, the response doesn't change. If the response doesn't change, there's no reason to recalculate all of the downstream nodes. The ability to either detect that there was no change to the threshold Bool or the response is equality. The following is the class of structures that can exploit equality. stabilize provides the Eq instance to the underlying structure.

class Stabilizes f where
    stabilize :: Eq a => f a -> f a
like image 160
Cirdec Avatar answered Oct 20 '22 10:10

Cirdec