Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Proc syntax in Haskell Arrows leads to severe performance penalty

Tags:

haskell

arrows

Using proc notation for Arrow seems to kill performance in my project. Here is a toy example of the problem:

We define Coroutine newtype (mostly copying from Generalizing Streams into Coroutines) to represent Mealy machines (i.e. functions that carry some state) with instances of Category and Arrow, write scan wrapper function and evalList runner function for lists.

Then we have sumArr and sumArr' functions where the latter is the former called within proc block.

Compiling with stack ghc -- --make test.hs -O2 using ghc-8.0.2 on OS X I get runtime of 0.087 secs for sumArr and 3.263 secs for sumArr' (with a heavy memory footprint).

I would like to know if this in fact caused by using proc and if I can do something to have normal runtime behaviour when using proc notation (writing arrow code without it is painful). Thank you.

{-# LANGUAGE Arrows #-}
{-# LANGUAGE BangPatterns #-}

import Prelude hiding (id, (.))
import Control.Arrow
import Control.Category
import qualified Data.List as L

newtype Coroutine i o = Coroutine { runC :: i -> (o, Coroutine i o) }

instance Category Coroutine where
    id = Coroutine $ \i -> (i, id)

    cof . cog = Coroutine $ \i ->
        let (x, cog') = runC cog i
            (y, cof') = runC cof x
        in (y, cof' . cog')

instance Arrow Coroutine where
    arr f = Coroutine $ \i -> (f i, arr f)

    first co = Coroutine $ \(a,b) ->
        let (c, co') = runC co a in ((c,b), first co')

scan :: (o -> t -> o) -> o -> Coroutine t o
scan f = go where
    go i = Coroutine $ step i where
            step a b = let !a' = f a b in (a', go a')

evalList :: Coroutine a b -> [a] -> [b]
evalList a = L.map fst . L.drop 1 . L.scanl' (\(_, acc) v -> let !x = runC acc v in x) (undefined, a)

sumArr, sumArr' :: Coroutine Int Int
sumArr = scan (\acc x -> let !newAcc = acc + x in newAcc) 0
sumArr' = proc v -> do sumArr -< v

testData :: [Int]
testData = [1..1000000]

main = print $ L.last $ evalList sumArr' testData
like image 765
Artem Solod Avatar asked Jul 22 '17 23:07

Artem Solod


People also ask

What does -> mean in Haskell?

(->) is often called the "function arrow" or "function type constructor", and while it does have some special syntax, there's not that much special about it. It's essentially an infix type operator. Give it two types, and it gives you the type of functions between those types.

What are arrows in Haskell?

The Arrow (either (->) or MyArr ) is an abstraction of a computation. For a function b -> c , b is the input and c is the output. For a MyArr b c , b is the input and c is the output. 2) To actually run an arrow computation, you use a function specific to your arrow type.


1 Answers

Yeah, this is probably caused by proc notation. The desugaring is very low-level, introducing a lot of (needless) arrs and not taking advantage of &&& or *** at all.

For example, last I checked, this:

mulA f g = proc x -> do
  a <- f -< x
  b <- g -< x
  returnA -< a * b

Is desugared to something like this:

mulA f g = arr dup
  >>> first f
  >>> arr swap
  >>> first g
  >>> arr mul
  where
    dup x = (x, x)
    swap (x, y) = (y, x)
    mul = uncurry (*)

When it could be just this:

mulA f g = f &&& g >>> arr mul

And this:

proc x -> do
  a <- f -< x
  b <- g -< a
  returnA -< b

Becomes something like this:

arr id
  >>> f
  >>> arr id
  >>> g
  >>> arr id
  >>> returnA

Instead of this:

f >>> g

Moreover I don’t think there are any GHC rewrite rules that take advantage of the arrow laws to help account for this.

like image 159
Jon Purdy Avatar answered Nov 14 '22 01:11

Jon Purdy