Why does `data` cause an infinite loop while `newtype` does not?

Tags:

haskell

newtype

arrows

I am learning Arrow by following the tutorial "Programming with Arrows". I typed in the following code from the paper, except that SF is defined with data rather than with newtype as in the paper (I made this change by accident, since I typed the code from memory):

import Control.Category
import Control.Arrow
import Prelude hiding (id, (.))

data SF a b = SF { runSF :: [a] -> [b] }  -- this is the change: data instead of newtype as in the paper

-- The following code is the same as in the paper
instance Category SF where
  id = SF $ \x -> x
  (SF f) . (SF g) = SF $ \x -> f (g x)

instance Arrow SF where
  arr f = SF $ map f
  first (SF f) = SF $ unzip >>> first f >>> uncurry zip

instance ArrowChoice SF where
  left (SF f) = SF $ \xs -> combine xs (f [y | Left y <- xs])
    where
      combine (Left _ : ys) (z:zs) = Left z : combine ys zs
      combine (Right y : ys) zs = Right y : combine ys zs
      combine [] _ = []

delay :: a -> SF a a
delay x = SF $ init . (x:)

mapA :: ArrowChoice a => a b c -> a [b] [c]
mapA f = arr listcase >>>
         arr (const []) ||| (f *** mapA f >>> arr (uncurry (:)))

listcase :: [a] -> Either () (a, [a])
listcase [] = Left ()
listcase (x:xs) = Right (x, xs)

When I load the file in ghci and run runSF (mapA (delay 0)) [[1,2,3],[4,5,6]], it goes into an infinite loop and eventually runs out of memory. If I change data back to newtype, everything works. The same problem occurs with GHC 8.0.2, 8.2.2 and 8.6.3.

The problem persists even when I compile the code into an executable.

I had thought that the only difference between data and newtype, when defining a type with a single field, was the runtime cost. But this problem suggests there is more to it. Or there may be something I haven't noticed about the Arrow type class.

Does anyone have any ideas? Thanks very much!

asked Apr 01 '19 05:04

Z-Y.L

1 Answer

Let's look at this example.

data A = A [Int]
    deriving (Show)

cons :: Int -> A -> A
cons x (A xs) = A (x:xs)

ones :: A
ones = cons 1 ones

We would expect that ones should be A [1,1,1,1...], because all we have done is wrap a list in a data constructor. But we would be wrong. Recall that pattern matches are strict for data constructors. That is, cons 1 undefined = undefined rather than A (1 : undefined). So when we try to evaluate ones, cons pattern matches on its second argument, which causes us to evaluate ones... we have a problem.

newtypes don't do this. At runtime newtype constructors are invisible, so it's as if we had written the equivalent program on plain lists

cons :: Int -> [Int] -> [Int]
cons x ys = x:ys

ones = cons 1 ones

which is perfectly productive, since when we try to evaluate ones, there is a : constructor between us and the next evaluation of ones.
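To see the difference in isolation, here is a minimal sketch. The names `D`, `N`, `matchD`, `matchN`, `consN`, `onesN` and `unN` are made up for illustration and do not appear in the question. It assumes nothing beyond the Prelude.

```haskell
-- Hypothetical wrapper types, used only for this illustration.
data    D = D [Int]
newtype N = N [Int]

-- Matching on a data constructor forces the argument to weak head
-- normal form, so the constructor must actually be there.
matchD :: D -> Bool
matchD (D _) = True

-- Matching on a newtype constructor forces nothing: the constructor
-- has no runtime representation, so the pattern always succeeds.
matchN :: N -> Bool
matchN (N _) = True

-- The same shape as cons/ones above, but through the newtype.
consN :: Int -> N -> N
consN x (N xs) = N (x : xs)

onesN :: N
onesN = consN 1 onesN

unN :: N -> [Int]
unN (N xs) = xs
```

In ghci, `matchN undefined` returns `True`, while `matchD undefined` throws the `undefined` exception. Likewise `take 3 (unN onesN)` gives `[1,1,1]`, because `consN` never has to evaluate `onesN` before producing the first cons cell.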

You can get back the newtype semantics by making your data constructor pattern matches lazy:

cons x ~(A xs) = A (x:xs)
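Put together as a complete file, the lazy version might look like the sketch below. `consLazy`, `onesLazy` and `takeA` are hypothetical names chosen so they do not clash with the definitions above; `takeA` is just a helper for inspecting a finite prefix.

```haskell
data A = A [Int]
    deriving (Show)

-- The ~ makes the match irrefutable: the argument is only evaluated
-- when xs is actually demanded, not when consLazy is called.
consLazy :: Int -> A -> A
consLazy x ~(A xs) = A (x : xs)

onesLazy :: A
onesLazy = consLazy 1 onesLazy

-- Helper (an assumption of this sketch) to look at a finite prefix.
takeA :: Int -> A -> [Int]
takeA n (A xs) = take n xs
```

With this, `takeA 4 onesLazy` evaluates to `[1,1,1,1]` instead of looping.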

This is the problem with your code (I have run into this exact problem doing this exact thing). There are a few reasons data pattern matches are strict by default; the most compelling one I see is that pattern matching would otherwise be impossible if the type had more than one constructor. There is also a small runtime overhead to lazy pattern matching in order to fix some subtle GC leaks; details linked in the comments.
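Applied to the code in the question, one way to keep the `data` declaration is to make every match on `SF` lazy. This is a sketch, not the paper's code: the only changes from the question are the three `~` patterns in `(.)`, `first` and `left`. The other `Arrow` and `ArrowChoice` methods are left to their defaults in `Control.Arrow`, which are built from these.

```haskell
import Control.Category
import Control.Arrow
import Prelude hiding (id, (.))

-- Still a data declaration, as in the question.
data SF a b = SF { runSF :: [a] -> [b] }

instance Category SF where
  id = SF $ \x -> x
  -- Lazy patterns: composing no longer forces either argument.
  ~(SF f) . ~(SF g) = SF $ \x -> f (g x)

instance Arrow SF where
  arr f = SF $ map f
  -- Lazy pattern: the argument is forced only when the stream is run.
  first ~(SF f) = SF $ unzip >>> first f >>> uncurry zip

instance ArrowChoice SF where
  -- Lazy pattern: f is forced only if the input contains a Left.
  left ~(SF f) = SF $ \xs -> combine xs (f [y | Left y <- xs])
    where
      combine (Left _ : ys) (z:zs) = Left z : combine ys zs
      combine (Right y : ys) zs = Right y : combine ys zs
      combine [] _ = []

delay :: a -> SF a a
delay x = SF $ init . (x:)

-- Unchanged from the question. The recursive mapA f is now only
-- evaluated when the stream actually reaches it.
mapA :: ArrowChoice a => a b c -> a [b] [c]
mapA f = arr listcase >>>
         arr (const []) ||| (f *** mapA f >>> arr (uncurry (:)))

listcase :: [a] -> Either () (a, [a])
listcase [] = Left ()
listcase (x:xs) = Right (x, xs)
```

With these changes `runSF (mapA (delay 0)) [[1,2,3],[4,5,6]]` terminates and gives `[[0,0,0],[1,2,3]]`. Unless there is a specific reason to keep `data`, switching back to `newtype` is the simpler fix, since it gives the same behaviour without annotating each pattern.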

answered Oct 16 '22 13:10

luqui
