Laziness/strictness between data and newtype

Tags: haskell, lazy-evaluation, algebraic-data-types, newtype

I'm struggling to understand why these two snippets produce different results under the so-called "poor man's strictness analysis".

The first example uses data (assuming a correct Applicative instance):

data Parser t a = Parser {
        getParser ::  [t] -> Maybe ([t], a) 
    }

> getParser (pure (,) <*> literal ';' <*> undefined ) "abc"
*** Exception: Prelude.undefined

The second uses newtype. There is no other difference:

newtype Parser t a = Parser {
        getParser ::  [t] -> Maybe ([t], a) 
    }

> getParser (pure (,) <*> literal ';' <*> undefined ) "abc"
Nothing

literal x is a parser that succeeds consuming one token of input if its argument matches the first token. So in this example, it fails since ; doesn't match a. However, the data example still sees that the next parser is undefined, while the newtype example doesn't.

I've read this, this, and this, but don't understand them well enough to get why the first example is undefined. It seems to me that in this example, newtype is more lazy than data, the opposite of what the answers said. (At least one other person has been confused by this too).

Why does switching from data to newtype change the definedness of this example?

Here's another thing I discovered: with this Applicative instance, the data parser above outputs undefined:

instance Applicative (Parser s) where
  Parser f <*> Parser x = Parser h
    where
      h xs = 
        f xs >>= \(ys, f') -> 
        x ys >>= \(zs, x') ->
        Just (zs, f' x')

  pure a = Parser (\xs -> Just (xs, a))

whereas with this instance, the data parser above does not output undefined (assuming a correct Monad instance for Parser s):

instance Applicative (Parser s) where
  f <*> x =
      f >>= \f' ->
      x >>= \x' ->
      pure (f' x')

  pure a = Parser (\xs -> Just (xs, a))

Full code snippet:

import Control.Applicative
import Control.Monad (liftM)

data Parser t a = Parser {
        getParser ::  [t] -> Maybe ([t], a) 
    }


instance Functor (Parser s) where
  fmap = liftM

instance Applicative (Parser s) where
  Parser f <*> Parser x = Parser h
    where
      h xs = f xs >>= \(ys, f') -> 
        x ys >>= \(zs, x') ->
        Just (zs, f' x')

  pure = return


instance Monad (Parser s) where
  Parser m >>= f = Parser h
    where
      h xs =
          m xs >>= \(ys,y) ->
          getParser (f y) ys

  return a = Parser (\xs -> Just (xs, a))


literal :: Eq t => t -> Parser t t
literal x = Parser f
  where
    f (y:ys)
      | x == y = Just (ys, x)
      | otherwise = Nothing
    f [] = Nothing


asked Nov 26 '12 14:11

Matt Fenwick

2 Answers

As you probably know, the main difference between data and newtype is that with data the data constructors are lazy while with newtype the data constructors are strict, i.e. given the following types

data    D a = D a 
newtype N a = N a

then D ⊥ `seq` x = x, but N ⊥ `seq` x = ⊥ (where ⊥ stands for "bottom", i.e. an undefined value or error).

What is perhaps less commonly known, however, is that when you pattern match on these data constructors, the roles are "reversed", i.e. with

constD x (D y) = x
constN x (N y) = x

then constD x ⊥ = ⊥ (strict), but constN x ⊥ = x (lazy).

This is what's happening in your example.

Parser f <*> Parser x = Parser h where ...

With data, the pattern match in the definition of <*> will diverge immediately if either of the arguments is ⊥, but with newtype the constructors are ignored and it is just as if you'd written

f <*> x = h where

which will only diverge for x = ⊥ if x is demanded.
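To make the four cases above concrete, here is a small sketch that can be loaded into GHCi. The helper names isDefined and checks are made up for this illustration (they are not part of the answer); isDefined forces a value to weak head normal form and reports whether doing so threw an exception, which stands in for ⊥ here.

```haskell
import Control.Exception (SomeException, evaluate, handle)

data    D a = D a
newtype N a = N a

constD :: b -> D a -> b
constD x (D _) = x

constN :: b -> N a -> b
constN x (N _) = x

-- Hypothetical helper: force a value to weak head normal form and
-- report whether that succeeded (True) or threw an exception (False).
isDefined :: a -> IO Bool
isDefined v = handle onError (evaluate v >> return True)
  where
    onError :: SomeException -> IO Bool
    onError _ = return False

-- The four cases from the answer, in the order they were stated.
checks :: IO [Bool]
checks = mapM isDefined
  [ D undefined `seq` ()   -- data constructor is lazy: defined
  , N undefined `seq` ()   -- N undefined is just undefined: ⊥
  , constD () undefined    -- matching on D forces the argument: ⊥
  , constN () undefined    -- matching on N forces nothing: defined
  ]
```

Evaluating checks in GHCi should give [True,False,False,True], matching the strict/lazy labels above.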


answered Oct 20 '22 02:10

hammar

The difference between data and newtype is that data is "lifted" and newtype isn't. That means a data type has an extra ⊥ -- in this case, it means that undefined /= Parser undefined. When your Applicative code pattern-matches on Parser x, it forces the argument in order to find the constructor, and if the argument is undefined, forcing it yields ⊥.

When you pattern-match on a data constructor, it's evaluated and taken apart to make sure it's not ⊥. For example:

λ> data Foo = Foo Int deriving Show
λ> case undefined of Foo _ -> True
*** Exception: Prelude.undefined

So pattern matching on a data constructor is strict, and will force it. A newtype, on the other hand, is represented in exactly the same way as the type its constructor wraps. So matching on a newtype constructor does absolutely nothing:

λ> newtype Foo = Foo Int deriving Show
λ> case undefined of Foo _ -> True
True

There are probably two ways to change your data program such that it doesn't crash. One would be to use an irrefutable pattern match in your Applicative instance, which will always "succeed" (but using the matched values anywhere later might fail). Every newtype match behaves like an irrefutable pattern (since there's no constructor to match on, strictness-wise).

λ> data Foo = Foo Int deriving Show
λ> case undefined of ~(Foo _) -> True
True
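Applied to the parser from the question, the irrefutable-pattern approach might look like the sketch below. The Functor instance and the name example are my own additions for illustration, and literal is restated in a slightly shorter form; only the ~ patterns in <*> are the actual change being demonstrated.

```haskell
data Parser t a = Parser { getParser :: [t] -> Maybe ([t], a) }

instance Functor (Parser t) where
  fmap g (Parser p) = Parser (\xs -> fmap (fmap g) (p xs))

instance Applicative (Parser t) where
  pure a = Parser (\xs -> Just (xs, a))
  -- The ~ makes both matches irrefutable: neither argument is forced
  -- until f or x is actually used inside h.
  (<*>) ~(Parser f) ~(Parser x) = Parser h
    where
      h xs = do
        (ys, f') <- f xs
        (zs, x') <- x ys
        Just (zs, f' x')

literal :: Eq t => t -> Parser t t
literal x = Parser f
  where
    f (y:ys) | x == y = Just (ys, x)
    f _               = Nothing

-- Hypothetical name: the expression from the question, with the result
-- type pinned down so that it can be printed and compared.
example :: Maybe (String, (Char, ()))
example = getParser (pure (,) <*> literal ';' <*> undefined) "abc"
```

With this instance, example evaluates to Nothing even though Parser is declared with data: literal ';' fails first, so the undefined parser is never taken apart.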

The other would be to use Parser undefined instead of undefined:

λ> case Foo undefined of Foo _ -> True
True

This match will succeed, because there is a valid Foo value that's being matched on. It happens to contain undefined, but that's not relevant since we don't use it -- we only look at the topmost constructor.
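The same idea can be tried on the question's parser without touching the strict Applicative instance. In the sketch below, the Functor instance and the names wrapped and bare are assumptions added for illustration; the <*> definition keeps the strict matches from the question.

```haskell
data Parser t a = Parser { getParser :: [t] -> Maybe ([t], a) }

instance Functor (Parser t) where
  fmap g (Parser p) = Parser (\xs -> fmap (fmap g) (p xs))

instance Applicative (Parser t) where
  pure a = Parser (\xs -> Just (xs, a))
  -- Strict constructor matches, as in the question.
  Parser f <*> Parser x = Parser h
    where
      h xs = do
        (ys, f') <- f xs
        (zs, x') <- x ys
        Just (zs, f' x')

literal :: Eq t => t -> Parser t t
literal x = Parser f
  where
    f (y:ys) | x == y = Just (ys, x)
    f _               = Nothing

-- The constructor is present, so the match in <*> succeeds; the
-- undefined function inside it is never called because literal ';' fails.
wrapped :: Maybe (String, (Char, ()))
wrapped = getParser (pure (,) <*> literal ';' <*> Parser undefined) "abc"

-- No constructor to find: forcing this hits undefined in the match.
bare :: Maybe (String, (Char, ()))
bare = getParser (pure (,) <*> literal ';' <*> undefined) "abc"
```

Here wrapped evaluates to Nothing, while evaluating bare raises the Prelude.undefined exception shown in the question.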

In addition to all the links you gave, you might find this article relevant.

answered Oct 20 '22 01:10

shachaf
