Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

non-lawful Monoid instances for building up AST not considered harmful?

I've seen a data type defined like the following with a corresponding Monoid instance:

data Foo where
  FooEmpty :: String -> Foo
  FooAppend :: Foo -> Foo -> Foo

-- | Create a 'Foo' with a specific 'String'.
foo :: String -> Foo
foo = FooEmpty

instance Monoid Foo where
  mempty :: Foo
  mempty = FooEmpty ""

  mappend :: Foo -> Foo -> Foo
  mappend = FooAppend

You can find the full code in a gist on Github.

This is how Foo can be used:

exampleFoo :: Foo
exampleFoo =
  (foo "hello" <> foo " reallylongstringthatislong") <>
  (foo " world" <> mempty)

exampleFoo ends up as a tree that looks like this:

FooAppend
  (FooAppend
    (FooEmpty "hello")
    (FooEmpty " reallylongstringthatislong"))
  (FooAppend
    (FooEmpty " world")
    (FooEmpty ""))

Foo can be used to turn sequences of Monoid operations (mempty and mappend) into an AST. This AST can then be interpreted into some other Monoid.

For instance, here is a translation of Foo into a String that makes sure the string appends will happen optimally:

fooInterp :: Foo -> String
fooInterp = go ""
  where
    go :: String -> Foo -> String
    go accum (FooEmpty str) = str ++ accum
    go accum (FooAppend foo1 foo2) = go (go accum foo2) foo1

This is really nice. It is convenient that we can be sure String appends will happen in the right order. We don't have to worry about left-associated mappends.

However, the one thing that worries me is that the Monoid instance for Foo is not a legal Monoid instance.

For instance, take the first Monoid law:

mappend mempty x = x

If we let x be FooEmpty "hello", we get the following:

mappend mempty (FooEmpty "hello") = FooEmpty "hello"
mappend (FooEmpty "") (FooEmpty "hello") = FooEmpty "hello"  -- replace mempty with its def
FooAppend (FooEmpty "") (FooEmpty "hello") = FooEmpty "hello" -- replace mappend with its def

You can see that FooAppend (FooEmpty "") (FooEmpty "hello") does not equal FooEmpty "hello". The other Monoid laws also don't hold for similar reasons.

Haskellers are usually against non-lawful instances. But I feel like this is a special case. We are just trying to build up a structure that can be interpreted into another Monoid. In the case of Foo, we can make sure that the Monoid laws hold for String in the fooInterp function.

  1. Is it ever okay to use these types of non-lawful instances to build up an AST?
  2. Are there any specific problems that need to be watched for when using these types of non-lawful instances?
  3. Is there an alternative way to write code that uses something like Foo? Some way to enable interpretation of a monoidal structure instead of using mappend on a type directly?
like image 309
illabout Avatar asked Aug 25 '17 15:08

illabout


2 Answers

Quoting this answer on a similar question:

You can think of it from this alternative point of view: the law (a <> b) <> c = a <> (b <> c) doesn't specify which equality should be used, i.e. what specific relation the = denotes. It is natural to think of it in terms of structural equality, but note that very few typeclass laws actually hold up to structural equality (e.g. try proving fmap id = id for [] as opposed to forall x . fmap id x = id x).

For example, it's mostly fine if you do not export the constructors of Foo, and only export functions that, from the point of view of users, behave as if Foo were a monoid. But most of the time it is possible to come up with a representation that's structurally a monoid, good enough in practice, though maybe not as general (below, you cannot reassociate arbitrarily after the fact, because interpretation is mixed with construction).

type Foo = Endo String

foo :: String -> Foo
foo s = Endo (s <>)

unFoo :: Foo -> String
unFoo (Endo f) = f ""

(Data.Monoid.Endo)


Here is another SO question where a non-structural structure (Alternative) is considered at first.

like image 129
Li-yao Xia Avatar answered Oct 01 '22 22:10

Li-yao Xia


This will come up for most non-trivial data structures. The only exceptions I can think of off the top of my head are (some) trie-like structures.

  1. Balanced tree data structures allow multiple balancings of most values. This is true of AVL trees, red-black trees, B-trees, 2-3 finger trees, etc.

  2. Data structures designed around "rebuilding", such as Hood-Melville queues, allow variable amounts of duplication within structures representing most values.

  3. Data structures implementing efficient priority queues allow multiple arrangements of elements.

  4. Hash tables will arrange elements differently depending on when collisions occur.

None of these structures can be asymptotically as efficient without this flexibility. The flexibility, however, always breaks laws under the strictest interpretation. In Haskell, the only good way to deal with this is by using the module system to make sure no one can detect the problem. In experimental dependently typed languages, researchers have been working on things like observational type theory and homotopy type theory to find better ways to talk about "equality", but that research is pretty far from becoming practical.

like image 35
dfeuer Avatar answered Oct 01 '22 22:10

dfeuer