Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How (or why) is `Data.Set String` not a single type?

I am studying Haskell the hard way, by trying to write something I find interesting, and right now I'm trying to figure out how to derive a Semiring in Haskell for a specific set of parsing problems:

class Semiring s where
    zero, one :: s
    mul, add  :: s -> s -> s

instance Semiring Bool where
    zero = False
    one = True
    add = (||)
    mul = (&&)

instance Semiring (Set String) where
    zero    = empty 
    one     = singleton ""
    add a b = union a b
    mul a b = Data.Set.map (\(c, d) -> c ++ d) $ cartesianProduct a b

The Bool ({true, false}, ∨, ∧, false, true) version works great. So does an Int version. The last one is called a Parse Forest, and its representation is (E, ∪, ·, ∅, {<>}), where E is a set of strings and the {<>} is the set of the empty string.

When I try to compile this, I get:

Rigge…   115  10 error           • Illegal instance declaration for ‘Semiring (Set String)’
(All instance types must be of the form (T a1 ... an)
where a1 ... an are *distinct type variables*,
and each type variable appears at most once in the instance head.

Which doesn't make that much sense to me. Set String is a distinct type, right, and all of the operations of the class Semiring should be expressible purely in terms of sets of strings.

If you want context, the project is at Rigged Regular Expressions. The Bool version merely reports back that the regex matched; an Int version reports back the number of different ways a regex could have matched (i.e "a" ~ /(a|a*)/ will return 2 because two distinct and unique subexpressions match); the ParseForest should return not the number of ways, but the set of all possible ways— but it can't, because I don't understand why I can't use a concrete data type, Set String, where another concrete data type like Int or Bool worked okay.

like image 874
Elf Sternberg Avatar asked Nov 28 '22 16:11

Elf Sternberg


1 Answers

chi's answer describes how to do this by turning on an extension, which is perfectly fine. But if you wonder how anyone could get by without this extension, there are a couple approaches.

The simplest change would be to introduce a newtype wrapper, to explicitly get rid of the type variable yourself before defining an instance.

newtype StringSet = StringSet (Set String)
instance Semiring StringSet where {...}

But of course this feels somewhat clunky and primitive.

Alternatively, it seems to me that you don't need to be so specific as String: your instance would work for any Monoid type, wouldn't it?

instance (Ord a, Monoid a) => Semiring (Set a) where
  zero = empty
  one = singleton mempty
  add = union
  mul a b = Data.Set.map (uncurry (<>)) $ cartesianProduct a b
like image 68
amalloy Avatar answered Dec 05 '22 04:12

amalloy