Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to re-implement `Enum` deriving using GHC generics

Is it possible to re-implement deriving of Enum type class using GHC generics?

At first, it looks easy:

data Foo -- representation without metadata (wrong):
  = Foo  -- L1 U1
  | Bar  -- R1 (L1 U1)
  | Baz  -- R1 (R1 (L1 U1))
  | Quux -- R1 (R1 (R1 U1))
  deriving (Show, Eq, Generic)

-- Rep Foo p = (U1 :+: (U1 :+: (U1 :+: U1))) p

instance Enum Foo where
  toEnum   = undefined -- FIXME
  fromEnum = gfromEnum . from

class GEnum f where
  gfromEnum :: f p -> Int

instance GEnum U1 where
  gfromEnum U1 = 0

instance GEnum f => GEnum (M1 i t f) where
  gfromEnum (M1 x) = gfromEnum x

instance (GEnum x, GEnum y) => GEnum (x :+: y) where
  gfromEnum (L1 x) = gfromEnum x
  gfromEnum (R1 y) = 1 + gfromEnum y

However, this is not going to work:

λ> fromEnum Foo
0
λ> fromEnum Bar
1
λ> fromEnum Baz
1
λ> fromEnum Quux
2

This is because we can't rely on how arguments of (:+:) are grouped. In this case it seems they are nested like this:

((U1 :+: U1) :+: (U1 :+: U1)) p

So, is it possible to derive Enum using Generics? If yes, how?

like image 378
Mark Karpov Avatar asked Jan 22 '17 16:01

Mark Karpov


2 Answers

GHC derives Generic such that the L and R variants form a tree where the leaves are in Enum order. Consider the following example (with trimmed output):

ghci> data D = A | B | C | D | E deriving (Generic)
ghci> from A
L1 (L1 U1)
ghci> from B
L1 (R1 U1)
ghci> from C
R1 (L1 U1)
ghci> from D
R1 (R1 (L1 U1))
ghci> from E
R1 (R1 (R1 U1)))

Notice that if you arranged these as a tree, toEnum `map` [1..] is going to be the left to right traversal of the leaves. With that intuition, we'll start by defining a GLeaves class which counts the number of leaves that a generic type (not a value!) has in its tree.

{-# LANGUAGE ScopedTypeVariables, PolyKinds, TypeApplications, TypeOperators,
             DefaultSignatures, FlexibleContexts, TypeFamilies #-}

import GHC.Generics
import Data.Proxy

class GLeaves f where
  -- | Counts the number of "leaves" (i.e. U1's) in the type `f`
  gSize :: Proxy f -> Int

instance GLeaves U1 where
  gSize _ = 1

instance GLeaves x => GLeaves (M1 i t x) where
  gSize _ = gSize (Proxy :: Proxy x)

instance (GLeaves x, GLeaves y) => GLeaves (x :+: y) where
  gSize _ = gSize (Proxy :: Proxy x) + gSize (Proxy :: Proxy y)

Now, we are in shape to define GEnum. As is usual with this setup, we define our class Enum' and have default signatures that rely on GEnum.

class Enum' a where
  toEnum' :: Int -> a
  fromEnum' :: a -> Int

  default toEnum' :: (Generic a, GEnum (Rep a)) => Int -> a
  toEnum' = to . gToEnum

  default fromEnum' :: (Generic a, GEnum (Rep a)) => a -> Int
  fromEnum' = gFromEnum . from

class GEnum f where
  gFromEnum :: f p -> Int
  gToEnum :: Int -> f p

Finally, we get to the good stuff. For U1 and M1, gFromEnum and gToEnum are both straightforward. For :+:, gFromEnum needs to find all of the leaves to the left of it, so if it is the right subtree we add the size of the left subtree (and if it is the left subtree we add nothing). Similarly, gToEnum, checks whether it belong in the left or right subtree by checking if it is smaller than the number of leaves in the left subtree.

instance GEnum U1 where
  gFromEnum U1 = 0

  gToEnum n = if n == 0 then U1 else error "Outside enumeration range"

instance GEnum f => GEnum (M1 i t f) where
  gFromEnum (M1 x) = gFromEnum x

  gToEnum n = M1 (gToEnum n)

instance (GLeaves x, GEnum x, GEnum y) => GEnum (x :+: y) where
  gFromEnum (L1 x) = gFromEnum x
  gFromEnum (R1 y) = gSize (Proxy :: Proxy x) + gFromEnum y

  gToEnum n = let s = gSize (Proxy :: Proxy x)
              in if n < s then L1 (gToEnum n) else R1 (gToEnum (n - s))

Finally, you can test this in GHCi:

ghci> :set -XDeriveAnyClass -XDeriveGeneric
ghci> data D = A | B | C | D | E deriving (Show, Generic, Enum, Enum')
ghci> toEnum `map` [0 .. 4] :: [D]
[A,B,C,D,E]
ghci> toEnum' `map` [0 .. 4] :: [D]
[A,B,C,D,E]
ghci> fromEnum `map` [A .. E] :: [Int]
[A,B,C,D,E]
ghci> fromEnum' `map` [A .. E] :: [Int]
[A,B,C,D,E]

Performance

You may be thinking to yourself: this is super inefficient! We end up recalculating a bunch of sizes over and over - the worst case performance is at least O(n^2). The catch is that (hopefully), GHC will be able to optimize/inline the hell out of our specific Enum' instances until there is nothing left of the initial Generic structure.

like image 96
Alec Avatar answered Sep 28 '22 04:09

Alec


Enum is one of many examples that are slightly awkward to write using the standard GHC Generics representation, because a lot of the structure of datatypes is left implicit (e.g. how sum and product constructors are nested, and where metadata occurs).

With generics-sop, you can (re-)define generic Enum instances in a slightly more straight-forward way:

{-# LANGUAGE ConstraintKinds, DataKinds, DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts, GADTs, PolyKinds #-}
{-# LANGUAGE TypeApplications, TypeOperators #-}

import Generics.SOP
import qualified GHC.Generics as GHC

We define a type synonym that captures what it means to be an enumeration type:

type IsEnumType a = All ((~) '[]) (Code a)

(Unfortunately, the (~) '[] construction triggers a bug in GHC 8.0.1, but it works fine in GHC 8.0.2.) In generics-sop, the code of a datatype is a type-level list of lists. The outer list contains an element for each constructor, the inner lists contain the types of the constructor arguments. The IsEnumType constraint says that all of the inner lists have to be empty, which means that none of the constructors must have any arguments.

gfromEnum :: (Generic a, IsEnumType a) => a -> Int
gfromEnum = conIndex . unSOP . from
  where
    conIndex :: NS f xs -> Int
    conIndex (Z _) = 0
    conIndex (S i) = 1 + conIndex i

The function from turns a value into a sum-of-products representation, and unSOP strips the outer constructor. We then have a sum structure to traverse. The NS datatype representing n-ary sums has constructors Z and S that indicate exactly which constructor is being used, so we can simply traverse and count.

gtoEnum :: (Generic a, IsEnumType a) => Int -> a
gtoEnum i =
  to (SOP (apInjs_NP (hcpure (Proxy @ ((~) '[])) Nil) !! i))

Here, the apInjs_NP (hcpure ...) call produces the representations of empty constructor applications for all constructors of the datatype. Unlike the gfromEnum function, this actually makes use of the IsEnumType constraint to be type correct (because we rely on the fact that none of the constructors take any arguments). We then selected the i-th constructor out of the list and turn it back from the generic representation to the actual type by applying first SOP and then to.

To apply it to your sample type, you have to instantiate it to both GHC's and generic-sop's Generic classes (or you can use TH for this, too):

data Foo = Foo | Bar | Baz | Quux
  deriving (Show, Eq, GHC.Generic)

instance Generic Foo

Then you can test it:

GHCi> gfromEnum Baz
2
GHCi> gtoEnum 2 :: Foo
Baz

If you want, you can make gfromEnum and gtoEnum the default definitions for an Enum-like class, just as with GHC Generics.

like image 32
kosmikus Avatar answered Sep 28 '22 04:09

kosmikus