Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generics.SOP equivalent of everywhere/mkT (replacing products)

Tags:

haskell

Are there examples of generics-sop that mimic SYB's everywhere/mkT behavior?

What I'm attempting to do, but not seeing how to do it successfully, is replace the everywhere (mkT fixupSymbol) in main with an equivalent Generics.SOP construction, i.e., use Generics.SOP to recurse into the product (I (AbsAddr value)) and replace it with (I (SymAddr label)).

I could pass the symbol table to gformatOperands, polluting the formatOperands signature. That seems suboptimal.

Without fixupSymbol, the output would look like:

LD   B, 0x0000
LD   C, 0x1234
CALL 0x4567

Resolving addresses to symbolic labels:

gensop % stack ghci
Using main module: 1. Package `gensop' component exe:gensop with main-is file: <...>/Main.hs
gensop-0.1: configure (exe)
Configuring gensop-0.1...
gensop-0.1: initial-build-steps (exe)
Configuring GHCi with the following packages: gensop
GHCi, version 8.6.3: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Main             ( <...>/Main.hs, interpreted )
*Main> main
LD   B, 0x0000
LD   C, label1
CALL label2
*Main>

Cut down version of code:

{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TemplateHaskell      #-}
{-# LANGUAGE DeriveDataTypeable #-}

module Main where

import Data.Data
import Data.Foldable (foldl)
import Data.Word (Word8, Word16)
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Text.Printf
import Generics.SOP
import Generics.SOP.TH (deriveGeneric)
import Data.Generics.Aliases (mkT)
import Data.Generics.Schemes (everywhere)
import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as H
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq


type Z80addr = Word16
type Z80word = Word8

class Z80operand x where
  formatOperand :: x -> Text

main :: IO()
main = mapM_ T.putStrLn (foldl printIns Seq.empty $ everywhere (mkT fixupSymbol) insnSeq)
-- -------------------------------------------------^ Does this have a Generics.SOP equivalent?
  where
    printIns accum ins = accum |> T.concat ([mnemonic, gFormatOperands] <*> [ins])

    mnemonic (LD _)   = "LD   "
    mnemonic (CALL _) = "CALL "

    -- Generics.SOP: Fairly straightforward
    gFormatOperands {-elt-} =
      T.intercalate ", " . hcollapse . hcmap disOperandProxy (mapIK formatOperand) . from {-elt-}
      where
        disOperandProxy = Proxy :: Proxy Z80operand

    -- Translate an absolute address, generally hidden inside an instruction operand, into a symbolic address
    -- if present in the symbol table.
    fixupSymbol addr@(AbsAddr absAddr) = maybe addr SymAddr (absAddr `H.lookup` symtab)
    fixupSymbol other                  = other

    insnSeq :: Seq Z80instruction
    insnSeq = Seq.singleton (LD (Reg8Imm B 0x0))
              |> (LD (Reg8Indirect C (AbsAddr 0x1234)))
              |> (CALL (AbsAddr 0x4567))

    symtab :: HashMap Z80addr Text
    symtab = H.fromList [ (0x1234, "label1"), (0x4567, "label2")]

-- | Symbolic and absolute addresses. Absolute addresses can be translated into symbolic
-- labels.
data SymAbsAddr  = AbsAddr Z80addr | SymAddr Text
  deriving (Eq, Ord, Typeable, Data)

data Z80reg8 = A | B | C
  deriving (Eq, Ord, Typeable, Data)

-- | Cut down version of the Z80 instruction set
data Z80instruction = LD OperLD | CALL SymAbsAddr
  deriving (Eq, Ord, Typeable, Data)

-- | Load operands
data OperLD = Reg8Imm Z80reg8 Z80word | Reg8Indirect Z80reg8 SymAbsAddr
  deriving (Eq, Ord, Typeable, Data)

$(deriveGeneric ''SymAbsAddr)
$(deriveGeneric ''Z80reg8)
$(deriveGeneric ''Z80instruction)
$(deriveGeneric ''OperLD)

instance Z80operand Z80word where
  formatOperand word = T.pack $ printf "0x%04x" word

instance Z80operand SymAbsAddr where
  formatOperand (AbsAddr addr)  = T.pack $ printf "0x04x" addr
  formatOperand (SymAddr label) = label

instance Z80operand Z80reg8 where
  formatOperand A = "A"
  formatOperand B = "B"
  formatOperand C = "C"

instance Z80operand OperLD where
  formatOperand (Reg8Imm reg imm) = T.concat [formatOperand reg, ", ", formatOperand imm]
  formatOperand (Reg8Indirect reg addr) = T.concat [formatOperand reg, ", ", formatOperand addr]

The gensop.cabal file:

cabal-version:  >= 1.12
name:           gensop
version:        0.1
build-type:     Simple
author:         scooter-me-fecit
description:    No description.
license:        GPL-3

executable gensop
  default-language:     Haskell2010
  main-is: Main.hs
  build-depends:
    base,
    containers,
    bytestring,
    generics-sop,
    syb,
    text,
    unordered-containers

  default-extensions:
    OverloadedStrings,
    FlexibleInstances

  ghc-options: -Wall
like image 248
scooter me fecit Avatar asked Feb 26 '19 01:02

scooter me fecit


1 Answers

generics-sop provides no equivalents for recursive traversal schemes, such as these functions. If you need to deal with recursion in this library, the possible solution is to implement them. Although, defining such functions in SOP is related to some difficulties because it has a core generic view on data that doesn't distinguish recursive nodes from leaves. Recursion in this setting can be managed using closed type families (CTF) and some type class machinery. Closed type families allow you:

  1. to implement the type-safe cast, which is needed for defining mkT,
  2. to resolve the cases of recursive and non-recursive nodes—different instances of a type class—which otherwise overlap. (Another option is using pragmas for overlapping instances, a recent GHC feature; there is, however, some bias about overlapping instances in the Haskell community, so this solution is often considered as undesired.)

Using CTF for treating recursion has been described in an unpublished paper “Handling Recursion in Generic Programming Using Closed Type Families”, which uses the generics-sop library as a case study; it provides examples of defining recursive schemes in SOP.

SYB's everywhere supports families of mutually recursive datatypes. The following implementation allows to specify them as type-level lists.

{-# LANGUAGE DeriveGeneric, TypeFamilies, DataKinds,
             TypeApplications, ScopedTypeVariables, MultiParamTypeClasses,
             ConstraintKinds, FlexibleContexts, AllowAmbiguousTypes,
             FlexibleInstances, UndecidableInstances,
             UndecidableSuperClasses, TypeOperators, RankNTypes #-}

import Generics.SOP
import Generics.SOP.NS

import GHC.Exts (Constraint)
import Data.Type.Equality

type family Equal a x :: Bool where
  Equal a a = 'True
  Equal _ _ = 'False

class DecideEq (eq :: Bool) (a :: *) (b :: *) where
  decideEq :: Maybe (b :~: a)
instance a ~ b => DecideEq True a b where
  decideEq = Just Refl
instance DecideEq False a b where
  decideEq = Nothing

type ProofCast a b = DecideEq (Equal a b) a b

castEq :: forall a b. ProofCast a b => b -> Maybe a
castEq t = (\d -> castWith d t) <$> decideEq @(Equal a b)

type Transform a b = (Generic a, Generic b, ProofCast a b, ProofCast b a)

mkT :: Transform a b => (a -> a) -> b -> b
mkT f x = maybe x id $ castEq =<< f <$> castEq x

type family In (a :: *) (fam :: [*]) :: Bool where
    In a   ([a] ': fam) = 'True
    In [a] (a   ': fam) = 'True
    In a   (a   ': fam) = 'True
    In a   (_   ': fam) = In a fam
    In _   '[]          = 'False

class CaseEverywhere' (inFam :: Bool) (c :: * -> Constraint)
                      (fam :: [*]) (x :: *) (y :: *) where
  caseEverywhere' :: (forall b . c b => b -> b) -> I x -> I y

instance c x => CaseEverywhere' 'False c fam x x where
  caseEverywhere' f = I . f . unI
instance (c x, Everywhere x c fam) => CaseEverywhere' 'True c fam x x where
  caseEverywhere' f = I . f . everywhere @fam @c f . unI

class    CaseEverywhere' (In x fam) c fam x y => CaseEverywhere c fam x y
instance CaseEverywhere' (In x fam) c fam x y => CaseEverywhere c fam x y

caseEverywhere :: forall c fam x y . CaseEverywhere c fam x y
               => (forall b . c b => b -> b) -> I x -> I y
caseEverywhere = caseEverywhere' @(In x fam) @c @fam

type Everywhere a c fam =
  (Generic a, AllZip2 (CaseEverywhere c fam) (Code a) (Code a))

everywhere :: forall fam c a . Everywhere a c fam
           => (forall b . c b => b -> b) -> a -> a
everywhere f = to . everywhere_SOP . from
  where
    everywhere_SOP = trans_SOP (Proxy @(CaseEverywhere c fam)) $
                               caseEverywhere @c @fam f

Usage  First, this can be examined with a small-scale example taken from the SYB paper. The implemented SOP-based everywhere, as compared to SYB's one, additionally takes two type arguments, passed through the explicit type application. The first one specifies a family of mutually recursive datatypes as a type list. The traversal will treat as recursive only those nodes whose types are specified in that list. The second argument is needed to provide a compiler with a ‘proof’ object for type-cast. The T synonym for the Transform constraint serves to allow its partial application.

data Company = C [Dept]
data Dept = D Name Manager [SubUnit]
data SubUnit = PU Employee | DU Dept
data Employee = E Person Salary
data Person = P Name Address
data Salary = S Float
type Manager = Employee
type Name = String
type Address = String

class    Transform a b => T a b
instance Transform a b => T a b

type CompanyF = '[Company, Dept, SubUnit, Employee]

increase :: Float -> Company -> Company
increase k = everywhere @CompanyF @(T Salary) (mkT (incS k))

incS :: Float -> Salary -> Salary
incS k (Sal s) = Sal (s * (1 + k))

The defined everywhere / mkT functions are ready for using in your code, but it misses some Generic instances. To apply everywhere to insnSeq, you need a Generic (Seq Z80instruction) instance. Yet you can't obtain it, because the Data.Sequence module doesn't export the internal representation of it. A possible fix is applying fmap to the sequence. So now you can write:

{-# LANGUAGE TypeApplications #-}

...

type Z80 = '[SymAbsAddr, Z80reg8, Z80instruction, OperLD]

main :: IO()
main = mapM_ T.putStrLn (foldl printIns Seq.empty $
  fmap (everywhere @Z80 @(T SymAbsAddr) (mkT fixupSymbol)) insnSeq)

You should provide the Generic instances for all types of nodes that this traverses, recursive and non-recursive. So next, this demands the Generic instances for Word8, Word16, and Text. While the Generic Text instance can be generated via deriveGeneric, the others can't, because of their special GHC representation. So you'll have to do it manually; this definition is straightforward:

$(deriveGeneric ''Text)

instance Generic Word8 where
  type Code Word8 = '[ '[Word8]]
  from x                        = SOP (Z (I x :* Nil))
  to   (SOP ((Z (I x :* Nil)))) = x

instance Generic Word16 where
  type Code Word16 = '[ '[Word16]]
  from x                        = SOP (Z (I x :* Nil))
  to   (SOP ((Z (I x :* Nil)))) = x

This code is boilerplate, but the newest GHC extension DerivingVia could nicely simplify this, reducing the second definition. Hopefully, this useful feature will be improved with possibilities for standalone deriving, so it will be possible to say instead:

deriving via Word8 instance Generic Word16

The whole code now works well, and main yields the expected result.

like image 131
Maryann Avatar answered Oct 12 '22 08:10

Maryann