I'm looking for a proper name for a data type that combines ADT with explicit subtyping.
In one of my applications, I use a structure similar to ADT to represent parse trees, on which I perform recursive pattern matching. I find it rather convenient if I could combine ADT with subtyping, as demonstrated in the example below:
Note: the example is written in Haskell's syntax, but this is not Haskell code.
data Empty = Empty
data Expr = Int Int | Add Expr AddOp Expr
data OptionalExpr =
| Empty // I want to make Empty a subtype of OptionalExpr
| Expr // I want to make Expr a subtype of OptionalExpr
In the example above, I first define 2 types: Empty and Expr. Then I make these 2 types the subtype of OptionalExpr. I realize this kind of data type is uncommon. Apparently neither Haskell nor OCaml support it. But I don't know about other functional languages.
I am looking for something that combines ADT with explicit subtyping, not structurally-implied subtyping as in polymorphic variant. There are a few justifications for this requirement:
data OptionalExpr = EmptyExpr | NonEmptyExpr Expr
, or we could have used Maybe
, but in my application this is unacceptable, because the level of embedding can be quite deep, and it would be a nightmare to deconstruct an deeply embedded value like (L1 (L2 (L3 (L4 (L5 value_wanted)))))
.To give you some idea why such requirements exist, I show a more specific example below:
PrimaryExpr = ID | LeftParen Expr RightParen
UnaryExpr = PrimaryExpr | NegateOp PrimaryExpr // -
MultExpr = UnaryExpr | MultExpr MultOp UnaryExpr // *
AddExpr = MultExpr | AddExpr AddOp MultExpr // +
CompExpr = AddExpr | AddExpr CompOp AddExpr
Expr = CompExpr
The above example expresses a subtype hierarchy, and expresses ideas such as AddExpr is a CompExpr, but a CompExpr is not an AddExpr. For this specific example, some people have suggested to me that I can replace UnaryExpr, MultExpr, AddExpr and so on with just Expr. That is, I can define all the types as a single type. That loses type constraints such as CompExpr is not AddExpr, and because I'm doing recursive pattern matching on these types, I need that constraints of this hierarchy to be statically enforced.
Is there a name for this kind of data type I'm looking for in the literature? Or am I looking for something that doesn't even make sense? If you think this is the case, why am I looking for something nonsensical? Thanks for any pointers.
EDIT: even though I've written the above code snippets in Haskell's syntax, I am not writing my application in Haskell. I'm using my own language and my own data types, so I am not limited by Haskell's semantics. I am looking for a pointer to similar concepts in literature, so that when I write a report for my project I don't appear to be reinventing something new. I tried all the google keywords I can think of and nothing right was returned so I'm asking here.
In a comment, you say:
I'm not sure how to encode a subtype hierarchy using GADTs. If you think it is doable, would you mind providing an answer with an example as to how the type hierarchy given in my example may be encoded?
Therefore I give an answer to this question here. The key idea is to give a type-level function (in the host language, here Haskell) for computing the subtyping relation (of the target language's type system, here your custom EDSL). For simplicity, I will spell out the subtyping relation in full, but standard type-level programming can be used to reduce the repetition and raise the abstraction level as appropriate. First, the extensions needed:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
Now the definition of the subtyping relation:
data Level = Primary | Unary | Mult | Add | Comp
type family Subtype a b where
Subtype Primary a = True
Subtype Unary Primary = False
Subtype Unary a = True
Subtype Mult Primary = False
Subtype Mult Unary = False
Subtype Mult a = True
Subtype Add Add = True
Subtype Add Comp = True
Subtype Add a = False
Subtype Comp Comp = True
Subtype Comp a = False
A closed type family is used to guarantee that the subtyping relation cannot be expanded by clients (your second property). Finally, the GADT for target language terms can use the subtyping relation as a constraint on its constructors.
data Expr a where
ID :: Subtype Primary a ~ True => Expr a
Paren :: Subtype Primary a ~ True => Expr b -> Expr a
Negate :: Subtype Unary a ~ True => Expr Unary -> Expr a
Times :: Subtype Add a ~ True => Expr Mult -> Expr Mult -> Expr a
Plus :: Subtype Add a ~ True => Expr Add -> Expr Add -> Expr a
Compose :: Subtype Comp a ~ True => Expr Comp -> Expr Comp -> Expr a
Note that because the argument to Paren
is polymorphic, you will need a type annotation on the contained term to express which "level" of the subtyping hierarchy you want that term to be treated as. I would expect you would need to do this in whatever language you are designing as well. In ghci, we can ask for the type of a sample term:
:t Compose (Times ID ID) (Negate (Paren (Plus ID ID :: Expr Add)))
Compose (Times ID ID) (Negate (Paren (Plus ID ID :: Expr Add)))
:: (Subtype 'Comp a ~ 'True) => Expr a
This is more or less the type you would expect for this term, I think. You can also see that the expression hierarchy is strictly enforced, though I dare say the error message is not 100% clear (since it is written in host language terms and not target language terms):
:t Negate (Plus ID ID)
<interactive>:1:9:
Couldn't match type ‘'False’ with ‘'True’
Expected type: 'True
Actual type: Subtype 'Add 'Unary
In the first argument of ‘Negate’, namely ‘(Plus ID ID)’
In the expression: Negate (Plus ID ID)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With