What does GADT offer that cannot be done with OOP and generics?

Question

Are GADTs in functional languages equivalent to traditional OOP + generics, or there is a scenario where there are correctness constrants easily enforced by GADT but hard or impossible to achieve using Java or C#?

For example, this "well-typed interpreter" Haskell program:

data Expr a where
  N :: Int -> Expr Int
  Suc :: Expr Int -> Expr Int
  IsZero :: Expr Int -> Expr Bool
  Or :: Expr Bool -> Expr Bool -> Expr Bool

eval :: Expr a -> a
eval (N n) = n
eval (Suc e) = 1 + eval e
eval (IsZero e) = 0 == eval e
eval (Or a b) = eval a || eval b

can be written equivalently in Java using generics and appropriate implementation of each subclass, though much more verbose:

interface Expr<T> {
    public <T> T eval();
}

class N extends Expr<Integer> {
    private Integer n;

    public N(Integer m) {
        n = m;
    }

    @Override public Integer eval() {
        return n;
    }
}

class Suc extends Expr<Integer> {
    private Expr<Integer> prev;

    public Suc(Expr<Integer> aprev) {
        prev = aprev;
    }

    @Override public Integer eval() {
        return 1 + prev.eval()
    }
}


/** And so on ... */

Andreas Rossberg · Accepted Answer

Generics do not provide type equality constraints. Without them you need to rely on downcasts, i.e., lose type safety. Moreover, certain dispatch – in particular, the visitor pattern – cannot be implemented safely and with the proper interface for generics resembling GADTs. See this paper, investigating the very question:

Generalized Algebraic Data Types and Object-Oriented Programming
Andrew Kennedy, Claudio Russo. OOPSLA 2005.

chi · Answer

OOP classes are open, GADTs are closed (like plain ADTs).

Here, "open" means you can always add more subclasses later, hence the compiler can not assume to have access to all the subclasses of a given class. (There are a few exceptions, e.g. Java's final which however prevents any subclassing, and Scala's sealed classes). Instead, ADTs are "closed" in the sense you can not add further constructors later on, and the compiler knows that (and can exploit it to check e.g. exhaustiveness). For more information, see the "expression problem".

Consider the following code:

data A a where
  A1 :: Char -> A Char
  A2 :: Int  -> A Int

data B b where
  B1 :: Char   -> B Char
  B2 :: String -> B String

foo :: A t -> B t -> Char
foo (A1 x) (B1 y) = max x y

The above code relies on Char being the only type t for which one can produce both A t and B t. GADTs, being closed, can ensure that. If we tried to mimick this using OOP classes we fail:

class A1 extends A<Char>   ...
class A2 extends A<Int>    ...
class B1 extends B<Char>   ...
class B2 extends B<String> ...

<T> Char foo(A<T> a, B<T> b) {
      // ??
}

Here I think we can not implement the same thing unless resorting to unsafe type operations like type casts. (Moreover, these in Java don't even consider the parameter T because of type erasure.) We might think of adding some generic method to A or B to allow this, but this would force us to implement said method for Int and/or String as well.

In this specific case, one might simply resort to a non generic function:

Char foo(A<Char> a, B<Char> b) // ...

or, equivalently, to adding a non generic method to those classes. However, the types shared between A and B might be a larger set than the singleton Char. Worse, classes are open, so the set can get larger as soon as one adds a new subclass.

Also, even if you have a variable of type A<Char> you still do not know if that's a A1 or not, and because of that you can not access A1's fields except by using a type cast. The type cast here would be safe only because the programmer knows there's no other subclass of A<Char>. In the general case, this might be false, e.g.

data A a where
  A1 :: Char -> A Char
  A2 :: t -> t -> A t

Here A<Char> must be a superclass of both A1 and A2<Char>.

@gsg asks in a comment about equality witnesses. Consider

data Teq a b where
   Teq :: Teq t t

foo :: Teq a b -> a -> b
foo Teq x = x

trans :: Teq a b -> Teq b c -> Teq a c
trans Teq Teq = Teq

This can be translated as

interface Teq<A,B> {
  public B foo(A x);
  public <C> Teq<A,C> trans(Teq<B,C> x);
}
class Teq1<A> implements Teq<A,A> {
  public A foo(A x) { return x; }
  public <C> Teq<A,C> trans(Teq<A,C> x) { return x; }
}

The code above declares an interface for all the type pairs A,B, which is then implemented only in the case A=B (implements Teq<A,A>) by the class Teq1. The interface requires a conversion function foo from A to B, and a "transitivity proof" trans, which given this of type Teq<A,B> and an x of type Teq<B,C> can produce an object Teq<A,C>. This is the Java analogous of the Haskell code using GADTs right above.

The class can not be safely implemented when A/=B, as far as I can see: it would require either returning nulls or cheating with non termination.

What does GADT offer that cannot be done with OOP and generics?

Tags:

haskell

generics

ocaml

type-theory

gadt

Alex

2 Answers

Andreas Rossberg

chi

Recent Activity

Donate For Us

What does GADT offer that cannot be done with OOP and generics?

Tags:

haskell

generics

ocaml

type-theory

gadt

Alex

2 Answers

Andreas Rossberg

chi

Related questions

Recent Activity

Donate For Us