Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ADT-like polymorphism in Java (without altering class)

In Haskell I can define following data type:

data Tree = Empty
      | Leaf Int
      | Node Tree Tree

and then write polymorphic function like this:

depth :: Tree -> Int
depth Empty = 0
depth (Leaf n) = 1
depth (Node l r) = 1 + max (depth l) (depth r)

In Java I can emulate algebraic data types with interfaces:

interface Tree {}
class Empty implements Tree {}
class Leaf implements Tree { int n; }
class Node implements Tree { Tree l; Tree r; }

But if I try to use Haskell-like polymorphism, I get an error:

int depth(Empty node) {
    return 0;
}
int depth(Leaf node) {
    return 1;
}
int depth(Node node) {
    return 1 + Math.max(depth(node.l), depth(node.r));   // ERROR: Cannot resolve method 'depth(Tree)'
}

Correct way to overcome this is to put method depth() to each class. But what if I don't want to put it there? For example, method depth() may be not directly related to Tree and adding it to class would break business logic. Or, even worse, Tree may be written in 3rd party library that I don't have access to. In this case, what is the simplest way to implement ADT-like polymorpism?

Just in case, for the moment I'm using following syntax, which is obviously ill-favored:

int depth(Tree tree) {
    if (tree instanceof Empty) depth((Empty)tree)
    if (tree instanceof Leaf) depth((Leaf)tree);
    if (tree instanceof Node) depth((Node)tree); 
    else throw new RuntimeException("Don't know how to find depth of " + tree.getClass());
}
like image 549
ffriend Avatar asked Oct 05 '12 20:10

ffriend


4 Answers

Try something like this.

Sorry, my Java is very rusty. If, unlike me, you can remember the syntax, you could use Java generics to refine Object to Integer or whatever class the method you're writing needs. But you can't (can you?) return primitive types, sorry.

interface TreeFolder {
    Object onEmpty();
    Object onLeaf (int n);
    Object onNode (Tree l, Tree r);
}

interface Tree {
    Object fold (TreeFolder f);
}

class Empty implements Tree {
    Object fold (TreeFolder f) {
        return f.onEmpty();
    }
}

class Leaf implements Tree {
    private int n;
    Object fold (TreeFolder f) {
        return f.onLeaf (n);
    }
}

class Node implements Tree {
    private Tree l, r;
    Object fold (TreeFolder f) {
        return f.onNode (l, r);
    }
}

// meanwhile, in a class in another package far far away...
Object depth (Tree tree) {
    return tree.fold (new TreeFolder() {
        Object onEmpty() { return new Integer(0); }
        Object onLeaf (int n) { return new Integer(n); }
        Object onNode (Tree l, Tree r) {
            Integer ll = (Integer) l.fold (this);
            Integer rr = (Integer) r.fold (this);
            return new Integer (ll.intValue() + rr.intValue());
        }
    });
}

Note that in depth() I have to manually recurse (call fold()) on the Tree parameters. You could instead choose to recurse on them upfront in Node.fold() (and change TreeFolder accordingly), but then you have to recurse --- you can't choose to recurse only into the left subtree, should you wish to. (In Haskell we don't have to make that trade-off thanks to laziness.)

like image 122
dave4420 Avatar answered Nov 09 '22 05:11

dave4420


Here's a rough sketch of one way you could approach this, in a general and extensible way. It won't work directly in all cases, but might help you get started.

First, some starting assumptions:

  1. We don't want anything specific to depth added to the Tree classes.
  2. We don't want to lose the benefits of static types.

The key point is to realize that the Haskell code you want to recreate here is not the Tree type itself, but rather the pattern match on it. As such, we'll start by making "pattern matching on a tree" a first class (ha, ha) entity in its own right. Using C#-ish pseudocode, because I haven't used Java in years:

interface MatchTree<R> 
{
    R matchEmpty(Empty empty);
    R matchLeaf(Leaf leaf);
    R matchNode(Node node);
}

To use this reified pattern match, we need an appropriate method on Tree:

interface Tree
{
    R patternMatch<R>(MatchTree<R> patterns);
}

Each individual Tree subtype can then implement the function by calling the appropriate MatchTree method with itself as an argument.

The equivalent Haskell would be something like this:

data MatchTree r = MatchTree { matchEmpty :: r
                             , matchLeaf :: Int -> r
                             , matchNode :: Tree -> Tree -> r
                             }

...which can be easily seen to correspond directly with a case expression:

match tree z fl fn = case tree of
                       Empty -> z
                       Leaf x -> fl i
                       Node lt rt -> fn lt rt

This style of reified pattern match is known in OOP circles as the "visitor pattern", incidentally.

like image 27
C. A. McCann Avatar answered Nov 09 '22 05:11

C. A. McCann


For example, method depth() may be not directly related to Tree and adding it to class would break business logic. Or, even worse, Tree may be written in 3rd party library that I don't have access to. In this case, what is the simplest way to implement ADT-like polymorpism?

In this case - I'd suggest you to use design pattern Visitor. It allows you to separate representation of data and logic of processing, even more - it allows you to implement different processing strategies.

like image 32
stemm Avatar answered Nov 09 '22 05:11

stemm


edit: This is the answer you didn't want (put depth() in the Tree interface), but I think it deserves a full analysis anyway.


More broadly, this is the issue of implementing sum types using classes. There is a pretty common way to have sum types in object oriented languages. Namely, the interpreter pattern.

interface Tree { int depth(); }

class Empty implements Tree { int depth(){ return 0; }

class Leaf implements Tree {
  int n;
  int depth(){ return 1; }
}

class Node implements Tree {
  Tree l; Tree r;
  int depth(){ return max(depth(l), depth(r)); }
}

Let's compare this to the haskell approach! It's quite clear that the author of the classes can have arbitrarily many types (Empty, Leaf, Node) and methods (depth(), numLeafs()). However, what about a external code that wants to extend this tree library?

Using algebraic data types in haskell, an external code base can add tree functions of type :: Tree -> a if the library exposes Tree(..) (The type itself and all three constructors). However, one cannot add a new constructor to Tree, like this:

-- Code far far away can't do this in haskell
data Tree = ...
          | ...
          | Node3 Tree Tree Tree

But in java when using the interpreter pattern, it is the opposite. One cannot add a new method to the Tree interface, but one can just add a new constructor like this:

-- Code far far away *can* do this in java
class Node3 implements Tree {
  Tree l; Tree mid; Tree r;
  int depth(){ ... }
}

In conclusion, this design pattern works great if:

  • Others want to add terms to the algebraic data structure.
  • You desire full type safety

yet it is somewhat unsatisfactory because:

  • Others can't add reducer functions like numNodes()
  • It feels contrived that a method on Trees have to be put once in every class. We would prefer pattern matching a tree once per method, instead we are doing it in sort of a transposed way.
like image 5
Tarrasch Avatar answered Nov 09 '22 05:11

Tarrasch