I want to implement a generic hierarchy for tree structures, which can later be used in an implementation-independent way to describe generic algorithms over trees.
I started with this hierarchy:
interface BinaryTree<Node> {
Node left(Node);
bool hasLeft(Node);
Node right(Node);
bool hasRight(Node);
}
interface BinaryTreeWithRoot<Node> : BinaryTree<Node> {
Node root();
}
interface BinaryTreeWithParent<Node> : BinaryTree<Node> {
Node parent(Node);
bool hasParent(Node);
}
Now, basically I want to be able to implement the concept of a subtree in a universal way: For each class T : BinaryTree, I want a 'class' Subtree(T) which provides the same functionality of T (so it must derive from it), and also rewrites the root() functionality.
Something like this:
class Subtree<T, Node> : T, BinaryTreeWithRoot<Node>
where T : BinaryTree<Node>
{
T reference;
Node root;
void setRoot(Node root) {
this.root = root;
}
override Node BinaryTreeWithRoot<Node>::root() {
return this.root;
}
// Now, inherit all the functionality of T, so an instance of this class can be used anywhere where T can.
forall method(arguments) return reference.method(arguments);
}
Now with this code I'm not sure how to create an object of type subtree, since the tree object should in some way be injected.
One approach is to create a subtree class for each tree class i create, but this means code duplication, and, after all, its the same thing.
So, one approach to this is mixins, which allow a generic class to derive from its template parameter.
I'm also interested how such a hierarchy can be implemented in Haskell, since Haskell has a great type system and I think it will be easier to inject such functionality.
For example in Haskell it may be something like:
class BinaryTree tree node where
left :: tree -> node -> node
right :: tree -> node -> node
class BinaryTreeWithRoot node where
left :: tree -> node -> node
right :: tree -> node -> node -- but this is a duplication of the code of BinaryTree
root :: tree -> node
instance BinaryTree (BinaryTreeWithRoot node) where
left = left
right = right
data (BinaryTree tree node) => Subtree tree node =
...
instance BinaryTreeWithRoot (Subtree tree node) where ...
I'm interested if and how this can be done in within an oop language (c++,c#,d,java), as c++ and d provide mixins out of the box (and i'm not sure for d), and out of curiosity with the Haskell type system.
Since D has "real" templates, not generics, making a template class inherit from its template parameter is trivial:
class A {}
class B(T) : T {
static assert(is(B!T : T)); // Passes.
}
As far as making Subtree
work in D, something like this should do it, assuming you also have a template class Node
:
class Subtree(T) : T, BinaryTreeWithRoot!(Node!(T))
{
T reference;
Node root;
void setRoot(Node root) {
this.root = root;
}
override Node root() {
return this.root;
}
}
However, IIUC (correct me if I'm wrong), T
is the payload of the tree and could therefore be a primitive. If that's the case, you would be better off getting your ability to use a Subtree!(T)
as a T
via alias this
, which allows for subtyping without inheritance and works with primitives:
class Subtree(T) : BinaryTreeWithRoot!(Node!(T))
{
T reference;
alias reference this; // Make this implicitly convertible to reference.
Node root;
void setRoot(Node root) {
this.root = root;
}
override Node root() {
return this.root;
}
}
Creating a tree interface like this in Haskell is ... unusual. Both Node
and Subtree
are superfluous. This is partially due to algebraic types, and partially because Haskell data is immutable so different techniques are required to accomplish certain things (like setting the root node). It is possible to do it, the interface would look something like:
class BinaryTree tree where
left :: tree a -> Maybe (tree a)
right :: tree a -> Maybe (tree a)
-- BinaryTreeWithRoot inherits the BinaryTree interface
class BinaryTree tree => BinaryTreeWithRoot tree where
root :: tree a -> tree a
Then, with a pretty standard binary tree definition:
data Tree a =
Leaf
| Branch a (Tree a) (Tree a)
instance BinaryTree Tree where
left Leaf = Nothing
left (Branch _ l r) = Just l
right Leaf = Nothing
right (Branch _ l r) = Just r
data TreeWithRoot a =
LeafR (TreeWithRoot a)
| BranchR a (TreeWithRoot a) (TreeWithRoot a) (TreeWithRoot a)
instance BinaryTree TreeWithRoot where
-- BinaryTree definitions omitted
instance BinaryTreeWithRoot TreeWithRoot where
root (LeafR rt) = rt
root (BranchR _ rt l r) = rt
Since this interface returns a Maybe (tree a)
, you can also use left
and right
to check if the branches exist instead of using separate methods.
There's nothing particularly wrong with it, but I don't believe I've ever seen anyone actually implement this approach. The more usual techniques are either to define traversals in terms of Foldable
and Traversable
or create a zipper. Zippers are simple to derive manually, but there are several generic zipper implementations, such as zipper, pez, and syz.
In C# 4 I would use dynamics for achieving this goal. For example you could try defining the SubtTree class as:
public class Subtree<T, Node> : DynamicObject, BinaryTreeWithRoot<Node> where T : BinaryTree<Node>
{
private readonly T tree;
public Subtree(T tree)
{
this.tree = tree;
}
}
and override appropriate methods of DynamicObject using methods/properties of tree. More information (and sample code) can be found in this great blog post about Using C# 4.0 dynamic to drastically simplify your private reflection code.
It is worth mentioning that due to usage of dynamic capabilities and reflection, small performance overhead will be introduced as well as reduced safety (as it may involve violation of encapsulation).
I think the approach through "BinaryTree"s assumes too much of a fixed structure and unnecessarily defines your interface in a nongeneric way. Doing this makes it difficult to reuse algorithms as your tree expands into non-binary forms. Instead, you will need to code your interfaces for multiple styles when such is not necessary or useful.
Also, coding with hasLeft/hasRight checks means every access is a two step process. Checking existence of a fixed position will not provide for efficient algorithms. Instead, I think you will find that adding a generic property that may be binary left/right or binary red/black or a character index or anything else will allow much more reuse of your algorithms and checking that data can be done only by those algorithms that need it (specific binary algorithms).
From a semantic view, you want to encode some basic properties first, and then specialise. When you are "at" a node inside an algorithm, you want to be able to first find the children edges. This should be a container range of edge structures that allow you to navigate to the children nodes. Since it can be a generic container, it could have 0, 2, 5, 1, or even a 100 edges in it. Many algorithms do not care. If it has 0, iterating over the range will do nothing - no need to check hasX or hasY. For each edge, you should be able to get the node of the child, and recurse for whatever algorithm you wish.
This is basically the approach boost takes in it's Graph library, and it allows for expansion of tree algorithms to graphs where they are applicable, for even better generic algorithm reuse.
So already you have a basic interface with this
TreeNode:
getChildEdges: () -> TreeEdgeRange
TreeEdge:
getChildNode: () -> TreeNode
and whatever range-to-object your favorite language enjoys. D has a particularly useful range syntax, for instance.
You will want to have some basic Tree object that gives you nodes. Something like
Tree:
getTreeNodes: () -> TreeNodeRange
starts you out.
Now, if you want to support BinaryTrees, do so as a restriction on this interface. Note that you don't really need any new interface methods, you just need to enforce more invariants - that every TreeNode has 0, 1, or 2 childEdges. Just make an interface type that indicates this semantic restriction:
BinaryTree : Tree
And if you want to support rooted trees, adding an interface layer with
RootedTree : Tree:
getRoot: () -> TreeNode
adds that capability.
The basic idea is that you shouldn't have to add interface methods to add semantic requirements if you are making your classes more specific down the hierarchy. Only add interface methods if there is a new semantic behavior that needs accessed. Otherwise - enforce new invariants on the generic interface.
Eventually, you'll want to decorate nodes and edges with structures that hold data about the node or edge, so you can build Tries and Red-Black trees and all the great tools of advanced algorithmics. So you will want to have
PropertiedTreeNode<Property> : TreeNode:
getProperty: () -> Property
PropertiedTreeEdge<Property> : TreeEdge:
getProperty: () -> Property
Since this is something you will want to allow generic algorithms to work on, the type information of whether a property is a part of the Tree or not should be generic and something algorithms can ignore. This puts you on the design track of boost, where these issues have been resolved very elegantly. I would recommend studying that library if you want ideas on how to build a generic tree algorithm library.
If you follow the above guidelines of types-equating-to-semantic-descriptions, then thee SubTree should be obvious - it's exactly the same type as the tree it is coming from! In fact, you really should not have a SubTree type at all. Instead, you should just have a method of the specific TreeNode type you are dealing with
PropertiedTreeNode<Property>:
getSubTree: () -> PropertiedTree<Property>
And, as in boost, as you encode more of the information of the capabilities of Tree in it's generic properties, you can get new Tree types with broader interface contracts.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With