To help teach myself C++, I'm working on a red-black tree implementation. Coming from Haskell, I, thought it'd be interesting to see if I could enforce the properties of a red-black tree statically in C++'s type system:
- A node is either red or black.
- The root is black [...]
- All leaves (NIL) are black.
- If a node is red, then both its children are black.
- Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes. [...]
I figured out how to craft the types for the nodes of the tree to satisfy constraints 1, 3, 4, and 5 using templates:
template <typename Key, typename Value>
class RedBlackTree {
private:
enum class color { Black, Red };
// [1. A node is either red or black]
template <color Color, size_t Height>
struct Node {};
// [3. All leaves are black]
struct Leaf : public Node<color::Black, 0> {};
template <color Left, color Right, size_t ChildHeight>
struct Branch {
public:
template <color ChildColor>
using Child = unique_ptr<Node<ChildColor, ChildHeight>>;
Key key;
Value value;
Child<Left> left;
Child<Right> right;
Branch(Key&& key, Value&& value, Child<Left> left, Child<Right> right) :
key { key }, value { value }, left { left }, right { right } {}
};
// [4. If a node is red, then both its children are black.]
// [5. Every path from a given node to any of its descendant NIL nodes contains
// the same number of black nodes.]
template <size_t Height>
struct RedBranch: public Node<color::Red, Height>
, public Branch<color::Black, color::Black, Height> {
public:
using RedBlackTree::Branch;
};
// [5. Every path from a given node to any of its descendant NIL nodes contains
// the same number of black nodes.]
template <size_t Height, color Left, color Right>
struct BlackBranch: public Node<color::Black, Height>
, public Branch<Left, Right, Height-1> {
public:
using RedBlackTree::Branch;
};
// ...
};
Where I'm stymied is giving the root
pointer that will be stored in the RedBlackTree
instance a type that satisfied property 2 but is still useful.
What I want is something like:
template <typename Key, typename Value>
class RedBlackTree {
//...
unique_ptr<Node<color::Black,?>> root = std::make_unique<Leaf>();
//...
}
(to borrow syntax from Java), so I can wildcard over the height of the tree. This of course, doesn't work.
I could get my code to compile if I did
template <typename Key, typename Value, size_t TreeHeight>
class RedBlackTree {
//...
unique_ptr<Node<color::Black,TreeHeight>> root = std::make_unique<Leaf>();
//...
}
But this isn't the type I want for the tree - I don't want the type of the tree itself
to reflect its height, otherwise the type of my tree might change when I insert a
key-value pair. I want to be able to update my root
to contain a pointer to a black
Node
of any height.
Back in haskell-land, I would have solved this problem using existential quantification:
data Color = Black | Red
data Node (color :: Color) (height :: Nat) key value where
Leaf :: Node 'Black 0 key value
BlackBranch :: Branch left right height key value -> Node 'Black (height+1) key value
RedBranch :: Branch 'Black 'Black height key value -> Node 'Red height key value
data Branch (left :: Color) (right :: Color) (childHeight :: Nat) key value = Branch
{ left :: Node left childHeight key value
, right :: Node right childHeight key value
, key :: key
, value :: value
}
data RedBlackTree key value where
RedBlackTree :: { root :: Node 'Black height key value } -> RedBlackTree key value
Is there an equivalent concept in C++14 (or maybe C++17), or an alternative way I could write my struct
defintions to be able to give root
a useful and correct type?
There are two types of quantifiers: universal quantifier and existential quantifier.
The symbol ∃ is called the existential quantifier. which is true when P(x) is true for every x. x . The symbol ∀ is called the universal quantifier.
The phrase "for every x'' (sometimes "for all x'') is called a universal quantifier and is denoted by ∀x. The phrase "there exists an x such that'' is called an existential quantifier and is denoted by ∃x.
It is usually denoted by the logical operator symbol ∃, which, when used together with a predicate variable, is called an existential quantifier ("∃x" or "∃(x)").
template<class K, class T>
struct NodeView {
virtual NodeView const* left() const = 0;
virtual NodeView const* right() const = 0;
virtual K const& key() const = 0;
virtual T const& value() const = 0;
private:
~NodeView() {} // no deleting it!
};
this is an interface.
Have your tree nodes implement this interface. They are permitted and encouraged to return nullptr
when they have no child on one side or the other.
In the base structure, take a root node as a template parameter. Check that it is black using template tomfoolery.
Use make_shared
to store it in a std::shared_ptr
auto tree = std::make_shared<std::decay_t<decltype(tree)>>(decltype(tree)(tree));
std::shared_ptr<NodeView const> m_tree = std::move(tree);
Where the m_tree
member is a member of your root management structure.
You now have read-only access to a generic tree. It is guaranteed to be a balanced red-black tree at compile time by the code that stores it. At run time, it is just guaranteed to be a tree.
You could leak more guarantee information into the interface I wrote above, but that would clutter the interface beyond what a reader usually needs to know. (like, have a different Red
and Black
interface node type).
Now if the body of one short function is too much to trust, and you'd rather trust a wall of template code, we can do this:
template<template<class...>class Test, class T>
struct restricted_shared_ptr {
template<class U,
std::enable_if_t< Test<U>{}, int> = 0
>
restricted_shared_ptr( std::shared_ptr<U> pin ):ptr(std::move(pin)) {}
restricted_shared_ptr(restricted_shared_ptr const&)=default;
restricted_shared_ptr(restricted_shared_ptr &&)=default;
restricted_shared_ptr& operator=(restricted_shared_ptr const&)=default;
restricted_shared_ptr& operator=(restricted_shared_ptr &&)=default;
restricted_shared_ptr() = default;
T* get() const { return ptr.get(); }
explicit operator bool() const { return (bool)ptr; }
T& operator*() const { return *ptr.get(); }
T* operator->() const { return ptr.get(); }
private:
std::shared_ptr<T> ptr;
};
Now we just write an arbitrary template check that says "this is good enough to satisfy me".
And store a restricted_shared_ptr< MyCheck, NodeView<K,T> const >
. There is no way to store the a type within this shared pointer that does not pass MyCheck
without undefined behavior.
Here, you need to trust MyCheck
's constructor to do what it says it does.
template<class T>
struct IsBlackNode:std::false_type{};
template<class K, class V, std::size_t Height, class Left, class Right>
struct IsBlackNode< BlackNode<K, V, Height, Left, Right> >:std::true_type{};
which is a requirement that only BlackNode
s can pass.
So a restricted_shared_ptr< IsBlackNode, NodeView<K, T> const >
is a shared pointer to something that passes the IsBlackNode
test and implements the NodeView<K,T>
interface.
Yakk's answer is more idiomatic C++ - this answer shows how to write (or at least begin writing) something more similar to the Haskell version.
When you see how much C++ is required to emulate Haskell, you may choose to use the native idioms instead.
most of your Haskell invariants (and properties) aren't really expressed statically in the type at all, but in the (runtime) code of the various constructors. The type system helps by guaranteeing one of those constructors really ran, and tracking which one it was for pattern-matching dispatch.
IIUC, your Haskell Node
type doesn't have four type parameters, but two:
data Node (color :: Color) (height :: Nat) key value where
fixes the types of the color and height - only the key and value types are undetermined. All four are records, but two of those records have fixed types.
So, the closest simple translation would be
template <typename Key, typename Value>
struct Node {
Color color_;
size_t height_;
Key key_;
Value val_;
};
The tricky part is there's no direct support for your different constructors - that's runtime information Haskell tracks for you.
So, a Leaf
is a Node
, whose constructor initialized the color and height fields for you, but the constructor used is also tracked as a part of the object created.
The closest equivalent to this, and the pattern-matching it gives you, would be a variant type like Boost.Variant, giving you something like:
// underlying record storage
template <typename Key, typename Value>
struct NodeBase {
Color color_;
size_t height_;
Key key_;
Value val_;
};
template <typename Key, typename Value>
struct Leaf: public NodeBase<Key,Value> {
Leaf(Key k, Value v) : NodeBase{Color::Black, 0, k, v} {}
// default other ctors here
};
// similarly for your BlackBranch and RedBranch constructors
template <typename Key, typename Value>
using Node = boost::variant<Leaf<Key,Value>,
RedBranch<Key,Value>,
BlackBranch<Key,Value>>;
note again that your Branch type has records for leftColor, rightColor, childHeight, and that only the key and value create type parameters.
Finally, where you'd use pattern matching to write a function on your different Node constructors in Haskell, you'd use
template <typename Key, typename Value>
struct myNodeFunction {
void operator() (Leaf &l) {
// use l.key_, l.value_, confirm l.height_==0, etc.
}
void operator() (RedBranch &rb) {
// use rb.key_, rb.value_, confirm rb.color_==Color::Red, etc.
}
void operator() (BlackBranch &bb) {
// you get the idea
}
};
and apply it like:
boost::apply_visitor(myNodeFunction<K,V>(), myNode);
or, if you're using this pattern a lot, you can wrap it up as
template <typename Key, typename Value,
template Visitor<typename,typename> >
void apply(Node<Key,Value> &node)
{
boost::apply_visitor(Visitor<Key,Value>{}, node);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With