To help teach myself C++, I'm working on a red-black tree implementation. Coming from Haskell, I, thought it'd be interesting to see if I could enforce the <a href="https://en.wikipedia.org/wiki/Red%E2%80%93black_tree#Properties">properties of a red-black tree</a> statically in C++'s type system: <blockquote> <ol> <li>A node is either red or black.</li> <li>The root is black [...]</li> <li>All leaves (NIL) are black.</li> <li>If a node is red, then both its children are black.</li> <li>Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes. [...]</li> </ol> </blockquote> I figured out how to craft the types for the nodes of the tree to satisfy constraints 1, 3, 4, and 5 using templates: <pre class="prettyprint"><code>template <typename Key, typename Value> class RedBlackTree { private: enum class color { Black, Red }; // [1. A node is either red or black] template <color Color, size_t Height> struct Node {}; // [3. All leaves are black] struct Leaf : public Node<color::Black, 0> {}; template <color Left, color Right, size_t ChildHeight> struct Branch { public: template <color ChildColor> using Child = unique_ptr<Node<ChildColor, ChildHeight>>; Key key; Value value; Child<Left> left; Child<Right> right; Branch(Key&& key, Value&& value, Child<Left> left, Child<Right> right) : key { key }, value { value }, left { left }, right { right } {} }; // [4. If a node is red, then both its children are black.] // [5. Every path from a given node to any of its descendant NIL nodes contains // the same number of black nodes.] template <size_t Height> struct RedBranch: public Node<color::Red, Height> , public Branch<color::Black, color::Black, Height> { public: using RedBlackTree::Branch; }; // [5. Every path from a given node to any of its descendant NIL nodes contains // the same number of black nodes.] template <size_t Height, color Left, color Right> struct BlackBranch: public Node<color::Black, Height> , public Branch<Left, Right, Height-1> { public: using RedBlackTree::Branch; }; // ... }; </code></pre> Where I'm stymied is giving the <code>root</code> pointer that will be stored in the <code>RedBlackTree</code> instance a type that satisfied property 2 but is still useful. What I want is something like: <pre class="prettyprint"><code>template <typename Key, typename Value> class RedBlackTree { //... unique_ptr<Node<color::Black,?>> root = std::make_unique<Leaf>(); //... } </code></pre> (to borrow syntax from Java), so I can wildcard over the height of the tree. This of course, doesn't work. I could get my code to compile if I did <pre class="prettyprint"><code>template <typename Key, typename Value, size_t TreeHeight> class RedBlackTree { //... unique_ptr<Node<color::Black,TreeHeight>> root = std::make_unique<Leaf>(); //... } </code></pre> But this isn't the type I want for the tree - I don't want the type of the tree itself to reflect its height, otherwise the type of my tree might change when I insert a key-value pair. I want to be able to update my <code>root</code> to contain a pointer to a black <code>Node</code> of any height. Back in haskell-land, I would have solved this problem using existential quantification: <pre class="prettyprint lang-hs prettyprint-override"><code>data Color = Black | Red data Node (color :: Color) (height :: Nat) key value where Leaf :: Node 'Black 0 key value BlackBranch :: Branch left right height key value -> Node 'Black (height+1) key value RedBranch :: Branch 'Black 'Black height key value -> Node 'Red height key value data Branch (left :: Color) (right :: Color) (childHeight :: Nat) key value = Branch { left :: Node left childHeight key value , right :: Node right childHeight key value , key :: key , value :: value } data RedBlackTree key value where RedBlackTree :: { root :: Node 'Black height key value } -> RedBlackTree key value </code></pre> Is there an equivalent concept in C++14 (or maybe C++17), or an alternative way I could write my <code>struct</code> defintions to be able to give <code>root</code> a useful and correct type?

<pre class="prettyprint"><code>template<class K, class T> struct NodeView { virtual NodeView const* left() const = 0; virtual NodeView const* right() const = 0; virtual K const& key() const = 0; virtual T const& value() const = 0; private: ~NodeView() {} // no deleting it! }; </code></pre> this is an interface. Have your tree nodes implement this interface. They are permitted and encouraged to return <code>nullptr</code> when they have no child on one side or the other. In the base structure, take a root node as a template parameter. Check that it is black using template tomfoolery. Use <code>make_shared</code> to store it in a <code>std::shared_ptr</code> <pre class="prettyprint"><code>auto tree = std::make_shared<std::decay_t<decltype(tree)>>(decltype(tree)(tree)); std::shared_ptr<NodeView const> m_tree = std::move(tree); </code></pre> Where the <code>m_tree</code> member is a member of your root management structure. You now have read-only access to a generic tree. It is guaranteed to be a balanced red-black tree at compile time by the code that stores it. At run time, it is just guaranteed to be a tree. You could leak more guarantee information into the interface I wrote above, but that would clutter the interface beyond what a reader usually needs to know. (like, have a different <code>Red</code> and <code>Black</code> interface node type). Now if the body of one short function is too much to trust, and you'd rather trust a wall of template code, we can do this: <pre class="prettyprint"><code>template<template<class...>class Test, class T> struct restricted_shared_ptr { template<class U, std::enable_if_t< Test{}, int> = 0 > restricted_shared_ptr( std::shared_ptr pin ):ptr(std::move(pin)) {} restricted_shared_ptr(restricted_shared_ptr const&)=default; restricted_shared_ptr(restricted_shared_ptr &&)=default; restricted_shared_ptr& operator=(restricted_shared_ptr const&)=default; restricted_shared_ptr& operator=(restricted_shared_ptr &&)=default; restricted_shared_ptr() = default; T* get() const { return ptr.get(); } explicit operator bool() const { return (bool)ptr; } T& operator*() const { return *ptr.get(); } T* operator->() const { return ptr.get(); } private: std::shared_ptr<T> ptr; }; </code></pre> Now we just write an arbitrary template check that says "this is good enough to satisfy me". And store a <code>restricted_shared_ptr< MyCheck, NodeView<K,T> const ></code>. There is no way to store the a type within this shared pointer that does not pass <code>MyCheck</code> without undefined behavior. Here, you need to trust <code>MyCheck</code>'s constructor to do what it says it does. <pre class="prettyprint"><code>template<class T> struct IsBlackNode:std::false_type{}; template<class K, class V, std::size_t Height, class Left, class Right> struct IsBlackNode< BlackNode<K, V, Height, Left, Right> >:std::true_type{}; </code></pre> which is a requirement that only <code>BlackNode</code>s can pass. So a <code>restricted_shared_ptr< IsBlackNode, NodeView<K, T> const ></code> is a shared pointer to something that passes the <code>IsBlackNode</code> test and implements the <code>NodeView<K,T></code> interface.

Equivalent of existential quantification in C++?

Tags:

c++

haskell

c++14

existential-type

To help teach myself C++, I'm working on a red-black tree implementation. Coming from Haskell, I, thought it'd be interesting to see if I could enforce the properties of a red-black tree statically in C++'s type system:

A node is either red or black.

The root is black [...]

All leaves (NIL) are black.

If a node is red, then both its children are black.

Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes. [...]

I figured out how to craft the types for the nodes of the tree to satisfy constraints 1, 3, 4, and 5 using templates:

template <typename Key, typename Value>
class RedBlackTree {
private:
  enum class color { Black, Red };

  // [1. A node is either red or black]
  template <color Color, size_t Height>
  struct Node {};

  // [3. All leaves are black]
  struct Leaf : public Node<color::Black, 0> {};

  template <color Left, color Right, size_t ChildHeight>
  struct Branch {
  public:
    template <color ChildColor>
    using Child = unique_ptr<Node<ChildColor, ChildHeight>>;

    Key key;
    Value value;
    Child<Left> left;
    Child<Right> right;

    Branch(Key&& key, Value&& value, Child<Left> left, Child<Right> right) :
      key { key }, value { value }, left { left }, right { right } {}
  };

  // [4. If a node is red, then both its children are black.]
  // [5. Every path from a given node to any of its descendant NIL nodes contains 
  //     the same number of black nodes.]
  template <size_t Height>
  struct RedBranch: public Node<color::Red, Height>
                  , public Branch<color::Black, color::Black, Height> {
  public:
    using RedBlackTree::Branch;
  };

  // [5. Every path from a given node to any of its descendant NIL nodes contains 
  //     the same number of black nodes.]
  template <size_t Height, color Left, color Right>
  struct BlackBranch: public Node<color::Black, Height>
                    , public Branch<Left, Right, Height-1> {
  public:
    using RedBlackTree::Branch;
  };

  // ...
};

Where I'm stymied is giving the root pointer that will be stored in the RedBlackTree instance a type that satisfied property 2 but is still useful.

What I want is something like:

template <typename Key, typename Value>
class RedBlackTree {
  //...
  unique_ptr<Node<color::Black,?>> root = std::make_unique<Leaf>();
  //...
}

(to borrow syntax from Java), so I can wildcard over the height of the tree. This of course, doesn't work.

I could get my code to compile if I did

template <typename Key, typename Value, size_t TreeHeight>
class RedBlackTree {
  //...
  unique_ptr<Node<color::Black,TreeHeight>> root = std::make_unique<Leaf>();
  //...
}

But this isn't the type I want for the tree - I don't want the type of the tree itself to reflect its height, otherwise the type of my tree might change when I insert a key-value pair. I want to be able to update my root to contain a pointer to a black Node of any height.

Back in haskell-land, I would have solved this problem using existential quantification:

data Color = Black | Red

data Node (color :: Color) (height :: Nat) key value where
  Leaf :: Node 'Black 0 key value
  BlackBranch :: Branch left right height key value -> Node 'Black (height+1) key value
  RedBranch :: Branch 'Black 'Black height key value -> Node 'Red height key value

data Branch (left :: Color) (right :: Color) (childHeight :: Nat) key value = Branch
  { left  :: Node left childHeight key value
  , right :: Node right childHeight key value
  , key   :: key
  , value :: value
  }

data RedBlackTree key value where
  RedBlackTree :: { root :: Node 'Black height key value } -> RedBlackTree key value

Is there an equivalent concept in C++14 (or maybe C++17), or an alternative way I could write my struct defintions to be able to give root a useful and correct type?

757

asked Aug 30 '16 16:08

rampion

2 Answers

template<class K, class T>
struct NodeView {
  virtual NodeView const* left() const = 0;
  virtual NodeView const* right() const = 0;
  virtual K const& key() const = 0;
  virtual T const& value() const = 0;
private:
  ~NodeView() {} // no deleting it!
};

this is an interface.

Have your tree nodes implement this interface. They are permitted and encouraged to return nullptr when they have no child on one side or the other.

In the base structure, take a root node as a template parameter. Check that it is black using template tomfoolery.

Use make_shared to store it in a std::shared_ptr

auto tree = std::make_shared<std::decay_t<decltype(tree)>>(decltype(tree)(tree));
std::shared_ptr<NodeView const> m_tree = std::move(tree);

Where the m_tree member is a member of your root management structure.

You now have read-only access to a generic tree. It is guaranteed to be a balanced red-black tree at compile time by the code that stores it. At run time, it is just guaranteed to be a tree.

You could leak more guarantee information into the interface I wrote above, but that would clutter the interface beyond what a reader usually needs to know. (like, have a different Red and Black interface node type).

Now if the body of one short function is too much to trust, and you'd rather trust a wall of template code, we can do this:

template<template<class...>class Test, class T>
struct restricted_shared_ptr {
  template<class U,
    std::enable_if_t< Test<U>{}, int> = 0
  >
  restricted_shared_ptr( std::shared_ptr<U> pin ):ptr(std::move(pin)) {}
  restricted_shared_ptr(restricted_shared_ptr const&)=default;
  restricted_shared_ptr(restricted_shared_ptr &&)=default;
  restricted_shared_ptr& operator=(restricted_shared_ptr const&)=default;
  restricted_shared_ptr& operator=(restricted_shared_ptr &&)=default;
  restricted_shared_ptr() = default;
  T* get() const { return ptr.get(); }
  explicit operator bool() const { return (bool)ptr; }
  T& operator*() const { return *ptr.get(); }
  T* operator->() const { return ptr.get(); }
private:
  std::shared_ptr<T> ptr;
};

Now we just write an arbitrary template check that says "this is good enough to satisfy me".

And store a restricted_shared_ptr< MyCheck, NodeView<K,T> const >. There is no way to store the a type within this shared pointer that does not pass MyCheck without undefined behavior.

Here, you need to trust MyCheck's constructor to do what it says it does.

template<class T>
struct IsBlackNode:std::false_type{};

template<class K, class V, std::size_t Height, class Left, class Right>
struct IsBlackNode< BlackNode<K, V, Height, Left, Right> >:std::true_type{};

which is a requirement that only BlackNodes can pass.

So a restricted_shared_ptr< IsBlackNode, NodeView<K, T> const > is a shared pointer to something that passes the IsBlackNode test and implements the NodeView<K,T> interface.

153

answered Oct 10 '22 18:10

Yakk - Adam Nevraumont

Note

Yakk's answer is more idiomatic C++ - this answer shows how to write (or at least begin writing) something more similar to the Haskell version.

When you see how much C++ is required to emulate Haskell, you may choose to use the native idioms instead.

tl;dr

most of your Haskell invariants (and properties) aren't really expressed statically in the type at all, but in the (runtime) code of the various constructors. The type system helps by guaranteeing one of those constructors really ran, and tracking which one it was for pattern-matching dispatch.

IIUC, your Haskell Node type doesn't have four type parameters, but two:

data Node (color :: Color) (height :: Nat) key value where

fixes the types of the color and height - only the key and value types are undetermined. All four are records, but two of those records have fixed types.

So, the closest simple translation would be

template <typename Key, typename Value>
struct Node {
    Color color_;
    size_t height_;
    Key key_;
    Value val_;
};

The tricky part is there's no direct support for your different constructors - that's runtime information Haskell tracks for you.

So, a Leaf is a Node, whose constructor initialized the color and height fields for you, but the constructor used is also tracked as a part of the object created.

The closest equivalent to this, and the pattern-matching it gives you, would be a variant type like Boost.Variant, giving you something like:

// underlying record storage
template <typename Key, typename Value>
struct NodeBase {
    Color color_;
    size_t height_;
    Key key_;
    Value val_;
};

template <typename Key, typename Value>
struct Leaf: public NodeBase<Key,Value> {
    Leaf(Key k, Value v) : NodeBase{Color::Black, 0, k, v} {}
    // default other ctors here
};
// similarly for your BlackBranch and RedBranch constructors

template <typename Key, typename Value>
using Node = boost::variant<Leaf<Key,Value>,
                            RedBranch<Key,Value>,
                            BlackBranch<Key,Value>>;

note again that your Branch type has records for leftColor, rightColor, childHeight, and that only the key and value create type parameters.

Finally, where you'd use pattern matching to write a function on your different Node constructors in Haskell, you'd use

template <typename Key, typename Value>
struct myNodeFunction {
    void operator() (Leaf &l) {
        // use l.key_, l.value_, confirm l.height_==0, etc.
    }
    void operator() (RedBranch &rb) {
        // use rb.key_, rb.value_, confirm rb.color_==Color::Red, etc.
    }
    void operator() (BlackBranch &bb) {
        // you get the idea
    }
};

and apply it like:

boost::apply_visitor(myNodeFunction<K,V>(), myNode);

or, if you're using this pattern a lot, you can wrap it up as

template <typename Key, typename Value,
          template Visitor<typename,typename> >
void apply(Node<Key,Value> &node)
{
    boost::apply_visitor(Visitor<Key,Value>{}, node);
}

answered Oct 10 '22 19:10

Useless

Related questions
                            
                                Memory debugging tools for Android NDK C++ code
                            
                                What is the fate of wchar_t in c++0x?
                            
                                Combining a linear algebra library with Boost::Units
                            
                                Calling C# method within a Java program
                            
                                Seeking suggestions for data representation of a probability distribution
                            
                                To find all the repeating substring in a given string
                            
                                Creating complex infinity with std::complex<T> in C++
                            
                                Is it possible to view C++ code with instantiated templates before it gets compiled (g++)?
                            
                                Idiomatic C++11 type promotion
                            
                                Running over an unrolled linked list takes around 40% of the code runtime - are there any obvious ways to optimise it?
                            
                                QMap::contains() VS QMap::find()
                            
                                Big csv file c++ parsing performance
                            
                                Should the assignment operator observe the assigned object's rvalueness?
                            
                                How do I use iomanip's setw, setfill, and left/right? Setfill isn't stopping its output
                            
                                Testing if member function exists using variadics
                            
                                Does the standard require std::tuple_size to be SFINAE-friendly?
                            
                                Why does GCC yield -nan and clang and intel yield +nan for 0.0/0.0?
                            
                                Why an inline declaration is not an incomplete type?
                            
                                "default constructor cannot be referenced" in Visual Studio 2015
                            
                                c++ destructor calls a delete operator?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With