Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Implementing Disjoint Sets (Union Find) in C++

I am trying to implement Disjoint Sets for use in Kruskal's algorithm, but I am having trouble understanding exactly how it should be done and in particular, how to manage the forest of trees. After reading the Wikipedia description of Disjoint Sets and after reading the description in Introduction to Algorithms (Cormen et al) I have come up with the following:

    class DisjointSet
    {
    public:
        class   Node 
        {
        public:
            int         data;
            int         rank;

            Node*       parent;

            Node() : data(0), 
                     rank(0),
                     parent(this) { } // the constructor does MakeSet
        };

        Node*   find(Node*);
        Node*   merge(Node*, Node*); // Union
    };


    DisjointSet::Node*   DisjointSet::find(DisjointSet::Node* n)
    {
        if (n != n->parent) {
            n->parent = find(n->parent);
        }
        return n->parent;
    }

    DisjointSet::Node* DisjointSet::merge(DisjointSet::Node* x,
                                          DisjointSet::Node* y)
    {
        x = find(x);
        y = find(y);

        if (x->rank > y->rank) {
            y->parent = x;
        } else {
            x->parent = y;
            if (x->rank == y->rank) {
                ++(y->rank);
            }
        }
    }

I am pretty sure this is incomplete though and that I am missing something.

Introduction to Algorithms mentions that there should be a forest of trees, but it does not give any explanation for a practical implementation of this forest. I watched CS 61B Lecture 31: Disjoint Sets ( http://www.youtube.com/watch?v=wSPAjGfDl7Q ) and here the lecturer uses only an array to store both the forest and all its trees and values. There is no explicit 'Node' type of class as I have, mentioned. I have also found many other sources (I cannot post more than one link), which also use this technique. I would be happy to do this, except that this relies on the indices of the array for lookup and since I want to store values of type other than int, I need to use something else (std::map comes to mind).

Another issue that I am unsure about is memory allocation because I am using C++. I am storing trees of pointers and my MakeSet operation will be: new DisjointSet::Node; . Now, these Nodes only have pointers to their parents, so I'm not sure how to find the bottom of a tree. How will I be able to traverse my trees to deallocate them all?

I understand the basic concept of this data structure, but I'm just a bit confused about the implementation. Any advice and suggestions would be most welcome, thank you.

like image 960
Isaac Avatar asked Dec 21 '10 11:12

Isaac


People also ask

What is Find operation in disjoint-set?

A disjoint-set data structure is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. A union-find algorithm is an algorithm that performs two useful operations on such a data structure: Find: Determine which subset a particular element is in.

In which algorithm find and union operations on disjoint sets may be used?

Applications of Disjoint Set It is used to find Cycle in a graph as in Kruskal's algorithm, DSUs are used.

What is the union of a disjoint-set?

A disjoint set union is a binary operation on two sets. The elements of any disjoint union can be described in terms of ordered pairs as (x, j), where j is the index that represents the origin of the element x. With the help of this operation, we can join all the different (distinct) elements of a pair of sets.


1 Answers

Not a perfect implementation by any means (I did write it after all!), but does this help?

/***
 * millipede: DisjointSetForest.h
 * Copyright Stuart Golodetz, 2009. All rights reserved.
 ***/

#ifndef H_MILLIPEDE_DISJOINTSETFOREST
#define H_MILLIPEDE_DISJOINTSETFOREST

#include <map>

#include <common/exceptions/Exception.h>
#include <common/io/util/OSSWrapper.h>
#include <common/util/NullType.h>

namespace mp {

/**
@brief  A disjoint set forest is a fairly standard data structure used to represent the partition of
        a set of elements into disjoint sets in such a way that common operations such as merging two
        sets together are computationally efficient.

This implementation uses the well-known union-by-rank and path compression optimizations, which together
yield an amortised complexity for key operations of O(a(n)), where a is the (extremely slow-growing)
inverse of the Ackermann function.

The implementation also allows clients to attach arbitrary data to each element, which can be useful for
some algorithms.

@tparam T   The type of data to attach to each element (arbitrary)
*/
template <typename T = NullType>
class DisjointSetForest
{
    //#################### NESTED CLASSES ####################
private:
    struct Element
    {
        T m_value;
        int m_parent;
        int m_rank;

        Element(const T& value, int parent)
        :   m_value(value), m_parent(parent), m_rank(0)
        {}
    };

    //#################### PRIVATE VARIABLES ####################
private:
    mutable std::map<int,Element> m_elements;
    int m_setCount;

    //#################### CONSTRUCTORS ####################
public:
    /**
    @brief  Constructs an empty disjoint set forest.
    */
    DisjointSetForest()
    :   m_setCount(0)
    {}

    /**
    @brief  Constructs a disjoint set forest from an initial set of elements and their associated values.

    @param[in]  initialElements     A map from the initial elements to their associated values
    */
    explicit DisjointSetForest(const std::map<int,T>& initialElements)
    :   m_setCount(0)
    {
        add_elements(initialElements);
    }

    //#################### PUBLIC METHODS ####################
public:
    /**
    @brief  Adds a single element x (and its associated value) to the disjoint set forest.

    @param[in]  x       The index of the element
    @param[in]  value   The value to initially associate with the element
    @pre
        -   x must not already be in the disjoint set forest
    */
    void add_element(int x, const T& value = T())
    {
        m_elements.insert(std::make_pair(x, Element(value, x)));
        ++m_setCount;
    }

    /**
    @brief  Adds multiple elements (and their associated values) to the disjoint set forest.

    @param[in]  elements    A map from the elements to add to their associated values
    @pre
        -   None of the elements to be added must already be in the disjoint set forest
    */
    void add_elements(const std::map<int,T>& elements)
    {
        for(typename std::map<int,T>::const_iterator it=elements.begin(), iend=elements.end(); it!=iend; ++it)
        {
            m_elements.insert(std::make_pair(it->first, Element(it->second, it->first)));
        }
        m_setCount += elements.size();
    }

    /**
    @brief  Returns the number of elements in the disjoint set forest.

    @return As described
    */
    int element_count() const
    {
        return static_cast<int>(m_elements.size());
    }

    /**
    @brief  Finds the index of the root element of the tree containing x in the disjoint set forest.

    @param[in]  x   The element whose set to determine
    @pre
        -   x must be an element in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    @return As described
    */
    int find_set(int x) const
    {
        Element& element = get_element(x);
        int& parent = element.m_parent;
        if(parent != x)
        {
            parent = find_set(parent);
        }
        return parent;
    }

    /**
    @brief  Returns the current number of disjoint sets in the forest (i.e. the current number of trees).

    @return As described
    */
    int set_count() const
    {
        return m_setCount;
    }

    /**
    @brief  Merges the disjoint sets containing elements x and y.

    If both elements are already in the same disjoint set, this is a no-op.

    @param[in]  x   The first element
    @param[in]  y   The second element
    @pre
        -   Both x and y must be elements in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    */
    void union_sets(int x, int y)
    {
        int setX = find_set(x);
        int setY = find_set(y);
        if(setX != setY) link(setX, setY);
    }

    /**
    @brief  Returns the value associated with element x.

    @param[in]  x   The element whose value to return
    @pre
        -   x must be an element in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    @return As described
    */
    T& value_of(int x)
    {
        return get_element(x).m_value;
    }

    /**
    @brief  Returns the value associated with element x.

    @param[in]  x   The element whose value to return
    @pre
        -   x must be an element in the disjoint set forest
    @throw Exception
        -   If the precondition is violated
    @return As described
    */
    const T& value_of(int x) const
    {
        return get_element(x).m_value;
    }

    //#################### PRIVATE METHODS ####################
private:
    Element& get_element(int x) const
    {
        typename std::map<int,Element>::iterator it = m_elements.find(x);
        if(it != m_elements.end()) return it->second;
        else throw Exception(OSSWrapper() << "No such element: " << x);
    }

    void link(int x, int y)
    {
        Element& elementX = get_element(x);
        Element& elementY = get_element(y);
        int& rankX = elementX.m_rank;
        int& rankY = elementY.m_rank;
        if(rankX > rankY)
        {
            elementY.m_parent = x;
        }
        else
        {
            elementX.m_parent = y;
            if(rankX == rankY) ++rankY;
        }
        --m_setCount;
    }
};

}

#endif
like image 145
Stuart Golodetz Avatar answered Sep 28 '22 01:09

Stuart Golodetz