
Fastest container or algorithm for unique reusable ids in C++

Tags: c++

I have a need for unique reusable ids. The user can choose his own ids or he can ask for a free one. The API is basically

class IdManager {
public:
  int AllocateId();          // Allocates an id
  void FreeId(int id);       // Frees an id so it can be used again
  bool MarkAsUsed(int id);   // Lets the user register an id.
                             // Returns false if the id was already used.
  bool IsUsed(int id);       // Returns true if id is used.
};

Assume ids happen to start at 1 and progress, 2, 3, etc. This is not a requirement, just to help illustrate.

IdManager mgr;
mgr.MarkAsUsed(3);
printf ("%d\n", mgr.AllocateId());
printf ("%d\n", mgr.AllocateId());
printf ("%d\n", mgr.AllocateId());

Would print

1
2
4

Because id 3 has already been declared used.

What's the best container / algorithm to both remember which ids are used AND find a free id?

If you want a specific use case: OpenGL's glGenTextures, glBindTexture and glDeleteTextures are equivalent to AllocateId, MarkAsUsed and FreeId.

gman asked Apr 12 '10 06:04


2 Answers

My idea is to use std::set and Boost.Interval so that IdManager holds a set of non-overlapping intervals of free ids. AllocateId() is simple and quick: it returns the left boundary of the first free interval. The other two methods are slightly more involved, because they may need to split an existing interval or merge two adjacent ones, but each still costs only a few O(log N) set operations, where N is the number of free intervals.

So this is an illustration of the idea of using intervals:

IdManager mgr;    // Now there is one interval of free IDs:  [1..MAX_INT]
mgr.MarkAsUsed(3);// Now there are two intervals of free IDs: [1..2], [4..MAX_INT]
mgr.AllocateId(); // two intervals:                          [2..2], [4..MAX_INT]
mgr.AllocateId(); // Now there is one interval:              [4..MAX_INT]
mgr.AllocateId(); // Now there is one interval:              [5..MAX_INT]

Here is the code itself:

#include <boost/numeric/interval.hpp>
#include <cstdio>   // printf
#include <limits>
#include <set>


class id_interval 
{
public:
    id_interval(int ll, int uu) : value_(ll,uu)  {}
    bool operator < (const id_interval& ) const;
    int left() const { return value_.lower(); }
    int right() const {  return value_.upper(); }
private:
    boost::numeric::interval<int> value_;
};

class IdManager {
public:
    IdManager();
    int AllocateId();          // Allocates an id
    void FreeId(int id);       // Frees an id so it can be used again
    bool MarkAsUsed(int id);   // Lets the user register an id.
private: 
    typedef std::set<id_interval> id_intervals_t;
    id_intervals_t free_;
};

IdManager::IdManager()
{
    free_.insert(id_interval(1, std::numeric_limits<int>::max()));
}

int IdManager::AllocateId()
{
    id_interval first = *(free_.begin());
    int free_id = first.left();
    free_.erase(free_.begin());
    if (first.left() < first.right()) {  // same test, without overflowing at INT_MAX
        free_.insert(id_interval(first.left() + 1 , first.right()));
    }
    return free_id;
}

bool IdManager::MarkAsUsed(int id)
{
    id_intervals_t::iterator it = free_.find(id_interval(id,id));
    if (it == free_.end()) {
        return false;
    } else {
        id_interval free_interval = *(it);
        free_.erase (it);
        if (free_interval.left() < id) {
            free_.insert(id_interval(free_interval.left(), id-1));
        }
        if (id < free_interval.right()) {  // same test, without overflowing at INT_MAX
            free_.insert(id_interval(id+1, free_interval.right()));
        }
        return true;
    }
}

void IdManager::FreeId(int id)
{
    // find() returns an interval that contains id, so a hit means
    // the id is already free and there is nothing to do.
    if (free_.find(id_interval(id, id)) != free_.end()) {
        return;
    }

    int left = id;
    int right = id;

    // First free interval that starts after id (end() if there is none).
    id_intervals_t::iterator next = free_.upper_bound(id_interval(id, id));

    // Merge with the preceding interval if it ends at id - 1.
    if (next != free_.begin()) {
        id_intervals_t::iterator prev = next;
        --prev;
        if (prev->right() + 1 == id) {
            left = prev->left();
            free_.erase(prev);   // does not invalidate next
        }
    }

    // Merge with the following interval if it starts at id + 1.
    if (next != free_.end() && next->left() == id + 1) {
        right = next->right();
        free_.erase(next);
    }

    free_.insert(id_interval(left, right));
}

bool id_interval::operator < (const id_interval& s) const
{
    return 
      (value_.lower() < s.value_.lower()) && 
      (value_.upper() < s.value_.lower());
}


int main()
{
    IdManager mgr;

    mgr.MarkAsUsed(3);
    printf ("%d\n", mgr.AllocateId());
    printf ("%d\n", mgr.AllocateId());
    printf ("%d\n", mgr.AllocateId());

    return 0;
}
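
The class above leaves out IsUsed() from the question's API. As a rough sketch of the same interval idea without the Boost dependency, the free intervals could be kept in a std::map from left boundary to right boundary. The name IntervalIdManager, the helper FindFree, and the choice to return 0 when no id is left are assumptions made for this illustration, not part of the original answer; it also assumes C++11.

```cpp
#include <iterator>
#include <limits>
#include <map>

// Hypothetical Boost-free variant: free ids are stored as disjoint closed
// intervals [left, right], keyed by their left boundary.
class IntervalIdManager {
    typedef std::map<int, int> Map;  // left boundary -> right boundary

public:
    IntervalIdManager() { free_[1] = std::numeric_limits<int>::max(); }

    // True if id is not inside any free interval (ids below 1 count as used).
    bool IsUsed(int id) const { return FindFree(id) == free_.end(); }

    // Returns the smallest free id, or 0 if none is left (assumed sentinel).
    int AllocateId() {
        if (free_.empty()) return 0;
        int id = free_.begin()->first;
        MarkAsUsed(id);
        return id;
    }

    // Removes id from its free interval, splitting the interval if needed.
    bool MarkAsUsed(int id) {
        Map::const_iterator it = FindFree(id);
        if (it == free_.end()) return false;
        const int left = it->first;
        const int right = it->second;
        free_.erase(it);
        if (left < id) free_[left] = id - 1;
        if (id < right) free_[id + 1] = right;
        return true;
    }

    // Returns id to the pool, merging with adjacent free intervals.
    void FreeId(int id) {
        if (id < 1 || !IsUsed(id)) return;
        int left = id;
        int right = id;
        Map::iterator next = free_.upper_bound(id);
        if (next != free_.begin()) {
            Map::iterator prev = std::prev(next);
            if (prev->second == id - 1) {
                left = prev->first;
                free_.erase(prev);
            }
        }
        if (next != free_.end() && next->first == id + 1) {
            right = next->second;
            free_.erase(next);
        }
        free_[left] = right;
    }

private:
    // Iterator to the free interval containing id, or end() if there is none.
    Map::const_iterator FindFree(int id) const {
        Map::const_iterator it = free_.upper_bound(id);
        if (it == free_.begin()) return free_.end();
        --it;
        return it->second >= id ? it : free_.end();
    }

    Map free_;
};
```

With this sketch, MarkAsUsed(3) followed by three AllocateId() calls yields 1, 2 and 4, matching the example in the question.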
user184968 (8 revs) answered Nov 06 '22 21:11


It would be good to know how many ids you're supposed to keep track of. If there are only a hundred or so, a simple set would do, with a linear traversal to find a new id. If it's more like a few thousand, then the linear traversal will become a performance killer, especially given how cache-unfriendly a set is.
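
As a minimal sketch of the "simple set with linear traversal" option, assuming ids start at 1 (the class name SimpleIdManager is made up for this illustration, and exhaustion of the int range is not handled):

```cpp
#include <set>

// Hypothetical sketch: keep the used ids in an ordered set and scan for
// the first gap when a new id is requested.
class SimpleIdManager {
public:
    bool IsUsed(int id) const { return used_.count(id) != 0; }

    // insert() reports whether the id was newly added.
    bool MarkAsUsed(int id) { return used_.insert(id).second; }

    void FreeId(int id) { used_.erase(id); }

    // Linear scan: walk the sorted ids from 1 until the expected value
    // is missing, then hand out that value.
    int AllocateId() {
        int candidate = 1;  // assumed first valid id
        for (std::set<int>::const_iterator it = used_.lower_bound(1);
             it != used_.end() && *it == candidate; ++it) {
            ++candidate;
        }
        used_.insert(candidate);
        return candidate;
    }

private:
    std::set<int> used_;
};
```

Allocation here is O(N) in the number of used ids, which is the cost the paragraph above warns about for larger id counts.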

Personally, I would go for the following:

  • a set, which makes it easy to keep track of the ids: O(log N)
  • proposing the new id as the current maximum + 1: O(1)

If you don't allocate more than std::numeric_limits<int>::max() ids over the lifetime of the application, this should be fine. Otherwise, use a larger type (make it unsigned, or use a long or long long); that's the easiest place to start.
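
A rough sketch of that suggestion might look like the following. It reads "current maximum" as the largest id currently in the set; a separate, ever-increasing counter would be another reasonable reading. The class name MaxPlusOneIdManager is an assumption for this illustration, and overflow at the top of the int range is deliberately left unhandled.

```cpp
#include <set>

// Hypothetical sketch: used ids live in an ordered set, and a new id is
// always one past the largest id currently in use.
class MaxPlusOneIdManager {
public:
    bool IsUsed(int id) const { return used_.count(id) != 0; }

    // insert() reports whether the id was newly added.
    bool MarkAsUsed(int id) { return used_.insert(id).second; }

    void FreeId(int id) { used_.erase(id); }

    // The largest used id is the last element of the ordered set, so
    // finding it is O(1); the insert that follows is O(log N).
    int AllocateId() {
        // Assumption: ids start at 1; no overflow check at INT_MAX.
        int id = used_.empty() ? 1 : *used_.rbegin() + 1;
        used_.insert(id);
        return id;
    }

private:
    std::set<int> used_;
};
```

Note the trade-off: freed ids below the maximum are not handed out again, so after MarkAsUsed(3) this sketch returns 4, 5, 6 rather than the 1, 2, 4 shown in the question.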

And if it does not suffice, leave me a comment and I'll edit and search for more complicated solutions. But the more complicated the book-keeping, the longer it'll take to execute in practice and the higher the chances of making a mistake.

Matthieu M. answered Nov 06 '22 22:11