Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why doesn't STL's implementation of next_permutation use the binary search?

I came up with this question after reading the excellent answer of std::next_permutation Implementation Explanation. Please refer to that post for an explanation of the algorithm used by STL, but I'll replicate the code here for your reference

#include <vector>
#include <iostream>
#include <algorithm>

using namespace std;

template<typename It>
bool next_permutation(It begin, It end)
{
    if (begin == end)
        return false;

    It i = begin;
    ++i;
    if (i == end)
        return false;

    i = end;
    --i;

    while (true)
    {
        It j = i;
        --i;

        if (*i < *j)
        {
            It k = end;

            while (!(*i < *--k))
                /* pass */;

            iter_swap(i, k);
            reverse(j, end);
            return true;
        }

        if (i == begin)
        {
            reverse(begin, end);
            return false;
        }
    }
}

int main()
{
    vector<int> v = { 1, 2, 3, 4 };

    do
    {
        for (int i = 0; i < 4; i++)
        {
            cout << v[i] << " ";
        }
        cout << endl;
    }
    while (::next_permutation(v.begin(), v.end()));
}

Let's take a look at this part

It k = end;

while (!(*i < *--k))
    /* pass */;

iter_swap(i, k);

From my understanding, this linear scan could be replaced by a binary search, because by construction the elements after i are already in descending (or in case of duplicate elements, non-ascending) order. Suppose we have a boolean array whose items are *idx > *i for each j <= idx < end, then all I need to do is to find the largest index whose item is True. Such an index must exist because we have *j > *i, which means the array starts with a True.

I don't know enough C++ to confidently provide a working example, but here is a complete implementation of next_permutation in Rust. If you don't know Rust, then the following pseudocode should give a good idea of what I mean by "binary search". (Well yes, it's Python, which is readable enough to be referred to as pseudocode :)

from typing import List

def bisearch(last: List[bool]) -> int:
    p, q = 0, len(lst) - 1
    while p + 1 < q:
        mid = (p + q) // 2
        if lst[mid]:
            p = mid
        else:
            q = mid

    return q if lst[q] else q - 1


if __name__ == '__main__':
    for pos_count in range(1, 5):
        for neg_count in range(5):
            lst = [True] * pos_count + [False] * neg_count
            assert bisearch(lst) == pos_count - 1

Question: why doesn't STL's implementation of next_permutation use the binary search? I understand that finding i requires O(n), and that both O(n) + O(n) and O(n) + O(ln(n)) are O(n), but practically speaking binary search should still at least marginally improve performance?

like image 677
nalzok Avatar asked Jun 24 '21 01:06

nalzok


1 Answers

As @RichardCritten points out, just because we have better algorithmic complexity does not mean that the execution is faster. Additionally, the implementation is a bit more complex.

Below, we have made a very simple alteration to the original algorithm in the middle part.

Original Code:

if (*i < *j)
{
    It k = end;
    
    while (!(*i < *--k))
        /* pass */;
    
    iter_swap(i, k);
    reverse(j, end);
    return true;
}

Simple Binary Approach

if (*i < *j)
{
    auto test = std::lower_bound(std::make_reverse_iterator(end),
                                 std::make_reverse_iterator(i), *i);
    
    std::iter_swap(i, test);
    std::reverse(j, end);
    return true;
}

We make use of std::lower_bound as well as std::make_reverse_iterator since the given range is in descending order.

It should be noted that this implementation is not full proof and does not work when there are repeats. One of the main points is to demonstrate that even for simple cases, the original implementation is faster.

Here is a live ideone example demonstrating the speed of each approach. In our tests, we measure how long each approach takes to generate 10! = 3,628,800 permutations 100 times.

You will note that the linear implementation is almost twice as fast.

like image 85
Joseph Wood Avatar answered Oct 05 '22 23:10

Joseph Wood