Efficient way to get middle (median) of an std::set?


std::set is a sorted tree. It provides begin and end methods, so I can get the minimum and maximum, and lower_bound and upper_bound for binary search. But what if I want to get an iterator pointing to the middle element (or to one of the two middle elements if the set has an even number of elements)?

Is there an efficient way (O(log(size)) not O(size)) to do that?

{1} => 1
{1,2} => 1 or 2
{1,2,3} => 2
{1,2,3,4} => 2 or 3 (but in the same direction from middle as for {1,2})
{1,312,10000,14000,152333} => 10000
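For reference, the straightforward O(size) approach that the question wants to improve on could look like the sketch below. The helper name middle_linear is hypothetical, and picking the upper of the two middle elements for even sizes is an assumption (the question allows either, as long as it is consistent).

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Linear-time baseline: walk size()/2 steps from begin().
// For an even number of elements this picks the upper of the two middle
// elements; use (s.size() - 1) / 2 to pick the lower one instead.
// middle_linear is a hypothetical helper name, not a standard function.
std::set<int>::const_iterator middle_linear(const std::set<int>& s)
{
    assert(!s.empty());                         // no middle in an empty set
    return std::next(s.begin(), s.size() / 2);  // O(size) for set iterators
}
```

std::next has to step through the tree one node at a time because std::set iterators are bidirectional, not random access, which is where the O(size) cost comes from.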

PS: Same question in Russian.


asked Nov 19 '17 11:11

Qwertiy

2 Answers

Depending on how often you insert/remove items versus look up the middle/median, a possibly more efficient solution than the obvious one is to keep a persistent iterator to the middle element and update it whenever you insert/delete items from the set. There are a bunch of edge cases which will need handling (odd vs even number of items, removing the middle item, empty set, etc.), but the basic idea would be that when you insert an item that's smaller than the current middle item, your middle iterator may need decrementing, whereas if you insert a larger one, you need to increment. It's the other way around for removals.

At lookup time, this is of course O(1). Each insertion/deletion also pays an extra, essentially O(1) cost to update the iterator (on top of the set's own O(log N) work), i.e. O(N) extra after N insertions, which needs to be amortised across a sufficient number of lookups to make this more efficient than brute forcing.
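The idea above can be sketched roughly as follows. This is only one possible way to handle the edge cases, not a reference implementation: the class name MidSet and its members are hypothetical, and it assumes the "middle" is the element at index size()/2 (the upper one for even sizes). It relies on the fact that std::set iterators stay valid across insertions and across erasure of other elements.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical wrapper: a std::set<int> plus an iterator kept at index size()/2.
class MidSet
{
public:
    MidSet() = default;
    MidSet(const MidSet&) = delete;             // a copied iterator would point into the other set
    MidSet& operator=(const MidSet&) = delete;

    void insert(int x)
    {
        auto result = s_.insert(x);
        if (!result.second) return;             // value already present, nothing changes
        std::size_t n = s_.size();              // size after insertion
        if (n == 1) { mid_ = s_.begin(); return; }
        if (x < *mid_) { if (n % 2 == 1) --mid_; }  // new element landed left of middle
        else           { if (n % 2 == 0) ++mid_; }  // new element landed right of middle
    }

    void erase(int x)
    {
        auto it = s_.find(x);
        if (it == s_.end()) return;             // nothing to remove
        std::size_t n = s_.size();              // size before erasure
        // Move mid_ first, so it never refers to the element being erased.
        if (n % 2 == 0) { if (!(x < *mid_)) --mid_; }
        else            { if (!(*mid_ < x)) ++mid_; }
        s_.erase(it);
    }

    // Returns end() when the set is empty.
    std::set<int>::const_iterator middle() const { return mid_; }
    const std::set<int>& data() const { return s_; }

private:
    std::set<int> s_;
    std::set<int>::iterator mid_ = s_.end();
};
```

With this layout, middle() is O(1), and insert/erase do a constant amount of extra work beyond the underlying set operation.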

answered Sep 20 '22 14:09

pmdj

The suggestion in the other answer (keeping a persistent iterator to the middle element and updating it on every insert/erase) is fragile, and as described it will fail if there are duplicate items.

Suggestions

My first suggestion is to use a std::multiset instead of a std::set, so that it still works when items can be duplicated.
My second suggestion is to use 2 multisets, one tracking the smaller portion and one the bigger portion, and to keep their sizes balanced.

Algorithm

1. keep the sets balanced, so that size_of_small==size_of_big or size_of_small + 1 == size_of_big

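#include <cstddef>
#include <set>

// Restore the invariant:
//   small.size() == big.size()  or  small.size() + 1 == big.size()
void balance(std::multiset<int> &small, std::multiset<int> &big)
{
    while (true)
    {
        std::size_t ssmall = small.size();
        std::size_t sbig = big.size();

        if (ssmall == sbig || ssmall + 1 == sbig) break; // balanced

        if (ssmall < sbig)
        {
            // big has too many items: move its smallest item to small
            auto v = big.begin();
            small.insert(*v);
            big.erase(v);
        }
        else
        {
            // small has too many items: move its largest item to big
            auto v = small.end();
            --v;
            big.insert(*v);
            small.erase(v);
        }
    }
}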

2. if the sets are balanced and not both empty, the median item is always the first item in the big set

auto median = big.begin(); // only valid to dereference when big is not empty
std::cout << *median << std::endl;

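3. take care when adding a new item: put it into the correct set first, then rebalance

auto v = big.begin();
if (v != big.end() && new_item > *v)
    big.insert(new_item);   // larger than the current median: goes to big
else
    small.insert(new_item); // otherwise (or if big is empty): goes to small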

balance(small, big);

Complexity explained

Finding the median value is O(1).
Adding a new item takes O(log n).
You can still search for an item in O(log n), but you need to search both sets.
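Putting the steps together, a minimal self-contained sketch could look like this. The class name MedianTracker and its member names are hypothetical, and erase_one is an addition that the steps above do not cover (it assumes removing a single matching item and rebalancing is acceptable).

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical wrapper around the two-multiset idea.
// Invariant: small_.size() == big_.size() or small_.size() + 1 == big_.size(),
// and every item in small_ is <= every item in big_.
class MedianTracker
{
public:
    void add(int x)
    {
        if (!big_.empty() && x > *big_.begin()) big_.insert(x);
        else small_.insert(x);
        balance();
    }

    // Removes one occurrence of x; returns false if x is not stored.
    bool erase_one(int x)
    {
        auto s = small_.find(x);
        if (s != small_.end()) { small_.erase(s); balance(); return true; }
        auto b = big_.find(x);
        if (b != big_.end()) { big_.erase(b); balance(); return true; }
        return false;
    }

    // Searching means looking in both sets, O(log n) each.
    bool contains(int x) const
    {
        return small_.find(x) != small_.end() || big_.find(x) != big_.end();
    }

    // The median is the smallest item of big_ (upper median for even sizes).
    int median() const
    {
        assert(!big_.empty());
        return *big_.begin();
    }

    std::size_t size() const { return small_.size() + big_.size(); }

private:
    void balance()
    {
        while (true)
        {
            std::size_t ssmall = small_.size();
            std::size_t sbig = big_.size();
            if (ssmall == sbig || ssmall + 1 == sbig) break;
            if (ssmall < sbig)
            {
                auto v = big_.begin();       // smallest of big_ moves down
                small_.insert(*v);
                big_.erase(v);
            }
            else
            {
                auto v = small_.end();       // largest of small_ moves up
                --v;
                big_.insert(*v);
                small_.erase(v);
            }
        }
    }

    std::multiset<int> small_;
    std::multiset<int> big_;
};
```

Note that this gives O(1) access to the median value, not an iterator into one sorted container, so it fits best when only the value is needed.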

answered Sep 19 '22 14:09

Clark
