Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to get median value from sorted map

Tags:

c++

stl

I am using a std::map. Sometimes I will do an operation like: finding the median value of all items. e.g if I add

1 "s"
2 "sdf"
3 "sdfb"
4 "njw"
5 "loo"

then the median is 3.

Is there some solution without iterating over half the items in the map?

like image 801
Raymond Avatar asked Aug 10 '10 06:08

Raymond


2 Answers

I think you can solve the problem by using two std::map. One for smaller half of items (mapL) and second for the other half (mapU). When you have insert operation. It will be either case:

  • add item to mapU and move smallest element to mapL
  • add item to mapL and move greatest element to mapU

In case the maps have different size and you insert element to the one with smaller number of elements you skip the move section. The basic idea is that you keep your maps balanced so the maximum size difference is 1 element. As far as I know STL all operations should work in O(ln(n)) time. Accessing smallest and greatest element in map can be done by using iterator. When you have n_th position query just check map sizes and return greatest element in mapL or smallest element in mapR.

The above usage scenario is for inserting only but you can extend it to deleting items as well but you have to keep track of which map holds item or try to delete from both.

Here is my code with sample usage:

#include <iostream>
#include <string>
#include <map>
using namespace std;

typedef pair<int,string> pis;
typedef map<int,string>::iterator itis;

map<int,string>Left;
map<int,string>Right;

itis get_last(map<int,string> &m){
    return (--m.end());
}

int add_element(int key, string val){
    if (Left.empty()){
        Left.insert(make_pair(key,val));
        return 1;
    }

    pis maxl = *get_last(Left);
    if (key <= maxl.first){
        Left.insert(make_pair(key,val));
        if (Left.size() > Right.size() + 1){
            itis to_rem = get_last(Left);
            pis cpy = *to_rem;
            Left.erase(to_rem);
            Right.insert(cpy);
        }
        return 1;
    } else {
        Right.insert(make_pair(key,val));
        if (Right.size() > Left.size()){
            itis to_rem = Right.begin();
            pis cpy = *to_rem;
            Right.erase(to_rem);
            Left.insert(*to_rem);
        }
        return 2;
    }   
}

pis get_mid(){
    int size = Left.size() + Right.size();
    if (Left.size() >= size / 2){
        return *(get_last(Left));
    }
    return *(Right.begin());
}

int main(){
    Left.clear();
    Right.clear();

    int key;
    string val;
    while (!cin.eof()){
        cin >> key >> val;
        add_element(key,val);
        pis mid = get_mid();
        cout << "mid " << mid.first << " " << mid.second << endl;
    }
}
like image 124
jethro Avatar answered Oct 22 '22 10:10

jethro


I know no way to get the median from a pure STL map quickly for big maps. If your map is small or you need the median rarely you should use the linear advance to n/2 anyway I think - for the sake of simplicity and being standard.

You can use the map to build a new container that offers median: Jethro suggested using two maps, based on this perhaps better would be a single map and a continuously updated median iterator. These methods suffer from the drawback that you have to reimplement every modifiying operation and in jethro's case even the reading operations.

A custom written container will also do what you what, probably most efficiently but for the price of custom code. You could try, as was suggested to modify an existing stl map implementation. You can also look for existing implementations.

There is a super efficient C implementation that offers most map functionality and also random access called Judy Arrays. These work for integer, string and byte array keys.

like image 30
Peter G. Avatar answered Oct 22 '22 12:10

Peter G.