Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python equivalent of std::set and std::multimap

I'm porting a C++ program to Python. There are some places where it uses std::set to store objects that define their own comparison operators. Since the Python standard library has no equivalent of std::set (a sorted key-value mapping data structure) I tried using a normal dictionary and then sorting it when iterating, like this:

def __iter__(self):
    items = self._data.items()
    items.sort()
    return iter(items)

However, profiling has shown that all the calls from .sort() to __cmp__ are a serious bottleneck. I need a better data structure - essentially a sorted dictionary. Does anyone know of an existing implementation? Failing that, any recommendations on how I should implement this? Read performance is more important than write performance and time is more important than memory.

Bonus points if it supports multiple values per key, like the C++ std::multimap.

Note that the OrderedDict class doesn't fit my needs, because it returns items in the order of insertion, whereas I need them sorted using their __cmp__ methods.

like image 673
EMP Avatar asked Feb 22 '10 07:02

EMP


People also ask

What is Python MultiMap?

MultiMap 1.0. 3Mapping class which allows multiple values per key and preserves order by value; values with the same key are not grouped together.


2 Answers

For the sorted dictionary, you can (ab)use the stable nature of python's timsort: basically, keep the items partially sorted, append items at the end when needed, switching a "dirty" flag, and sort the remaining before iterating. See this entry for details and implementation (A Martelli's answer): Key-ordered dict in Python

like image 105
LeMiz Avatar answered Oct 03 '22 19:10

LeMiz


You should use sort(key=...).
The key function you use will be related to the cmp you are already using. The advantage is that the key function is called n times whereas the cmp is called nlog n times, and typically key does half the work that cmp does

If you can include your __cmp__() we can probably show you how to convert it to a key function

If you are doing lots of iterations between modifications, you should cache the value of the sorted items.

like image 28
John La Rooy Avatar answered Oct 03 '22 19:10

John La Rooy