Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A data-structure for 1:1 mappings in python?

I have a problem which requires a reversable 1:1 mapping of keys to values.

That means sometimes I want to find the value given a key, but at other times I want to find the key given the value. Both keys and values are guaranteed unique.

x = D[y]
y == D.inverse[x]

The obvious solution is to simply invert the dictionary every time I want a reverse-lookup: Inverting a dictionary is very easy, there's a recipe here but for a large dictionary it can be very slow.

The other alternative is to make a new class which unites two dictionaries, one for each kind of lookup. That would most likely be fast but would use up twice as much memory as a single dict.

So is there a better structure I can use?

  • My application requires that this should be very fast and use as little as possible memory.
  • The structure must be mutable, and it's strongly desirable that mutating the object should not cause it to be slower (e.g. to force a complete re-index)
  • We can guarantee that either the key or the value (or both) will be an integer
  • It's likely that the structure will be needed to store thousands or possibly millions of items.
  • Keys & Valus are guaranteed to be unique, i.e. len(set(x)) == len(x) for for x in [D.keys(), D.valuies()]
like image 729
Salim Fadhley Avatar asked May 14 '09 15:05

Salim Fadhley


People also ask

Is there a map data structure in Python?

Python Maps also called ChainMap is a type of data structure to manage multiple dictionaries together as one unit. The combined dictionary contains the key and value pairs in a specific sequence eliminating any duplicate keys.

What is mapping data type in Python?

The mapping objects are used to map hash table values to arbitrary objects. In python there is mapping type called dictionary. It is mutable. The keys of the dictionary are arbitrary. As the value, we can use different kind of elements like lists, integers or any other mutable type objects.


3 Answers

The other alternative is to make a new class which unites two dictionaries, one for each kind of lookup. That would most likely be fast but would use up twice as much memory as a single dict.

Not really. Have you measured that? Since both dictionaries would use references to the same objects as keys and values, then the memory spent would be just the dictionary structure. That's a lot less than twice and is a fixed ammount regardless of your data size.

What I mean is that the actual data wouldn't be copied. So you'd spend little extra memory.

Example:

a = "some really really big text spending a lot of memory"

number_to_text = {1: a}
text_to_number = {a: 1}

Only a single copy of the "really big" string exists, so you end up spending just a little more memory. That's generally affordable.

I can't imagine a solution where you'd have the key lookup speed when looking by value, if you don't spend at least enough memory to store a reverse lookup hash table (which is exactly what's being done in your "unite two dicts" solution).

like image 193
nosklo Avatar answered Oct 16 '22 15:10

nosklo


class TwoWay:
    def __init__(self):
       self.d = {}
    def add(self, k, v):
       self.d[k] = v
       self.d[v] = k
    def remove(self, k):
       self.d.pop(self.d.pop(k))
    def get(self, k):
       return self.d[k]
like image 36
user17918 Avatar answered Oct 16 '22 16:10

user17918


The other alternative is to make a new class which unites two dictionaries, one for each > kind of lookup. That would most likely use up twice as much memory as a single dict.

Not really, since they would just be holding two references to the same data. In my mind, this is not a bad solution.

Have you considered an in-memory database lookup? I am not sure how it will compare in speed, but lookups in relational databases can be very fast.

like image 5
Shane C. Mason Avatar answered Oct 16 '22 15:10

Shane C. Mason