
Python get random key in a dictionary in O(1)

I need a data structure that supports fast insertion and deletion of (key, value) pairs, as well as a "get random key" operation that does the same thing as random.choice(dict.keys()) does for a dictionary (in Python 3, random.choice(list(d.keys())), since keys() returns a view rather than a list). I've searched the internet, and most people seem to be satisfied with the random.choice(dict.keys()) approach, despite it taking linear time.

I'm aware that implementing this faster is possible:

  • I could use a resizing hash table. If I keep the ratio of slots to keys between 1 and 2, then I can just choose random indices until I hit a non-empty slot. I only look at 1 to 2 slots, in expectation.
  • I can get these operations in guaranteed worst-case O(log n) using an AVL tree augmented with rank (subtree sizes).
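
The first idea can be sketched without writing a full hash table. The snippet below is an illustration only: `slots` stands in for an open-addressing slot array with `None` marking empty slots, `count` is the number of occupied slots that the table would maintain, and `random_occupied_slot` is a name made up for this example.

```python
import random

def random_occupied_slot(slots, count, rng=random):
    """Return a uniformly random non-empty entry of `slots` by rejection sampling.

    `slots` stands in for a hash table's slot array; None marks an empty slot.
    `count` is the number of occupied slots (assumed to be tracked by the table).
    The expected number of probes is len(slots) / count.
    """
    if count == 0:
        raise KeyError("table is empty")
    while True:
        entry = slots[rng.randrange(len(slots))]
        if entry is not None:
            return entry

# Hypothetical table with 4 of 8 slots occupied (slots-to-keys ratio of 2).
slots = [None, "b", None, "a", "d", None, None, "c"]
picks = [random_occupied_slot(slots, 4) for _ in range(1000)]
```

Every occupied slot is equally likely to be returned, and with a slots-to-keys ratio of at most 2 the loop runs at most twice on average.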

Is there any easy way to get this in Python, though? It seems like there should be!

asked May 31 '12 20:05 by WuTheFWasThat


2 Answers

This may not be specifically relevant to the use case listed above, but this is the question I found when searching for a way to nicely get hold of "any" key in a dictionary.

If you don't need a truly random choice, but just need some arbitrary key, here are two simple options I've found:

key = next(iter(d))    # may be a little expensive, but presumably O(1)

The second is really only useful if you're happy to consume the key and value from the dictionary; because it mutates the dictionary, it will not be as efficient:

key, value = d.popitem()     # may not be O(1), especially if the item is re-inserted below
if MUST_LEAVE_VALUE:         # placeholder flag: true when the dictionary must stay intact
    d[key] = value
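
As a quick check of what those two calls return: since Python 3.7, dictionaries preserve insertion order, so `next(iter(d))` yields the first-inserted key and `popitem()` removes the last-inserted pair. Neither is random. The dictionary contents here are made up for illustration.

```python
d = {"a": "Alice", "b": "Bob", "c": "Carrie"}

first_key = next(iter(d))      # the oldest key; the dictionary is unchanged
key, value = d.popitem()       # the newest pair, removed from d
d[key] = value                 # put it back if the dictionary must stay intact
```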

answered Oct 26 '22 05:10 by natevw


[edit: Completely rewritten, but keeping question here with comments intact.]

Below is an implementation of a dictionary wrapper with O(1) (average-case) get/insert/delete, and O(1) picking of a random element.

The main idea is that we want to have an O(1) but arbitrary map from range(len(mapping)) to the keys. This lets us draw random.randrange(len(mapping)) and pass the result through that map.

This is very difficult to implement until you realize that we can take advantage of the fact that the map can be arbitrary. The key idea to achieve a hard bound of O(1) time is this: whenever you delete an element, you move the element with the highest id into the vacated id and update the lookup tables in both directions, so the ids in use always remain exactly range(len(mapping)).
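
To make the swap concrete, here is one deletion traced by hand on plain dictionaries. The names `id_to_key` and `key_to_id` and the sample data are illustrative only.

```python
# Ids 0..2 map to keys in an arbitrary order, with the inverse map alongside.
id_to_key = {0: "a", 1: "c", 2: "b"}
key_to_id = {"a": 0, "c": 1, "b": 2}

# Delete 'a': its id becomes a hole, so move the highest-id key into it.
hole = key_to_id.pop("a")              # id that 'a' occupied
last_id = len(id_to_key) - 1           # highest id currently in use
last_key = id_to_key.pop(last_id)      # key that held the highest id
if last_id != hole:                    # nothing to move if 'a' held the last id
    id_to_key[hole] = last_key
    key_to_id[last_key] = hole
```

After the deletion the ids in use are again 0 and 1, so a random id drawn from range(2) always hits a live key.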

import random

class RandomChoiceDict(object):
    def __init__(self):
        self.mapping = {}  # wraps a dictionary
                           # e.g. {'a':'Alice', 'b':'Bob', 'c':'Carrie'}

        # the arbitrary mapping mentioned above
        self.idToKey = {}  # e.g. {0:'a', 1:'c', 2:'b'},
                           #      or {0:'b', 1:'a', 2:'c'}, etc.

        self.keyToId = {}  # inverse of idToKey; needed to help delete elements

Get, set, and delete:

    def __getitem__(self, key):  # O(1)
        return self.mapping[key]

    def __setitem__(self, key, value):  # O(1)
        if key in self.mapping:
            self.mapping[key] = value
        else: # new item
            newId = len(self.mapping)

            self.mapping[key] = value

            # add it to the arbitrary bijection
            self.idToKey[newId] = key
            self.keyToId[key] = newId

    def __delitem__(self, key):  # O(1)
        del self.mapping[key]  # O(1) average case
                               # see http://wiki.python.org/moin/TimeComplexity

        emptyId = self.keyToId[key]
        largestId = len(self.mapping)  # about to be deleted
        largestIdKey = self.idToKey[largestId]  # going to store this in empty Id

        # swap deleted element with highest-id element in arbitrary map:
        self.idToKey[emptyId] = largestIdKey
        self.keyToId[largestIdKey] = emptyId

        del self.keyToId[key]
        del self.idToKey[largestId]

Picking a random (key, value) pair:

    def randomItem(self):  # O(1)
        r = random.randrange(len(self.mapping))
        k = self.idToKey[r]
        return (k, self.mapping[k])
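
A common variant of the same idea stores the keys in a list instead of an id-to-key dictionary, since the ids are always 0..n-1. The sketch below is not the class above but a compact alternative under that assumption; `RandomKeyDict` and its method names are invented for this example.

```python
import random

class RandomKeyDict:
    """Dict-like container with average-case O(1) set, get, delete and random_key."""

    def __init__(self):
        self._values = {}   # key -> value
        self._keys = []     # position -> key, in arbitrary order
        self._pos = {}      # key -> position in self._keys

    def __len__(self):
        return len(self._keys)

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        if key not in self._values:       # new key: append it at the end
            self._pos[key] = len(self._keys)
            self._keys.append(key)
        self._values[key] = value

    def __delitem__(self, key):
        pos = self._pos.pop(key)          # raises KeyError if key is absent
        last_key = self._keys.pop()       # drop the last slot
        if pos < len(self._keys):         # deleted key was not in the last slot
            self._keys[pos] = last_key    # move the last key into the hole
            self._pos[last_key] = pos
        del self._values[key]

    def random_key(self):
        return random.choice(self._keys)  # raises IndexError if empty


d = RandomKeyDict()
for k, v in [("a", "Alice"), ("b", "Bob"), ("c", "Carrie")]:
    d[k] = v
del d["a"]
k = d.random_key()                        # one of the two remaining keys
```

The list makes the "ids are exactly 0..n-1" invariant structural rather than something the code has to maintain by hand, at the cost of one extra container alongside the wrapped dictionary.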

answered Oct 26 '22 05:10 by ninjagecko