I need a data structure that supports FAST insertion and deletion of (key, value) pairs, as well as "get random key", which does the same thing as random.choice(dict.keys()) for a dictionary. I've searched on the internet, and most people seem to be satisfied with the random.choice(dict.keys()) approach, despite it being linear time.
I'm aware that implementing this faster is possible:
Is there any easy way to get this in Python, though? It seems like there should be!
To get a random value from a dictionary in Python, combine random.choice() with list() and the dictionary's values() method. To get a random key instead, use keys() in place of values().
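A minimal sketch of that approach, assuming Python 3, where keys(), values() and items() return views that have to be turned into a list before random.choice() can index them. The dictionary d is made up for illustration. Each list() call copies every entry, so this is the linear-time approach the question wants to avoid.

```python
import random

d = {'a': 'Alice', 'b': 'Bob', 'c': 'Carrie'}  # illustrative data only

# random.choice() needs a sequence, so materialize the view first (O(n) copy).
random_key = random.choice(list(d))             # same as list(d.keys())
random_value = random.choice(list(d.values()))
random_pair = random.choice(list(d.items()))    # a (key, value) tuple
```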
values()[0] to pull out the first value of a list inside a dictionary.
We can also find the key for a given value by iterating over dict.items(), comparing each value against the one we are looking for, and printing the corresponding key when it matches.
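A short sketch of that reverse lookup. The helper name keys_for_value is hypothetical, and the sample data is invented. Because several keys can share a value, it returns a list, and it scans the whole dictionary, so it is O(n).

```python
def keys_for_value(d, target):
    """Return every key whose value equals target (linear scan over items)."""
    return [k for k, v in d.items() if v == target]

d = {'a': 'Alice', 'b': 'Bob', 'c': 'Alice'}  # illustrative data only
print(keys_for_value(d, 'Alice'))  # ['a', 'c']
```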
This may not be directly relevant to the specific use case above, but this is the question I land on when searching for a nice way to get hold of "any" key in a dictionary.
If you don't need a truly random choice, but just need some arbitrary key, here are two simple options I've found:
key = next(iter(d)) # may be a little expensive, but presumably O(1)
The second is really useful only if you're happy to consume the key+value from the dictionary, and due to the mutation(s) will not be as algorithmically efficient:
key, value = d.popitem() # may not be O(1) especially if next step
if MUST_LEAVE_VALUE:
    d[key] = value
[edit: Completely rewritten, but keeping question here with comments intact.]
Below is an implementation of a dictionary wrapper with O(1) get/insert/delete and O(1) picking of a random element.
The main idea is that we want an O(1) but arbitrary map from range(len(mapping)) to the keys. This lets us draw random.randrange(len(mapping)) and pass the result through that map to get a random key.
This is very difficult to implement until you realize that we can take advantage of the fact that the mapping can be arbitrary. The key idea to achieve a hard bound of O(1) time is this: whenever you delete an element, you swap it with the highest arbitrary-id element, and update any pointers.
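The swap trick can be sketched on its own before looking at the full class. This is a simplified illustration, not the class below: it keeps the ids as positions in a plain list, and the names swap_remove, keys and index_of are hypothetical.

```python
def swap_remove(keys, index_of, key):
    """Remove key in O(1) by moving the last key into the slot it leaves behind."""
    i = index_of.pop(key)    # slot that is about to become empty
    last = keys.pop()        # drop the final slot (O(1) for a list)
    if i < len(keys):        # the removed key was not in the final slot
        keys[i] = last       # fill the gap with the former last key
        index_of[last] = i   # and record its new position

keys = ['a', 'b', 'c', 'd']
index_of = {k: i for i, k in enumerate(keys)}
swap_remove(keys, index_of, 'b')
print(keys)      # ['a', 'd', 'c']
print(index_of)  # {'a': 0, 'c': 2, 'd': 1}
```

After the removal the ids are still exactly 0..len(keys)-1 with no holes, which is what makes a uniform random pick by index possible.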
import random

class RandomChoiceDict(object):
    def __init__(self):
        self.mapping = {}  # wraps a dictionary
                           # e.g. {'a':'Alice', 'b':'Bob', 'c':'Carrie'}
        # the arbitrary mapping mentioned above
        self.idToKey = {}  # e.g. {0:'a', 1:'c', 2:'b'},
                           #      or {0:'b', 1:'a', 2:'c'}, etc.
        self.keyToId = {}  # needed to help delete elements
Get, set, and delete:
    def __getitem__(self, key):  # O(1)
        return self.mapping[key]

    def __setitem__(self, key, value):  # O(1)
        if key in self.mapping:
            self.mapping[key] = value
        else:  # new item
            newId = len(self.mapping)
            self.mapping[key] = value
            # add it to the arbitrary bijection
            self.idToKey[newId] = key
            self.keyToId[key] = newId

    def __delitem__(self, key):  # O(1)
        del self.mapping[key]  # O(1) average case
                               # see http://wiki.python.org/moin/TimeComplexity
        emptyId = self.keyToId[key]
        largestId = len(self.mapping)  # about to be deleted
        largestIdKey = self.idToKey[largestId]  # going to store this in empty Id
        # swap deleted element with highest-id element in arbitrary map:
        self.idToKey[emptyId] = largestIdKey
        self.keyToId[largestIdKey] = emptyId
        del self.keyToId[key]
        del self.idToKey[largestId]
Picking a random (key,element):
    def randomItem(self):  # O(1)
        r = random.randrange(len(self.mapping))
        k = self.idToKey[r]
        return (k, self.mapping[k])
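Since the ids are always the dense range 0..n-1, a plain list can stand in for the id-to-key dictionary. Below is a compact sketch of that variant, assuming Python 3. The class name ListBackedRandomDict and the method random_item are hypothetical, and deriving from collections.abc.MutableMapping is a design choice made here to get get(), pop(), update() and friends for free. It is not part of the answer's code.

```python
import random
from collections.abc import MutableMapping

class ListBackedRandomDict(MutableMapping):
    """Hypothetical variant: a dict with O(1) average-case random_item()."""

    def __init__(self):
        self._data = {}   # key -> (index in self._keys, value)
        self._keys = []   # dense list of keys, in arbitrary order

    def __getitem__(self, key):
        return self._data[key][1]

    def __setitem__(self, key, value):
        if key in self._data:                 # existing key keeps its index
            self._data[key] = (self._data[key][0], value)
        else:                                 # new key goes in the last slot
            self._data[key] = (len(self._keys), value)
            self._keys.append(key)

    def __delitem__(self, key):
        i, _ = self._data.pop(key)            # raises KeyError if missing
        last = self._keys.pop()
        if i < len(self._keys):               # deleted key was not in the last slot
            self._keys[i] = last
            self._data[last] = (i, self._data[last][1])

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)

    def random_item(self):
        """Return a uniformly random (key, value); IndexError if empty."""
        key = random.choice(self._keys)
        return key, self._data[key][1]

d = ListBackedRandomDict()
d.update(a='Alice', b='Bob', c='Carrie')
del d['a']
key, value = d.random_item()   # one of ('b', 'Bob') or ('c', 'Carrie')
```

As with the dictionary-based version, the O(1) bounds are average-case because they rest on hash-table lookups.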