What is the fastest way to get an arbitrary element out of a Python dictionary?

Tags: python, dictionary

I have a dict with approximately 17,000 keys. I would like to select one key at a time--it doesn't matter which one, and I don't need it to happen in any particular order (random is fine). However, after I select a key, I will alter the dictionary, perhaps by adding or deleting a key, before selecting another one. Therefore, I do not have a set list of keys that I can iterate through.

Since I don't need to access them in any particular order, I could convert the dict keys into a list each time, and then pop the first element. However, since there are 17,000 keys, making a list takes approximately 0.0005-7 seconds over each iteration, which will take too much time for what I need. Is there a shortcut I could take so that I don't have to compile an enormous list out of dict keys each time I want to select a single key?


asked Nov 29 '16 17:11

hannah

1 Answer

There are multiple ways, but you'll need to make some tradeoffs.

One way is to empty the dictionary out using popitem; it is atomic and returns items in an arbitrary order (last-in, first-out since Python 3.7). But it modifies the dictionary itself: whatever item was selected isn't in it anymore.

The next method that comes to mind is iterating as usual, even though the dictionary is modified between selections. Python raises RuntimeError if a dictionary changes size while an iterator over it is still in use, so each selection has to start a fresh iteration (for example with next(iter(d))); the order of items might change, so you could get the same item any number of times. To track that, you could build a second set of visited keys. It's reasonably cheap to add keys to the set and cheap to check whether each item is in it, and when you've gone through the whole dictionary you can compare the set with the dictionary's keys to determine whether there are ones you missed (or removed). You do end up building a key set, but only one item per iteration; in the pessimal case the dictionary is modified in such a way that we scan through the whole set of visited items before finding a new one.
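A minimal sketch of these two options, assuming Python 3.8 or later. The helper names take_any and next_unvisited and the sample data are made up for illustration. popitem removes an arbitrary pair, while a fresh iteration with a visited set hands out keys without building a list and without removing anything.

```python
def take_any(d):
    """Remove and return an arbitrary (key, value) pair; KeyError if empty."""
    return d.popitem()


def next_unvisited(d, visited):
    """Return some key of d that is not yet in visited, or None if all were seen.

    A fresh iterator is created on every call, so the dictionary may be
    modified freely between calls (but not while this loop is running).
    """
    for key in d:
        if key not in visited:
            visited.add(key)
            return key
    return None


data = {"a": 1, "b": 2, "c": 3}
visited = set()

first = next_unvisited(data, visited)
data["d"] = 4            # modify the dictionary between selections
del data[first]          # the selected key can be removed as well
visited.discard(first)   # optional: forget keys that no longer exist

remaining = []
while (key := next_unvisited(data, visited)) is not None:
    remaining.append(key)

pair = take_any(dict(data))   # popitem on a copy, so data stays intact
```

Each call to next_unvisited scans from the start of the dictionary, so the cost grows with the number of visited keys that are still present; that is the pessimal case described above.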

Is there a reason this data needs to be kept in a dictionary only? For instance, if we consider a system where we're shuffling songs, we might not want to visit the whole library but only place a limit on how recently a song has been played. That could be handled more efficiently using a list of songs from which we can read a random index, a set of recently played songs to avoid duplicates, and a queue (perhaps in a list or deque) of those same songs, which lets us update the set in order (removing the oldest entry each iteration). Bear in mind that references are reasonably cheap.
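Roughly, the list, set and queue arrangement could look like the sketch below. The name make_shuffler, the history_size parameter and the sample library are assumptions made for the example, not something from the question.

```python
import random
from collections import deque


def make_shuffler(songs, history_size, rng=random):
    """Return a function that picks songs, avoiding the last history_size picks."""
    if history_size >= len(songs):
        raise ValueError("history_size must be smaller than the number of songs")
    recent_queue = deque()   # recently played songs, oldest first
    recent_set = set()       # the same songs, for O(1) membership tests

    def pick():
        # Retry until we hit a song that was not played recently.
        while True:
            song = songs[rng.randrange(len(songs))]
            if song not in recent_set:
                break
        recent_queue.append(song)
        recent_set.add(song)
        if len(recent_queue) > history_size:
            # Forget the oldest played song so it becomes eligible again.
            recent_set.discard(recent_queue.popleft())
        return song

    return pick


library = [f"song-{i}" for i in range(10)]
pick = make_shuffler(library, history_size=3, rng=random.Random(0))
played = [pick() for _ in range(20)]
```

The retry loop stays cheap as long as the history is small compared with the library; if most songs are in the recent set, it will spin for a while before finding an eligible one.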

Rethinking one more step, we wouldn't need the set to check for duplicates if recently played songs simply aren't among our candidates; by just swapping the oldest played song with the randomly selected next song, both the played and candidate lists stay at a constant size, and no lookups are needed since each song is in only one of the lists.
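A sketch of that swap, under the assumption that the played queue is seeded with a few arbitrary songs so both containers have their final size from the start. The name make_swap_shuffler and its parameters are illustrative.

```python
import random
from collections import deque


def make_swap_shuffler(songs, history_size, rng=random):
    """Keep candidates and recently played songs in two disjoint containers."""
    pool = list(songs)
    if not 0 < history_size < len(pool):
        raise ValueError("history_size must be between 1 and len(songs) - 1")
    rng.shuffle(pool)
    # Assumption: treat the first history_size songs as already played.
    played = deque(pool[:history_size])   # oldest played song on the left
    candidates = pool[history_size:]

    def pick():
        i = rng.randrange(len(candidates))
        song = candidates[i]
        # The oldest played song takes the chosen slot and is eligible again.
        candidates[i] = played.popleft()
        played.append(song)
        return song

    # candidates and played are returned only so they can be inspected.
    return pick, candidates, played


library = [f"song-{i}" for i in range(10)]
pick, candidates, recent = make_swap_shuffler(
    library, history_size=3, rng=random.Random(1)
)
order = [pick() for _ in range(20)]
```

Every pick is constant time: one random index, one list assignment and two deque operations, with no retry loop and no membership test.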

Another idea is to use collections.ChainMap to keep a consistent view into two dictionaries; ones that have been visited and ones that have not. You could then migrate items from the latter to the former by way of popitem, ensuring a readable method of processing everything in the collection while keeping it dictionary-like.

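def getnewitem(chainmap):
    """Move one arbitrary item from the unvisited dict to the visited dict."""
    # maps[0] holds the unvisited items, maps[1] the visited ones.
    # popitem() raises KeyError once maps[0] is empty, i.e. when finished.
    key, value = chainmap.maps[0].popitem()
    chainmap.maps[1][key] = value
    return key, value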

As that means both dictionaries keep changing, it's likely not the fastest overall, but it maintains both a dictionary-like collection and the capability to process all items. It does lose the ability to directly delete items, since ChainMap cannot hide inherited mappings: deleting through the ChainMap only affects the first mapping, so anything already visited has to be removed from its backing dictionary.
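For illustration, here is how that might be used. The helper is repeated so the snippet runs on its own, and the variable names unvisited, visited and items are assumptions for the example.

```python
from collections import ChainMap


def getnewitem(chainmap):
    # Raises KeyError when finished
    key, value = chainmap.maps[0].popitem()
    chainmap.maps[1][key] = value
    return key, value


unvisited = {"a": 1, "b": 2, "c": 3}
visited = {}
items = ChainMap(unvisited, visited)   # maps[0] is unvisited, maps[1] is visited

seen = [getnewitem(items)]
items["d"] = 4                 # writes go to maps[0], so "d" is still unvisited
visited.pop(seen[0][0])        # delete a visited key through its backing dict

try:
    while True:
        seen.append(getnewitem(items))
except KeyError:
    pass                       # maps[0] is empty: everything has been processed
```

Lookups such as items["a"] keep working throughout, whichever of the two dictionaries currently holds the key.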


answered Oct 11 '22 12:10

Yann Vernier
