Efficient Dictionary Searching?

Tags:

I had a quick question about efficiency of searching through large dictionaries in Python. I am reading a large comma-separated file and getting a key and value from each line. If my key is already in the dictionary, I'm adding the value to the value listed in the dictionary, if the key doesn't exist in the dictionary, I simply add the value. Previously I was using this:

if key in data_dict.keys():     add values else:     data_dict[key] = value

This starts pretty fast, but as the dictionary grows it becomes slower and slower, to the point where I can't use it at all. I changed the way I search for the key in the dictionary to this:

try:     # This will fail if key not present     data_dict[keyStr] = input_data[keyStr] + load_val except:     data_dict[keyStr] = load_val

This is infinitely faster, and can read / write over 350,000 lines of code in 3 seconds.

My question was why does the if key in data_dict.keys(): command take sooo much longer than the call to try: data_dict[keyStr]? And why wouldn't Python utilize the try statement when searching for a key in a dictionary?

964

asked Sep 30 '13 21:09

Brumdog22

2 Answers

The problem is that for every test you're generating a new list of keys with .keys(). As the list of keys gets longer, the time required goes up. Also as noted by dckrooney, the search for the key becomes linear instead of taking advantage of the hash-table structure of the dictionary.

Replace with:

if key in data_dict:

168

answered Sep 25 '22 20:09

Mark Ransom

data_dict.keys() returns an unsorted list of keys in the dictionary. Thus each time you check to see if a given key is in the dictionary, you're doing a linear search across the list of keys (an O(n) operation). The longer your list, the longer it takes to search for a given key.

Contrast this to data_dict[keyStr]. This performs a hash lookup, which is an O(1) operation. It doesn't (directly) depend on the number of keys in the dictionary; even as you add more keys, the time to check if a given key is in the dictionary stays constant.

answered Sep 21 '22 20:09

dckrooney

Related questions
                            
                                Assets folder in Android Studio?
                            
                                Return two variables in one function [duplicate]
                            
                                Jenkins time-out while fetching from Github in 10 mins.
                            
                                SKShapeNode - Animate color change
                            
                                Scala-JS for real web project [closed]
                            
                                undefined local variable or method `root_path'
                            
                                How to set the executable name in CMake?
                            
                                Why does AngularJS with ui-router keep firing the $stateChangeStart event?
                            
                                remove needs a query at src/mongo/shell/collection.js
                            
                                python/pip error on osx
                            
                                Iterate over object attributes and modify them
                            
                                Why does my PFAnalytics not have trackAppOpeneWithLaunchOptions function? (IOS SWIFT)

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With