More efficient way to look up dictionary values whose keys start with same prefix

Q: What happens if you look for a key-value pair in a dictionary and that key-value does not exist Python?

If we try to access the value of key that does not exist in the dictionary, then it will raise KeyError. This can also be a way to check if exist in dict or not i.e. Here it confirms that the key 'test' exist in the dictionary.

I have a dictionary whose keys come in sets that share the same prefix, like this:

d = { "key1":"valA", "key123":"valB", "key1XY":"valC",
      "key2":"valD", "key2-22":"valE" }

Given a query string, I need to look up all the values associated with keys that start with that prefix, e.g. for query="key1" I need to get ["valA", "valB", "valC"]

My implementation below works but is too slow for a large number of queries since the dictionary d has about 30,000 keys and most of the keys are more than 20 characters long:

result = [d[s] for s in d.keys() if s.startswith(query)]

Is there a faster/more efficient way to implement this?

Can dictionary have same keys with different values?

Duplicate keys are not allowed. A dictionary maps each key to a corresponding value, so it doesn't make sense to map a particular key more than once.

Can a dictionary have two keys with the same value two values with the same key?

No, each key in a dictionary should be unique. You can't have two keys with the same value. Attempting to use the same key again will just overwrite the previous value stored. If a key needs to store multiple values, then the value associated with the key should be a list or another dictionary.

How do you compare two keys in a dictionary?

The compare method cmp() is used in Python to compare values and keys of two dictionaries. If method returns 0 if both dictionaries are equal, 1 if dic1 > dict2 and -1 if dict1 < dict2.

What happens if you look for a key-value pair in a dictionary and that key-value does not exist Python?

If we try to access the value of key that does not exist in the dictionary, then it will raise KeyError. This can also be a way to check if exist in dict or not i.e. Here it confirms that the key 'test' exist in the dictionary.

You can avoid producing the intermediate list generated by dict.keys() (in python 2.x):

result = [d[key] for key in d if key.startswith(query)]

But you most likely want to use a trie instead of a dictionary, so you can find all the values associated with a key with a common prefix (a trie is similar to a tree based on prefixes).

Here you can find some different implementation of tries.

A trie for keys "A", "to", "tea", "ted", "ten", "i", "in", and "inn". (source wikipedia)

Let's compare the timings for the different solutions:

# create a dictionary with 30k entries
d = {str(x):str(x) for x in xrange(1, 30001)}
query = '108'

# dict with keys()
%timeit [d[s] for s in d.keys() if s.startswith(query)]

    100 loops, best of 3: 8.87 ms per loop

# dict without keys()
%timeit [d[s] for s in d if s.startswith(query)]

    100 loops, best of 3: 7.83 ms per loop

# 11.72% improvement

# PyTrie (https://pypi.python.org/pypi/PyTrie/0.2)
import pytrie
pt = pytrie.Trie(d)

%timeit [pt[s] for s in pt.iterkeys(query)]

    1000 loops, best of 3: 320 µs per loop

# 96.36% improvement

# datrie (https://pypi.python.org/pypi/datrie/0.7)
import datrie
dt = datrie.Trie('0123456789')
for key, val in d.iteritems():
    dt[unicode(key)] = val

%timeit [dt[s] for s in dt.keys(unicode(query))]

    10000 loops, best of 3: 162 µs per loop

# 98.17% improvement

More efficient way to look up dictionary values whose keys start with same prefix

Tags:

performance

python

dictionary

lookup

startswith

I Z

People also ask

1 Answers

enrico.bacis

Recent Activity

Donate For Us

More efficient way to look up dictionary values whose keys start with same prefix

Tags:

performance

python

dictionary

lookup

startswith

I Z

People also ask

1 Answers

enrico.bacis

Related questions

Recent Activity

Donate For Us