Suppose I have a multi-level dictionary like this
mydict = { 'first': { 'second': { 'third': { 'fourth': 'the end' } } } }
I'd like to access it like this
test = get_entry(mydict, 'first.second.third.fourth')
What I have so far is
def get_entry(dict, keyspec):
    keys = keyspec.split('.')
    result = dict[keys[0]]
    for key in keys[1:]:
        result = result[key]
    return result
Are there more efficient ways to do it? According to %timeit, the runtime of the function is 1.26 µs, while accessing the dictionary the standard way like this
foo = mydict['first']['second']['third']['fourth']
takes 541 ns. I'm looking for ways to trim it to the 800 ns range if possible.
Thanks
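For reference, a compact alternative (not from the question above or the answer below; get_entry_reduce is an illustrative name) expresses the same lookup with functools.reduce and operator.getitem instead of an explicit loop:

from functools import reduce
from operator import getitem

def get_entry_reduce(mydict, keyspec):
    # Fold getitem over the split keys: mydict['first']['second']...
    return reduce(getitem, keyspec.split('.'), mydict)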
I got a 20% performance boost by tightening up the code a bit but a whopping 400% increase by using a cache for split strings. That only makes a difference if you use the same spec multiple times. Here are sample implementations and a profile script to test.
test.py
mydict = { 'first': { 'second': { 'third': { 'fourth': 'the end' } } } }

# original
def get_entry(dict, keyspec):
    keys = keyspec.split('.')
    result = dict[keys[0]]
    for key in keys[1:]:
        result = result[key]
    return result

# tighten up code
def get_entry_2(mydict, keyspec):
    for key in keyspec.split('.'):
        mydict = mydict[key]
    return mydict

# use a cache
cache = {}

def get_entry_3(mydict, keyspec):
    global cache
    try:
        spec = cache[keyspec]
    except KeyError:
        spec = tuple(keyspec.split('.'))
        cache[keyspec] = spec
    for key in spec:
        mydict = mydict[key]
    return mydict

if __name__ == "__main__":
    test = get_entry(mydict, 'first.second.third.fourth')
    print(test)
profile.py
from timeit import timeit

print("original get_entry")
print(timeit("get_entry(mydict, 'first.second.third.fourth')",
             setup="from test import get_entry, mydict"))

print("get_entry_2 with tighter code")
print(timeit("get_entry_2(mydict, 'first.second.third.fourth')",
             setup="from test import get_entry_2, mydict"))

print("get_entry_3 with cache of split spec")
print(timeit("get_entry_3(mydict, 'first.second.third.fourth')",
             setup="from test import get_entry_3, mydict"))

print("just splitting a spec")
print(timeit("x.split('.')", setup="x='first.second.third.fourth'"))
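A note on reading the output: timeit.timeit runs the statement 1,000,000 times by default, so each total printed by profile.py is roughly the per-call time in microseconds. A small sketch (not part of profile.py) converting one total to nanoseconds per call:

from timeit import timeit

# timeit uses number=1000000 by default, so divide the total by 1e6 for
# seconds per call, then multiply by 1e9 for nanoseconds (i.e. * 1000).
total = timeit("get_entry_3(mydict, 'first.second.third.fourth')",
               setup="from test import get_entry_3, mydict")
print(round(total * 1000), "ns per call")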
The timing on my machine is
original get_entry
4.148535753000033
get_entry_2 with tighter code
3.2986323120003362
get_entry_3 with cache of split spec
1.3073233439990872
just splitting a spec
1.0949148639992927
Notice that splitting the spec is a comparatively expensive operation for this function. That's why caching helps.
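If you'd rather not manage the cache dict by hand, here is a minimal sketch of the same caching idea using functools.lru_cache to memoize only the split step (split_spec and get_entry_4 are illustrative names, not part of the code above):

from functools import lru_cache

@lru_cache(maxsize=None)
def split_spec(keyspec):
    # Memoize the tuple of keys so repeated specs skip str.split entirely.
    return tuple(keyspec.split('.'))

def get_entry_4(mydict, keyspec):
    for key in split_spec(keyspec):
        mydict = mydict[key]
    return mydict

The trade-off is the same as with the global cache: it only pays off when the same keyspec is looked up repeatedly.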