Let's say I've got a nested dictionary of the form:
{'geo': {'bgcolor': 'white','lakecolor': 'white','caxis': {'gridcolor': 'white', 'linecolor': 'white',}},
'title': {'x': 0.05},
'yaxis': {'automargin': True,'linecolor': 'white','zerolinecolor': 'white','zerolinewidth': 2}
}
How can you work your way through that dict and make a list of each complete key path that contains the value 'white'
?
Using a function defined by user jfs in the post Search for a value in a nested dictionary python lets you check whether or not 'white'
occurs at least one time and also returns the path:
# dictionary
d={'geo': {'bgcolor': 'white','lakecolor': 'white','caxis': {'gridcolor': 'white', 'linecolor': 'white',}},
'title': {'x': 0.05},
'yaxis': {'automargin': True,'linecolor': 'white','ticks': '','zerolinecolor': 'white','zerolinewidth': 2}
}
# function:
def getpath(nested_dict, value, prepath=()):
for k, v in nested_dict.items():
path = prepath + (k,)
if v == value: # found value
return path
elif hasattr(v, 'items'): # v is a dict
p = getpath(v, value, path) # recursive call
if p is not None:
return p
getpath(d,'white')
# out:
('geo', 'bgcolor')
But 'white' occurs other places too, like in :
1. d['geo']['lakecolor']
2: d['geo']['caxis']['gridcolor']
3: d['yaxis']['linecolor']
How can I make sure that the function finds all paths?
I've tried applying the function above until it returns none
while eliminating found paths one by one, but that quickly turned into an ugly mess.
Thank you for any suggestions!
This is a perfect use case to write a generator:
def find_paths(haystack, needle):
if haystack == needle:
yield ()
if not isinstance(haystack, dict):
return
for key, val in haystack.items():
for subpath in find_paths(val, needle):
yield (key, *subpath)
You can use it as follows:
d = {
'geo': {'bgcolor': 'white','lakecolor': 'white','caxis': {'gridcolor': 'white', 'linecolor': 'white',}},
'title': {'x': 0.05},
'yaxis': {'automargin': True,'linecolor': 'white','ticks': '','zerolinecolor': 'white','zerolinewidth': 2}
}
# you can iterate over the paths directly...
for path in find_paths(d, 'white'):
print('found at path: ', path)
# ...or you can collect them into a list:
paths = list(find_paths(d, 'white'))
print('found at paths: ' + repr(paths))
The generator approach has the advantage that it doesn't need to create an object to keep all paths in memory at once; they can be processed one by one and immediately discarded. In this case, the memory savings would be rather modest, but in others they may be significant. Also, if a loop iterating over a generator is terminated early, the generator is not going to keep searching for more paths that would be later discarded anyway.
just transform your function so it returns a list
and don't return
when something is found. Just add to/extend the list
def getpath(nested_dict, value, prepath=()):
p = []
for k, v in nested_dict.items():
path = prepath + (k,)
if v == value: # found value
p.append(path)
elif hasattr(v, 'items'): # v is a dict
p += getpath(v, value, path) # recursive call
return p
with your input data, this produces (order may vary depending on python versions where dictionaries are unordered):
[('yaxis', 'linecolor'), ('yaxis', 'zerolinecolor'), ('geo', 'lakecolor'),
('geo', 'caxis', 'linecolor'), ('geo', 'caxis', 'gridcolor'), ('geo', 'bgcolor')]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With