Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dictionary: How to list every key path that contains a certain value?

Let's say I've got a nested dictionary of the form:

{'geo': {'bgcolor': 'white','lakecolor': 'white','caxis': {'gridcolor': 'white', 'linecolor': 'white',}},
 'title': {'x': 0.05},
 'yaxis': {'automargin': True,'linecolor': 'white','zerolinecolor': 'white','zerolinewidth': 2}
 }

How can you work your way through that dict and make a list of each complete key path that contains the value 'white'? Using a function defined by user jfs in the post Search for a value in a nested dictionary python lets you check whether or not 'white' occurs at least one time and also returns the path:

# dictionary
    d={'geo': {'bgcolor': 'white','lakecolor': 'white','caxis': {'gridcolor': 'white', 'linecolor': 'white',}},
    'title': {'x': 0.05},
    'yaxis': {'automargin': True,'linecolor': 'white','ticks': '','zerolinecolor': 'white','zerolinewidth': 2}
  }

# function:
def getpath(nested_dict, value, prepath=()):
    for k, v in nested_dict.items():
        path = prepath + (k,)
        if v == value: # found value
            return path
        elif hasattr(v, 'items'): # v is a dict
            p = getpath(v, value, path) # recursive call
            if p is not None:
                return p
getpath(d,'white')

# out:
('geo', 'bgcolor')

But 'white' occurs other places too, like in :

1. d['geo']['lakecolor']

2: d['geo']['caxis']['gridcolor']

3: d['yaxis']['linecolor']

How can I make sure that the function finds all paths?

I've tried applying the function above until it returns none while eliminating found paths one by one, but that quickly turned into an ugly mess.

Thank you for any suggestions!

like image 430
vestland Avatar asked Oct 18 '25 13:10

vestland


2 Answers

This is a perfect use case to write a generator:

def find_paths(haystack, needle):
    if haystack == needle:
        yield ()
    if not isinstance(haystack, dict):
        return
    for key, val in haystack.items():
        for subpath in find_paths(val, needle):
            yield (key, *subpath)

You can use it as follows:

d = {
    'geo': {'bgcolor': 'white','lakecolor': 'white','caxis': {'gridcolor': 'white', 'linecolor': 'white',}},
    'title': {'x': 0.05},
    'yaxis': {'automargin': True,'linecolor': 'white','ticks': '','zerolinecolor': 'white','zerolinewidth': 2}
}

# you can iterate over the paths directly...
for path in find_paths(d, 'white'):
    print('found at path: ', path)

# ...or you can collect them into a list:
paths = list(find_paths(d, 'white'))
print('found at paths: ' + repr(paths))

The generator approach has the advantage that it doesn't need to create an object to keep all paths in memory at once; they can be processed one by one and immediately discarded. In this case, the memory savings would be rather modest, but in others they may be significant. Also, if a loop iterating over a generator is terminated early, the generator is not going to keep searching for more paths that would be later discarded anyway.

like image 85
user3840170 Avatar answered Oct 20 '25 03:10

user3840170


just transform your function so it returns a list and don't return when something is found. Just add to/extend the list

def getpath(nested_dict, value, prepath=()):
    p = []
    for k, v in nested_dict.items():
        path = prepath + (k,)
        if v == value: # found value
            p.append(path)
        elif hasattr(v, 'items'): # v is a dict
            p += getpath(v, value, path) # recursive call
    return p

with your input data, this produces (order may vary depending on python versions where dictionaries are unordered):

[('yaxis', 'linecolor'), ('yaxis', 'zerolinecolor'), ('geo', 'lakecolor'), 
('geo', 'caxis', 'linecolor'), ('geo', 'caxis', 'gridcolor'), ('geo', 'bgcolor')]
like image 42
Jean-François Fabre Avatar answered Oct 20 '25 02:10

Jean-François Fabre



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!