I have a JSON object in Python represented as nested lists and dictionaries. (Some of the dictionary values are dictionaries themselves, and so on.)
I want to be able to search for a key on all branches of this nested dictionary structure.
When I find the key, I want to return the full key path that leads to it.
For example: I'm looking for "special agents" who have a "special address" key, but not all special agents have it, and those that do have it at inconsistent paths in their JSON.
So I search for the key Special Address code.
The result should return:
/'People'/'SpecialAgents'/'007'/'Special Address code'/
That way I can reach its value like this:
json_obj['People']['SpecialAgents']['007']['Special Address code']
Note that this is similar to this question but I need the full path to each instance of the key found.
You need a recursive search.
You can define a function to deeply search in your input json:
def find_in_obj(obj, condition, path=None):
    if path is None:
        path = []

    # In case this is a list
    if isinstance(obj, list):
        for index, value in enumerate(obj):
            new_path = list(path)
            new_path.append(index)
            for result in find_in_obj(value, condition, path=new_path):
                yield result

    # In case this is a dictionary
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_path = list(path)
            new_path.append(key)
            for result in find_in_obj(value, condition, path=new_path):
                yield result

            if condition == key:
                new_path = list(path)
                new_path.append(key)
                yield new_path
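On Python 3.3 or later, the same traversal can be written more compactly with `yield from`. The following is only a sketch: the name `find_key_paths` and the `agents` sample data are made up for illustration and modelled loosely on the question. Unlike the function above, it reports a matching key before descending into that key's value, so the order of results can differ.

```python
def find_key_paths(obj, target, path=()):
    """Yield a tuple of keys/indexes for every occurrence of the key `target`."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                yield path + (key,)
            # Keep searching below this key, whether or not it matched.
            yield from find_key_paths(value, target, path + (key,))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from find_key_paths(value, target, path + (index,))


# Hypothetical sample data shaped like the question's description.
agents = {
    "People": {
        "SpecialAgents": {
            "007": {"Special Address code": "X1"},
            "006": {"Details": [{"Special Address code": "Y2"}]},
            "005": {"Name": "unknown"},
        }
    }
}

paths = list(find_key_paths(agents, "Special Address code"))
for p in paths:
    print(p)
```

Using tuples for the path avoids copying a list at every level, and each yielded path is immutable, so callers cannot accidentally modify one another's results.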
We can then use the example JSON in this similar SO question to test the recursive search:
In [15]: my_json = { "id" : "abcde",
   ....:   "key1" : "blah",
   ....:   "key2" : "blah blah",
   ....:   "nestedlist" : [
   ....:     { "id" : "qwerty",
   ....:       "nestednestedlist" : [
   ....:         { "id" : "xyz",
   ....:           "keyA" : "blah blah blah" },
   ....:         { "id" : "fghi",
   ....:           "keyZ" : "blah blah blah" }],
   ....:       "anothernestednestedlist" : [
   ....:         { "id" : "asdf",
   ....:           "keyQ" : "blah blah" },
   ....:         { "id" : "yuiop",
   ....:           "keyW" : "blah" }] } ] }
Let's find every instance of the key 'id' and return the full path that gets us there. The results come out in dictionary iteration order; the order below is what Python 3.7+ produces, where dictionaries keep insertion order:
In [16]: for item in find_in_obj(my_json, 'id'):
   ....:     print(item)
   ....:
['id']
['nestedlist', 0, 'id']
['nestedlist', 0, 'nestednestedlist', 0, 'id']
['nestedlist', 0, 'nestednestedlist', 1, 'id']
['nestedlist', 0, 'anothernestednestedlist', 0, 'id']
['nestedlist', 0, 'anothernestednestedlist', 1, 'id']
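Once you have a path, you can follow it back into the object or render it in the slash-separated form the question asks for. A small sketch, assuming two hypothetical helpers, `get_by_path` and `format_path`, and a trimmed-down `data` sample (none of these are part of the answer above):

```python
from functools import reduce
from operator import getitem


def get_by_path(obj, path):
    """Follow a sequence of keys/indexes into nested dicts and lists."""
    return reduce(getitem, path, obj)


def format_path(path):
    """Render a path in the /'a'/'b'/ style used in the question."""
    return "/" + "/".join(repr(step) for step in path) + "/"


# Trimmed-down illustrative data, not the full example above.
data = {"nestedlist": [{"id": "qwerty", "nestednestedlist": [{"id": "xyz"}]}]}
path = ["nestedlist", 0, "nestednestedlist", 0, "id"]

value = get_by_path(data, path)
print(value)
print(format_path(path))
```

`get_by_path(data, path)` is equivalent to writing `data['nestedlist'][0]['nestednestedlist'][0]['id']` by hand, and it raises `KeyError` or `IndexError` if the path does not exist in the object.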