So I have a dictionary:
{'a': {'b': {'c': 'd', 'e': 'f'}}}
I need to create a dictionary as follows:
{'c':'d', 'e','f'}
It can go deeper up to any level but I should always get the key value pair at the maximum depth. So I wrote a function:
def boil_down_array(key, data):
if type(data) == dict:
for key, item in data.items():
boil_down_array(key, item)
else:
yield {key:data}
Now the problem is, once it goes in recursion, the yield
is lost. How do I yield that dictionary again? All I get is a generator which is not what I want.
Yes you can have recursive generators.
The Yield keyword in Python is similar to a return statement used for returning values or objects in Python. However, there is a slight difference. The yield statement returns a generator object to the one who calls the function which contains yield, instead of simply returning a value.
Recursive Functions in Python A recursive function is a function defined in terms of itself via self-referential expressions. This means that the function will continue to call itself and repeat its behavior until some condition is met to return a result.
In the above example, factorial() is a recursive function as it calls itself. When we call this function with a positive integer, it will recursively call itself by decreasing the number. Each function multiplies the number with the factorial of the number below it until it is equal to one.
Use yield from
with your recursive call, otherwise you are just ignoring the result of the recursive call:
def boil_down_array(key, data):
if type(data) == dict:
for key, item in data.items():
yield from boil_down_array(key, item)
else:
yield {key: data}
This is only available in Python > 3.3, but essentially just a short hand for simply yielding from an extra loop:
for key, item in data.items():
for x in boil_down_array(key, item): # just exhaust the recursive generator
yield x # and "re-yield" what it produces
In order to achieve your desired data structure, you might be better off yielding pairs instead of dicts
, so you can transform the result more easily to the resulting dict
:
yield key, data
Then you can use it like:
result = dict(boil_down_array(None, input_dict))
An even simpler recursive approach will just return a complete dict
:
def boil_down_nested(dct):
result = {}
for k, v in dct.items():
if isinstance(v, dict):
result.update(boil_down_nested(v))
else:
result[k] = v
return result
You are ignoring the generator object that the recursive call produces:
for key, item in data.items():
boil_down_array(key, item) # creates a generator object
so the recursive call is not actually executed (the code in your generator is never executed for that call).
You want to use yield from
to delegate iteration to that call:
for key, item in data.items():
yield from boil_down_array(key, item)
yield from
moves control from the current generator to the iterator that the expression after yield from
produces; here that's your recursive generator.
yield from
requires Python 3.3 or newer. If you are using Python 2 or an older Python 3 release, you can also add another loop to explicitly yield each result produced by iteration:
for key, item in data.items():
for result in boil_down_array(key, item):
yield result
I'd also use isinstance(data, dict)
rather than use type(...) ==
, to allow for subclasses:
def boil_down_array(key, data):
if isinstance(data, dict):
for key, item in data.items():
yield from boil_down_array(key, item)
else:
yield {key: data}
Note that your code doesn't actually produce a dictionary as output. It produces an iterable of single key-value dictionaries:
>>> d = {'a': {'b': {'c': 'd', 'e': 'f'}}}
>>> list(boil_down_array('v', d))
[{'c': 'd'}, {'e': 'f'}]
The key
argument from the outermost call is redundant here too, as you replace it with the key of the current iteration.
If you do need to stick with a generator function, then at least produce (key, value)
tuples and don't bother with recursing when the value is not a dictionary (so test before you recurse), to remove the need to pass along the key; the remaining data
argument is now assumed to be a dictionary, always:
def boil_down_nested(data):
for key, value in data.items():
if isinstance(value, dict):
yield from boil_down_nested(value)
else:
yield (key, value)
and use dict(boil_down_nested(input_dict))
to produce a new dictionary from the key-value tuples the generator now outputs:
>>> next(boil_down_nested(d)) # first resulting key-value pair
('c', 'd')
>>> dict(boil_down_nested(d)) # all key-value pairs into a dictionary.
{'c': 'd', 'e': 'f'}
Without recursion, you can use a stack to track nested dictionaries still to process; this makes it much easier to just output a dictionary as a result directly:
def boil_down_nested_dict(d):
stack = [d]
output = {}
while stack:
for key, value in stack.pop().items():
if isinstance(value, dict):
stack.append(value) # process this value next
else:
output[key] = value
return output
No separate dict()
call required anymore:
>>> boil_down_nested_dict(d)
{'c': 'd', 'e': 'f'}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With