I receive from another script a dictionary containing various types, in particular other dictionaries or lists that might contain other dictionaries as values.
Now what I want to do is create a single flat dictionary. Keys might be present multiple times within the encapsulated dictionaries. For me the inner most key holds the newest information, so I think dict.update
is the right routine to apply when digesting a 'inner' dict. By 'inner' dict I mean a dictionary withing some value of the outermost dictionary.
Now, I understand how to flatten a dictionary by 1 level. What I struggle with to to flatten it by arbitrarily many levels.
A simple example example of the type of dictionary I'm dealing with is:
d = {1: {6: {7: {2: {'a'}}}}, 2: 'b', 3: {4: {2: 'c'}}, 5: ['a', 'b', {1: 'a'}]}
My attempt works ok for a single level of depth:
dd = dict()
for k, v in d.items():
if isinstance(v, dict):
dd.update(v)
elif isinstance(v, list):
for el in v:
if isinstance(el, dict):
dd.update(el)
dd[k] = [el for el in v if not isinstance(el, dict)]
else:
dd[k] = v
This gives me:
Out[56]: {6: {7: {2: {'a'}}}, 2: 'b', 4: {2: 'c'}, 1: 'a', 5: ['a', 'b']}
What it should give is:
{2: 'a', 5: ['a', 'b']}
Note the value of the key 2
: 'c'
and not (as I get now) 'b'
. This should be because the inner-most value for the key 2
is 'c'
and not 'b'
.
I'm not just looking to get a functioning code (although this would allow me to continue working) but I'd like to understand how such a problem is tackled in python. I have to admit that I'm a little lost here...
Any help is greatly appreciated!
You can use recursion with a generator and keep a counter to determine the depth:
d = {1: {6: {7: {2: {'a'}}}}, 2: 'b', 3: {4: {2: 'c'}}, 5: ['a', 'b', {1: 'a'}]}
def flatten(_d, _depth = 0):
for a, b in _d.items():
if isinstance(b, list):
yield [a, [i for i in b if not isinstance(i, dict)], _depth]
for c in b:
if isinstance(c, dict):
yield from flatten(c, _depth+1)
elif isinstance(b, dict):
yield from flatten(b, _depth+1)
else:
yield [a, b, _depth]
_result = {}
for a, b, c in flatten(d):
if a not in _result:
_result[a] = [b, c]
else:
if _result[a][-1] < c:
_result[a] = [b, c]
print({a:b for a, [b, c] in _result.items()})
Output:
{2: {'a'}, 5: ['a', 'b'], 1: 'a'}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With