
Flatten nested Python dictionaries, compressing keys, and recursing into sub-lists with dicts

I've been using imran's great answer for flattening nested Python dictionaries with compressed keys, and I'm trying to think of a way to also flatten the dictionaries that may sit inside list values of the dictionary items.
(Of course, since my data usually comes from XML, this nesting can also be recursive...)

from pprint import pprint
# the collections.MutableMapping alias was removed in Python 3.10; use collections.abc
from collections.abc import MutableMapping

def flatten(d, parent_key='', sep='_'):
    items = []
    for k, v in d.items():
        new_key = parent_key + sep + k if parent_key else k
        if isinstance(v, MutableMapping):
            items.extend(flatten(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

Given a dict d like this:

d = {"a": 1,
     "b": 2,
     "c": {"sub-a": "one",
           "sub-b": "two",
           "sub-c": "thre"}}

this works great:

pprint(flatten(d))

{'a': 1,
 'b': 2,
 'c_sub-a': 'one',
 'c_sub-b': 'two',
 'c_sub-c': 'thre'}

However, I would also like to recurse into list values of the dict items and check whether each dict in the list can be flattened further.

Here's a sample input with c-list as a nested list value:

d = {"a": 1,
     "b": 2,
     "c-list": [
         {"id": 1, "nested": {"sub-a": "one", "sub-b": "two", "sub-c": "thre"} },
         {"id": 2, "nested": {"sub-a": "one", "sub-b": "two", "sub-c": "thre"} },
         {"id": 3, "nested": {"sub-a": "one", "sub-b": "two", "sub-c": "thre"} }]}

Here's what I currently get with the function above:

pprint(flatten(d))

{'a': 1,
 'b': 2,
 'c-list': [{'id': 1, 'nested': {'sub-a': 'one', 'sub-b': 'two', 'sub-c': 'thre'}},
            {'id': 2, 'nested': {'sub-a': 'one', 'sub-b': 'two', 'sub-c': 'thre'}},
            {'id': 3, 'nested': {'sub-a': 'one', 'sub-b': 'two', 'sub-c': 'thre'}}]}

Below is the output I'm looking for, retaining all the functionality of the original flatten():

{'a': 1,
 'b': 2,
 'c-list': [{'id': 1, 'nested_sub-a': 'one', 'nested_sub-b': 'two', 'nested_sub-c': 'thre'},
            {'id': 2, 'nested_sub-a': 'one', 'nested_sub-b': 'two', 'nested_sub-c': 'thre'},
            {'id': 3, 'nested_sub-a': 'one', 'nested_sub-b': 'two', 'nested_sub-c': 'thre'}]}

I'm struggling to figure out how to recursively "re-assemble" the dict into this when it contains lists... any tips appreciated.

joefromct asked Oct 18 '22 02:10


1 Answer

You were really close. If a value is a list, a single extra line gives you a recursive version of flatten:

items.append((new_key, map(flatten, v)))  # for python 2.x
# or
items.append((new_key, list(map(flatten, v))))  # for python 3.x

So you simply call the function recursively on each element of the list.
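The list(...) wrapper in the Python 3 line matters because map returns a lazy iterator there rather than a list. A minimal sketch of the difference, assuming Python 3; describe is a hypothetical stand-in for flatten and is not part of the original code:

```python
def describe(item):
    # Hypothetical stand-in for flatten(): returns a small dict per element.
    return {'value': item}

lazy = map(describe, [1, 2])          # Python 3: a map object, not a list
eager = list(map(describe, [1, 2]))   # materialized into a real list

print(isinstance(lazy, list))   # False on Python 3
print(eager)                    # [{'value': 1}, {'value': 2}]
```

Without the wrapper, the flattened dict would hold a map object under the list's key instead of the flattened dicts themselves.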

Here is what flatten would then look like:

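# Python 3.3+; on Python 2 use "from collections import MutableMapping"
from collections.abc import MutableMapping

def flatten(d, parent_key='', sep='_'):
    items = []
    for k, v in d.items():
        new_key = '{0}{1}{2}'.format(parent_key, sep, k) if parent_key else k
        if isinstance(v, MutableMapping):
            # nested dict: merge its flattened items under the compound key
            items.extend(flatten(v, new_key, sep=sep).items())
        elif isinstance(v, list):
            # apply itself to each element of the list - that's it!
            # list(...) materializes the result on Python 3, where map is lazy
            items.append((new_key, list(map(flatten, v))))
        else:
            items.append((new_key, v))
    return dict(items)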

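This solution can cope with lists of dicts nested at arbitrary depth (a dict in a list in a dict, and so on). It assumes every list element is a dict: a list of scalars, or a list directly inside another list, would fail because flatten calls .items() on each element.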
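If the lists may also contain scalars or other lists, one possible variant dispatches on the type of each value instead of assuming dicts. The following is only a sketch under that assumption: flatten_deep is a hypothetical name, not part of the original answer, and it also passes sep down into list elements, which the one-line map(flatten, v) version does not.

```python
from collections.abc import Mapping

def flatten_deep(value, parent_key='', sep='_'):
    """Flatten nested dicts; recurse into lists, leaving other values as-is."""
    if isinstance(value, Mapping):
        items = {}
        for k, v in value.items():
            new_key = '{0}{1}{2}'.format(parent_key, sep, k) if parent_key else k
            if isinstance(v, Mapping):
                # nested dict: merge its items under the compound key
                items.update(flatten_deep(v, new_key, sep=sep))
            else:
                # list or scalar: keep it under this key, flattening inside lists
                items[new_key] = flatten_deep(v, sep=sep)
        return items
    if isinstance(value, list):
        # each element starts with a fresh (empty) parent key
        return [flatten_deep(item, sep=sep) for item in value]
    return value  # scalars pass through unchanged

d = {"a": 1,
     "b": 2,
     "c-list": [
         {"id": 1, "nested": {"sub-a": "one", "sub-b": "two", "sub-c": "thre"}},
         {"id": 2, "nested": {"sub-a": "one", "sub-b": "two", "sub-c": "thre"}}]}

result = flatten_deep(d)
mixed = flatten_deep({"x": [1, [{"y": {"z": 2}}], "s"]})
```

Here result has the shape asked for in the question (each element of c-list becomes a flat dict with nested_sub-a style keys), and mixed shows that scalars and lists inside lists are left in place while any dicts within them are still flattened.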

jojo answered Oct 21 '22 00:10