Convert redundant array to dict (or JSON)?

Question

Suppose I have an array:

[['a', 10, 1, 0.1],
 ['a', 10, 2, 0.2],
 ['a', 20, 2, 0.3],
 ['b', 10, 1, 0.4],
 ['b', 20, 2, 0.5]]

And I want a dict (or JSON):

{
    'a': {
        10: {1: 0.1, 2: 0.2},
        20: {2: 0.3}
    },
    'b': {
        10: {1: 0.4},
        20: {2: 0.5}
    }
}

Is there any good way or some library for this task?
In this example the array has only 4 columns, but my original array is more complicated (7 columns).

Currently I implement this naively:

import pandas as pd
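# Column names are placeholders; without them groupby('column1') raises a KeyError
df = pd.DataFrame(array, columns=['column1', 'column2', 'column3', 'value'])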
grouped1 = df.groupby('column1')
for column1 in grouped1.groups:
    group1 = grouped1.get_group(column1)
    grouped2 = group1.groupby('column2')
    for column2 in grouped2.groups:
        group2 = grouped2.get_group(column2)
        ...

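And the defaultdict way:

from collections import defaultdict

# The default factory is called with no arguments, so the lambdas take none
d = defaultdict(lambda: defaultdict(lambda: defaultdict ... ))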
for row in array:
    d[row[0]][row[1]][row[2]... = row[-1]

But I don't think either approach is elegant.
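
For reference, the defaultdict idea can be made to work for any number of columns with a self-referencing factory. This is only a minimal sketch: `tree` and `to_plain_dict` are hypothetical helper names, not part of the original post, and it assumes the last column is the value and every other column is a key.

```python
from collections import defaultdict


def tree():
    # Any missing key creates another nested tree on first access.
    return defaultdict(tree)


def to_plain_dict(node):
    # Recursively convert defaultdicts back into ordinary dicts.
    if isinstance(node, defaultdict):
        return {key: to_plain_dict(value) for key, value in node.items()}
    return node


array = [['a', 10, 1, 0.1],
         ['a', 10, 2, 0.2],
         ['a', 20, 2, 0.3],
         ['b', 10, 1, 0.4],
         ['b', 20, 2, 0.5]]

d = tree()
for row in array:
    # Assumption: the last column is the value, the one before it the innermost key.
    *keys, last_key, value = row
    node = d
    for key in keys:
        node = node[key]
    node[last_key] = value

nested = to_plain_dict(d)
print(nested)
```

Converting back to plain dicts at the end avoids surprises later, since reading a missing key from a defaultdict silently creates it.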

deceze · Accepted Answer

I would suggest this rather simple solution:

from functools import reduce

data = [['a', 10, 1, 0.1],
        ['a', 10, 2, 0.2],
        ['a', 20, 2, 0.3],
        ['b', 10, 1, 0.4],
        ['b', 20, 2, 0.5]]

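result = dict()
for row in data:
    # Walk down (creating as needed) one nested dict per key column,
    # then store the last column under the second-to-last column.
    reduce(lambda v, k: v.setdefault(k, {}), row[:-2], result)[row[-2]] = row[-1]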

print(result)

{'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
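
If the reduce one-liner is hard to read, the same walk can be written as an explicit loop. This is just an unrolled sketch of the idea above; the names `path`, `last_key`, and `node` are chosen here for illustration.

```python
data = [['a', 10, 1, 0.1],
        ['a', 10, 2, 0.2],
        ['a', 20, 2, 0.3],
        ['b', 10, 1, 0.4],
        ['b', 20, 2, 0.5]]

result = {}
for row in data:
    # Split the row into outer keys, the innermost key, and the value.
    *path, last_key, value = row
    node = result
    for key in path:
        # setdefault returns the existing child dict or inserts a new one.
        node = node.setdefault(key, {})
    node[last_key] = value

print(result)
```

Both versions work for any number of columns (two or more), because only the last two positions are treated specially.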

An actual recursive solution would be something like this:

def add_to_group(keys: list, group: dict):
    if len(keys) == 2:
        group[keys[0]] = keys[1]
    else:
        add_to_group(keys[1:], group.setdefault(keys[0], dict()))

result = dict()
for row in data:
    add_to_group(row, result)

print(result)
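
Since the question also mentions JSON: the nested dict can be passed straight to the standard json module, with one caveat worth knowing. JSON object keys are always strings, so the integer keys come back as strings after a round trip. A small sketch, with the `result` dict written out literally so it runs on its own:

```python
import json

result = {'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}},
          'b': {10: {1: 0.4}, 20: {2: 0.5}}}

# json.dumps converts the integer keys to strings.
text = json.dumps(result, indent=4)
print(text)

# Loading the text back yields string keys, not the original integers.
restored = json.loads(text)
print(restored)
```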

Hai Vu · Answer

Introduction

Here is a recursive solution. The base case is a list of 2-element lists (or tuples), in which case dict() already does what we want:

>>> dict([(1, 0.1), (2, 0.2)])
{1: 0.1, 2: 0.2}

For other cases, we will remove the first column and recurse down until we get to the base case.

The code:

from itertools import groupby

def rows2dict(rows):
    if len(rows[0]) == 2:
        # e.g. [(1, 0.1), (2, 0.2)] ==> {1: 0.1, 2: 0.2}
        return dict(rows)
    else:
        dict_object = dict()
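        # groupby only merges adjacent rows, so rows must already be
        # sorted (or at least clustered) by the first column
        for column1, grouped_rows in groupby(rows, lambda x: x[0]):
            rows_without_first_column = [x[1:] for x in grouped_rows]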
            dict_object[column1] = rows2dict(rows_without_first_column)
        return dict_object

if __name__ == '__main__':
    rows = [['a', 10, 1, 0.1],
            ['a', 10, 2, 0.2],
            ['a', 20, 2, 0.3],
            ['b', 10, 1, 0.4],
            ['b', 20, 2, 0.5]]
    dict_object = rows2dict(rows)
    print(dict_object)

Output

{'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}

Notes

We use itertools.groupby to group rows that share the same first column. It only groups adjacent rows, so the rows must already be sorted by their key columns.
For each group of rows, we remove the first column and recurse.
This solution assumes that every row has 2 or more columns. The result is unpredictable for rows that have 0 or 1 column.
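
To illustrate the ordering requirement, here is a small sketch showing what groupby does with unsorted input and how sorting on the key columns first avoids the problem. The sample rows and variable names are made up for this illustration.

```python
from itertools import groupby
from operator import itemgetter

unsorted_rows = [['a', 10, 1, 0.1],
                 ['b', 10, 1, 0.4],
                 ['a', 10, 2, 0.2]]

# Without sorting, 'a' shows up as two separate groups, and in rows2dict
# the second group would overwrite the first.
keys_unsorted = [key for key, _ in groupby(unsorted_rows, key=itemgetter(0))]
print(keys_unsorted)

# Sorting on every key column (all but the last) makes equal keys
# adjacent at every nesting level.
sorted_rows = sorted(unsorted_rows, key=lambda row: row[:-1])
keys_sorted = [key for key, _ in groupby(sorted_rows, key=itemgetter(0))]
print(keys_sorted)
```

Sorting this way assumes the values within each key column are mutually comparable, for example all strings or all integers.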

Tags: python, json, arrays, list

keisuke
