 

Nested Dictionary to MultiIndex pandas DataFrame (3 level)

I would like to do the equivalent of the following question, but for a 3-level nested dictionary:

Nested dictionary to multiindex dataframe where dictionary keys are column labels

Asked by baconwichsand on May 21 '15 at 21:05


1 Answer

Using an example three-level dict:

In [1]: import pandas as pd

In [2]: dictionary = {'A': {'a': {1: [2,3,4,5,6],
   ...:                           2: [2,3,4,5,6]},
   ...:                     'b': {1: [2,3,4,5,6],
   ...:                           2: [2,3,4,5,6]}},
   ...:               'B': {'a': {1: [2,3,4,5,6],
   ...:                           2: [2,3,4,5,6]},
   ...:                     'b': {1: [2,3,4,5,6],
   ...:                           2: [2,3,4,5,6]}}}

The following dictionary comprehension, based on the one from the question you linked, flattens it into a dict keyed by tuples of the three levels' keys:

In [3]: reform = {(level1_key, level2_key, level3_key): values
   ...:           for level1_key, level2_dict in dictionary.items()
   ...:           for level2_key, level3_dict in level2_dict.items()
   ...:           for level3_key, values      in level3_dict.items()}

Which gives

In [4]: reform
Out[4]:
{('A', 'a', 1): [2, 3, 4, 5, 6],
 ('A', 'a', 2): [2, 3, 4, 5, 6],
 ('A', 'b', 1): [2, 3, 4, 5, 6],
 ('A', 'b', 2): [2, 3, 4, 5, 6],
 ('B', 'a', 1): [2, 3, 4, 5, 6],
 ('B', 'a', 2): [2, 3, 4, 5, 6],
 ('B', 'b', 1): [2, 3, 4, 5, 6],
 ('B', 'b', 2): [2, 3, 4, 5, 6]}

Passing this dict to pd.DataFrame turns the tuple keys into MultiIndex columns:

In [5]: pd.DataFrame(reform)
Out[5]:
   A           B
   a     b     a     b
   1  2  1  2  1  2  1  2
0  2  2  2  2  2  2  2  2
1  3  3  3  3  3  3  3  3
2  4  4  4  4  4  4  4  4
3  5  5  5  5  5  5  5  5
4  6  6  6  6  6  6  6  6

Transposing moves the keys into a MultiIndex on the rows instead:

In [6]: df = pd.DataFrame(reform).T; df
Out[6]:
       0  1  2  3  4
A a 1  2  3  4  5  6
    2  2  3  4  5  6
  b 1  2  3  4  5  6
    2  2  3  4  5  6
B a 1  2  3  4  5  6
    2  2  3  4  5  6
  b 1  2  3  4  5  6
    2  2  3  4  5  6

As you can see, you can easily increase the number of levels by adding another for clause to the comprehension and another key to the tuple.
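
If the depth is not known in advance, the same idea can be sketched as a small recursive generator instead of one for clause per level. The helper name flatten and the shortened sample data are my own, not from the question, and the sketch assumes every leaf sits at the same depth so that all key tuples have the same length:

```python
import pandas as pd


def flatten(nested, prefix=()):
    """Yield (key_tuple, values) pairs for every leaf of a nested dict."""
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend one level, carrying along the keys seen so far.
            yield from flatten(value, path)
        else:
            # Reached a leaf (here, a list of values).
            yield path, value


# Shortened sample data (hypothetical), same shape as the dict above.
dictionary = {'A': {'a': {1: [2, 3], 2: [4, 5]}},
              'B': {'a': {1: [6, 7], 2: [8, 9]}}}

reform = dict(flatten(dictionary))
df = pd.DataFrame(reform).T
```

The resulting reform dict has the same tuple keys the comprehension would produce, so the rest of the answer applies unchanged.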

Bonus: add names to the index levels

In [7]: names=['level1', 'level2', 'level3']

In [8]: df.index.set_names(names, inplace=True)

In [9]: df
Out[9]:
                      0  1  2  3  4
level1 level2 level3
A      a      1       2  3  4  5  6
              2       2  3  4  5  6
       b      1       2  3  4  5  6
              2       2  3  4  5  6
B      a      1       2  3  4  5  6
              2       2  3  4  5  6
       b      1       2  3  4  5  6
              2       2  3  4  5  6
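
If you would rather not modify the index in place, one possible alternative is rename_axis, which returns a new DataFrame. This is only a sketch; the sample data is shortened and the variable name named is my own:

```python
import pandas as pd

# Shortened version of the flattened dict from above.
reform = {('A', 'a', 1): [2, 3, 4, 5, 6],
          ('A', 'a', 2): [2, 3, 4, 5, 6],
          ('B', 'b', 1): [2, 3, 4, 5, 6]}
df = pd.DataFrame(reform).T

# rename_axis returns a new DataFrame; df itself is left unchanged.
named = df.rename_axis(index=['level1', 'level2', 'level3'])
```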

Answered by DancingQuanta on Oct 20 '22 at 01:10