Dual nested dictionary to stacked DataFrame

Tags: python, pandas

Setup

A dual nested dictionary (outer key -> inner key -> tuple of values) of the following form:

subnetwork_dct = {518418568: {2: (478793912, 518418568, 518758448),
             3: (478793912, 518418568, 518758448, 1037590624),
             4: (478793912, 518418568, 518758448, 1037590624)},
 552214776: {2: (431042800, 552214776),
             3: (431042800,)},
 993280096: {2: (456917000, 993280096),
             3: (456917000, 993280096),
             4: (456917000, 993280096)}}

Expected Output

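A pandas DataFrame with one row per tuple element, following the schema below (column 0 is the outer key, column 1 the inner key, column 2 the element):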

0             1     2
518418568     2     478793912
518418568     2     518418568
518418568     2     518758448
518418568     3     478793912
518418568     3     518418568
518418568     3     518758448
518418568     3     1037590624
518418568     4     478793912
518418568     4     518418568
518418568     4     518758448
518418568     4     1037590624
552214776     2     431042800
552214776     2     552214776
552214776     3     431042800
...

Working solution:

My current approach works, but is there a cleaner solution?

import pandas as pd

# Flatten to {(outer_key, inner_key): tuple_of_values}
multi_index_dct = {(k1, k2): v2
                   for k1, v1 in subnetwork_dct.items()
                   for k2, v2 in v1.items()}

# One row per (outer_key, inner_key) pair, in sorted key order
keys = sorted(multi_index_dct)
df = pd.DataFrame([multi_index_dct[k] for k in keys],
                  index=pd.MultiIndex.from_tuples(keys))

df_stacked = pd.DataFrame(df.stack()).reset_index()
df_stacked.drop('level_2', axis=1, inplace=True)
df_stacked.columns = [0,1,2]

df_stacked
Ian asked Nov 13 '19 15:11


2 Answers

With pandas 0.25 or later, try explode:

pd.DataFrame(subnetwork_dct).stack().explode().reset_index()
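
One thing to watch with the one-liner above: pd.DataFrame(subnetwork_dct) puts the outer keys in the columns, so after stack() the inner key (2, 3, 4) ends up as the first index level and the outer key as the second. The sketch below is one possible way to get the column order and row order shown in the question. The dropna() call is a defensive assumption, added in case the installed pandas version keeps the NaN for the missing (552214776, 4) entry when stacking.

```python
import pandas as pd

subnetwork_dct = {
    518418568: {2: (478793912, 518418568, 518758448),
                3: (478793912, 518418568, 518758448, 1037590624),
                4: (478793912, 518418568, 518758448, 1037590624)},
    552214776: {2: (431042800, 552214776),
                3: (431042800,)},
    993280096: {2: (456917000, 993280096),
                3: (456917000, 993280096),
                4: (456917000, 993280096)},
}

# Outer keys become columns, inner keys become the index;
# stack() then yields a Series indexed by (inner_key, outer_key).
stacked = pd.DataFrame(subnetwork_dct).stack().dropna()

out = (
    stacked
    .swaplevel()     # index becomes (outer_key, inner_key)
    .sort_index()    # order rows by outer key, then inner key
    .explode()       # one row per tuple element
    .reset_index()
)
out.columns = [0, 1, 2]  # match the column labels used in the question
```

Note that explode() leaves the exploded column with object dtype, so a cast such as out[2].astype(int) may be wanted afterwards.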
BENY answered Sep 27 '22 17:09


Comprehension

pd.DataFrame([
    (k0, k1, v) for k0, d in subnetwork_dct.items()
                for k1, V in d.items()
                for v     in V
])

            0  1           2
0   518418568  2   478793912
1   518418568  2   518418568
2   518418568  2   518758448
3   518418568  3   478793912
4   518418568  3   518418568
5   518418568  3   518758448
6   518418568  3  1037590624
7   518418568  4   478793912
8   518418568  4   518418568
9   518418568  4   518758448
10  518418568  4  1037590624
11  552214776  2   431042800
12  552214776  2   552214776
13  552214776  3   431042800
14  993280096  2   456917000
15  993280096  2   993280096
16  993280096  3   456917000
17  993280096  3   993280096
18  993280096  4   456917000
19  993280096  4   993280096
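
The comprehension above emits rows in the dictionary's insertion order, which happens to be sorted in this example. As a minimal sketch for the case where the input is not already ordered, the variant below sorts both key levels explicitly. The column names "subnetwork", "size" and "member" are hypothetical labels chosen for illustration; they do not appear in the question.

```python
import pandas as pd

subnetwork_dct = {
    518418568: {2: (478793912, 518418568, 518758448),
                3: (478793912, 518418568, 518758448, 1037590624),
                4: (478793912, 518418568, 518758448, 1037590624)},
    552214776: {2: (431042800, 552214776),
                3: (431042800,)},
    993280096: {2: (456917000, 993280096),
                3: (456917000, 993280096),
                4: (456917000, 993280096)},
}

# One (outer_key, inner_key, element) row per tuple element,
# with both key levels visited in sorted order.
rows = [
    (outer, inner, member)
    for outer, inner_dct in sorted(subnetwork_dct.items())
    for inner, members in sorted(inner_dct.items())
    for member in members
]

# Hypothetical column names; pass columns=[0, 1, 2] to match the question.
df = pd.DataFrame(rows, columns=["subnetwork", "size", "member"])
```

Because the rows are built from plain Python integers, all three columns keep an integer dtype without a separate cast.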
piRSquared answered Sep 27 '22 17:09