
How to create expanded pandas dataframe from a nested dictionary?

Tags: python, pandas

I have a nested dictionary and tried to create a pandas DataFrame from it, but the result has only two columns, one per outer key. I would like every combination of outer and inner key to become its own column.

MWE

import numpy as np
import pandas as pd

history = {
    'validation_0': {
        'error': [0.06725, 0.067, 0.067],
        '[email protected]': [0.104125, 0.103875, 0.103625],
        'auc': [0.92729, 0.932045, 0.934238],
    },
    'validation_1': {
        'error': [0.1535, 0.151, 0.1505],
        '[email protected]': [0.239, 0.239, 0.239],
        'auc': [0.898305, 0.905611, 0.909242],
    },
}


df = pd.DataFrame(history)
print(df)

Output:

                             validation_0                    validation_1
error             [0.06725, 0.067, 0.067]         [0.1535, 0.151, 0.1505]
[email protected]  [0.104125, 0.103875, 0.103625]           [0.239, 0.239, 0.239]
auc         [0.92729, 0.932045, 0.934238]  [0.898305, 0.905611, 0.909242]

Required

A DataFrame with the following columns:
validation_0_error validation_1_error [email protected] [email protected]  validation_0_auc validation_1_auc
BhishanPoudel asked Dec 18 '22 12:12



2 Answers

You can flatten the dictionary with pd.json_normalize and then explode the list-valued cells:

print(pd.json_normalize(history).apply(pd.Series.explode).reset_index(drop=True))

Output:

  validation_0.error [email protected] validation_0.auc validation_1.error [email protected] validation_1.auc
0            0.06725               0.104125          0.92729             0.1535                  0.239         0.898305
1              0.067               0.103875         0.932045              0.151                  0.239         0.905611
2              0.067               0.103625         0.934238             0.1505                  0.239         0.909242
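
The question asks for underscore-joined names such as validation_0_error, while the output above uses a dot. A minimal sketch of one way to get there, assuming the sep parameter of pd.json_normalize is acceptable for this data; the dictionary is trimmed to two metrics for brevity, and the variable name flat is only illustrative:

```python
import pandas as pd

# Trimmed copy of the question's dictionary (two metrics only, for brevity).
history = {
    'validation_0': {'error': [0.06725, 0.067, 0.067],
                     'auc': [0.92729, 0.932045, 0.934238]},
    'validation_1': {'error': [0.1535, 0.151, 0.1505],
                     'auc': [0.898305, 0.905611, 0.909242]},
}

# sep='_' joins outer and inner keys with an underscore instead of a dot.
flat = pd.json_normalize(history, sep='_')

# Each cell holds a list; explode every column, then rebuild a 0..n-1 index.
flat = flat.apply(pd.Series.explode).reset_index(drop=True)

# Exploding leaves the columns with object dtype, so cast back to float.
flat = flat.astype(float)

print(flat.columns.tolist())
print(flat.dtypes)
```

The final astype(float) is optional, but without it the columns stay as object dtype, which can make later numeric operations slower or surprising.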
Henry Yik answered Feb 06 '23 18:02



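Starting from the df built in the question, unstack it into a Series of lists, then build a new DataFrame from those lists and transpose it: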

a = df.unstack()

pd.DataFrame(a.values.tolist(), index=a.index).T

Alternatively, if you start from history directly:

pd.concat({k:pd.DataFrame(v) for k,v in history.items()}, axis=1)

Output:

                      validation_0                     validation_1                    
         error  [email protected]      auc        error [email protected]       auc
0      0.06725  0.104125  0.927290       0.1535     0.239  0.898305
1      0.06700  0.103875  0.932045       0.1510     0.239  0.905611
2      0.06700  0.103625  0.934238       0.1505     0.239  0.909242
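
Both snippets above produce two-level (MultiIndex) columns. If single-level names like validation_0_error are wanted, as in the question, one possible follow-up is to join each pair of labels. This is only a sketch, with the dictionary trimmed to two metrics and the name wide chosen for illustration:

```python
import pandas as pd

# Trimmed copy of the question's dictionary (two metrics only, for brevity).
history = {
    'validation_0': {'error': [0.06725, 0.067, 0.067],
                     'auc': [0.92729, 0.932045, 0.934238]},
    'validation_1': {'error': [0.1535, 0.151, 0.1505],
                     'auc': [0.898305, 0.905611, 0.909242]},
}

# The dict keys passed to concat become the outer level of the column MultiIndex.
wide = pd.concat({k: pd.DataFrame(v) for k, v in history.items()}, axis=1)

# Each column label is a tuple such as ('validation_0', 'error'); join its parts.
wide.columns = ['_'.join(pair) for pair in wide.columns]

print(wide.columns.tolist())
```

Because pd.DataFrame(v) parses each inner dictionary of lists directly, the columns here are already numeric and need no dtype conversion.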
Quang Hoang answered Feb 06 '23 18:02
