
Unpack dictionary entries in pandas into a DataFrame

I have a DataFrame in which one of the columns holds a dictionary in each row:

import pandas as pd
import numpy as np

def generate_dict():
    return {'var1': np.random.rand(), 'var2': np.random.rand()}

data = {}
data[0] = {}
data[1] = {}
data[0]['A'] = generate_dict()
data[1]['A'] = generate_dict()

df = pd.DataFrame.from_dict(data, orient='index')


I would like to unpack the key/value pairs in the dictionary into a new DataFrame, where each entry has its own row. I can do that by iterating over the rows and appending to a new DataFrame:

def expand_row(row):
    df_t = pd.DataFrame.from_dict({'value': row.A})
    df_t.index.rename('row', inplace=True)
    df_t.reset_index(inplace=True)
    df_t['column'] = 'A'
    return df_t

frames = []
for _, row in df.iterrows():
    frames.append(expand_row(row))
# DataFrame.append was removed in pandas 2.0; concatenate the pieces instead
df_expanded = pd.concat(frames, ignore_index=True)


This is rather slow, and my application is performance critical. I think this is possible with df.apply. However, because my function returns a DataFrame instead of a Series, simply doing

df_expanded = df.apply(expand_row)

doesn't quite work. What would be the most performant way to do this?

Thanks in advance.

RickB asked Mar 10 '23 09:03


1 Answer

You can use a nested list comprehension and then overwrite column 0 (which holds the original row index) with the constant 'A' (the column name):

d = df.A.to_dict()

df1 = pd.DataFrame([(key,key1,val1) for key,val in d.items() for key1,val1 in val.items()])
df1[0] = 'A'
df1.columns = ['columns','row','value']
print(df1)
  columns   row     value
0       A  var1  0.013872
1       A  var2  0.192230
2       A  var1  0.176413
3       A  var2  0.253600
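
As a self-contained variant of the list-comprehension approach, here is a minimal sketch that builds the three columns directly, so column 0 does not have to be overwritten afterwards. The helper name unpack_dict_column and the fixed sample values are assumptions made for illustration (they stand in for the random data from the question), and the sketch assumes every cell in the column is a dict.

```python
import pandas as pd

# Fixed sample values (an assumption) instead of np.random.rand(),
# so the result is reproducible.
df = pd.DataFrame({'A': [{'var1': 0.1, 'var2': 0.2},
                         {'var1': 0.3, 'var2': 0.4}]})

def unpack_dict_column(frame, column):
    """Hypothetical helper: one output row per (source row, dict key) pair."""
    records = [
        (column, key, value)
        for cell in frame[column]        # each cell is expected to be a dict
        for key, value in cell.items()   # each dict entry becomes one row
    ]
    return pd.DataFrame(records, columns=['columns', 'row', 'value'])

df1 = unpack_dict_column(df, 'A')
print(df1)
```

Passing the column name as an argument would also let the same helper be reused for other dict-valued columns, if there are any.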

Another solution:

df1 = pd.DataFrame.from_records(df.A.values.tolist()).stack().reset_index()
df1['level_0'] = 'A'
df1.columns = ['columns','row','value']
print(df1)
  columns   row     value
0       A  var1  0.332594
1       A  var2  0.118967
2       A  var1  0.374482
3       A  var2  0.263910
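
To see what each step of the from_records / stack / reset_index chain produces, the sketch below splits it into named intermediates. The sample values and the names wide and long_form are made up for illustration; the calls themselves are the ones used in the one-liner.

```python
import pandas as pd

# Fixed sample values (an assumption) standing in for the random data.
df = pd.DataFrame({'A': [{'var1': 0.1, 'var2': 0.2},
                         {'var1': 0.3, 'var2': 0.4}]})

# Step 1: one column per dict key, one row per original row.
wide = pd.DataFrame.from_records(df['A'].tolist())

# Step 2: stack() moves the column labels into an inner index level,
# giving a Series indexed by (row position, dict key).
long_form = wide.stack()

# Step 3: reset_index() turns both index levels into ordinary columns.
df1 = long_form.reset_index()
df1.columns = ['columns', 'row', 'value']

# The first column holds the row position; overwrite it with the column name.
df1['columns'] = 'A'
print(df1)
```

Note that wide has a row for every original row and a column for every key seen in any dict, so this route assumes the dicts share the same keys.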
jezrael answered Mar 20 '23 04:03