One of the columns in my DataFrame contains arrays (lists). How do I flatten it?
column1  column2  column3
var1     var11    [1, 2, 3, 4]
var2     var22    [1, 2, 3, 4, -2, 12]
var3     var33    [1, 2, 3, 4, 33, 544]
After flattening it should be:
column1  column2  column3
var1     var11    1
var1     var11    2
var1     var11    3
var1     var11    4
var2     var22    1
var2     var22    2
var2     var22    3
var2     var22    4
var2     var22    -2
......
var3     var33    544
It seemed that unstack could help me, but I couldn't understand how exactly.
Here is one 'one-liner' approach, where df is your dataframe:
import pandas as pd
# axis is passed by keyword: the positional form drop(label, 1) was removed in pandas 2.0
df.join(df.column3.apply(pd.Series)).drop('column3', axis=1).set_index(['column1', 'column2']).stack().reset_index().drop('level_2', axis=1).rename(columns={0: 'column3'})
yielding:
    column1  column2  column3
0   var1     var11    1
1   var1     var11    2
2   var1     var11    3
3   var1     var11    4
4   var2     var22    1
5   var2     var22    2
6   var2     var22    3
7   var2     var22    4
8   var2     var22    -2
9   var2     var22    12
10  var3     var33    1
11  var3     var33    2
12  var3     var33    3
13  var3     var33    4
14  var3     var33    33
15  var3     var33    544
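On pandas 0.25 or later, the built-in DataFrame.explode may be a simpler route to the same result. The following is a minimal sketch, assuming the sample data from the question is rebuilt by hand as df. The name flat and the final cast to int are illustrative choices, not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain Python lists).
df = pd.DataFrame({
    "column1": ["var1", "var2", "var3"],
    "column2": ["var11", "var22", "var33"],
    "column3": [[1, 2, 3, 4], [1, 2, 3, 4, -2, 12], [1, 2, 3, 4, 33, 544]],
})

# explode() emits one row per list element and repeats the other columns;
# the original index is repeated too, so reset it to get 0..n-1.
flat = df.explode("column3").reset_index(drop=True)

# explode() leaves the column with object dtype; cast back to integers.
flat["column3"] = flat["column3"].astype(int)

print(flat)
```

Unlike the apply(pd.Series) route, this does not pad shorter lists with NaN along the way, so the values are never converted to floats before the final cast.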