Flattening an array in pandas

Tags: python, pandas

One of the columns in my DataFrame holds an array in each row. How do I flatten it so that each element gets its own row?

column1 column2 column3
var1     var11   [1, 2, 3, 4]
var2     var22   [1, 2, 3, 4, -2, 12]
var3     var33   [1, 2, 3, 4, 33, 544]
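
For anyone who wants to reproduce this, a frame like the one above could be built as follows. This is only a sketch: it assumes column3 holds plain Python lists, which the post does not actually state (they could just as well be NumPy arrays).

```python
import pandas as pd

# Assumption: each cell of column3 is a plain Python list.
df = pd.DataFrame({
    "column1": ["var1", "var2", "var3"],
    "column2": ["var11", "var22", "var33"],
    "column3": [
        [1, 2, 3, 4],
        [1, 2, 3, 4, -2, 12],
        [1, 2, 3, 4, 33, 544],
    ],
})

print(df)
```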

After flattening it should be:

column1 column2 column3
var1     var11   1
var1     var11   2
var1     var11   3
var1     var11   4
var2     var22   1
var2     var22   2
var2     var22   3
var2     var22   4
var2     var22   -2
......
var3     var33   544

It seemed that unstack could help, but I couldn't work out exactly how.

asked Mar 15 '15 07:03 by Incerteza


1 Answer

Here is a one-liner approach, where df is your DataFrame:

import pandas as pd

# Expand each list into its own columns, then stack those columns back into rows
df.join(df.column3.apply(pd.Series)).drop(columns='column3').set_index(['column1', 'column2']).stack().reset_index().drop(columns='level_2').rename(columns={0: 'column3'})

yielding:

   column1 column2  column3
0     var1   var11        1
1     var1   var11        2
2     var1   var11        3
3     var1   var11        4
4     var2   var22        1
5     var2   var22        2
6     var2   var22        3
7     var2   var22        4
8     var2   var22       -2
9     var2   var22       12
10    var3   var33        1
11    var3   var33        2
12    var3   var33        3
13    var3   var33        4
14    var3   var33       33
15    var3   var33      544
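
On pandas 0.25 or later, DataFrame.explode is a shorter route to the same result. This is not part of the answer above, just a sketch under the assumption that column3 holds plain Python lists; the sample frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Sample frame from the question (assumption: column3 holds Python lists).
df = pd.DataFrame({
    "column1": ["var1", "var2", "var3"],
    "column2": ["var11", "var22", "var33"],
    "column3": [
        [1, 2, 3, 4],
        [1, 2, 3, 4, -2, 12],
        [1, 2, 3, 4, 33, 544],
    ],
})

# explode emits one row per list element and repeats the other columns.
flat = df.explode("column3").reset_index(drop=True)

# explode leaves the column as object dtype; cast it if integers are wanted.
flat["column3"] = flat["column3"].astype(int)

print(flat)  # 16 rows, one per list element
```

Unlike the apply(pd.Series) route, explode does not pad the shorter lists with NaN along the way, so the values are never converted to floats.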
answered Oct 30 '22 16:10 by Primer