
Pandas explode multiple columns

I have a DataFrame with multiple columns. Two of them, col2 and col3, hold lists, and the two lists in any given row have the same length.

My goal is to put each list element on its own row.

I can use df.explode(), but it only accepts one column, and I want the two columns to be exploded together as pairs. If I do df.explode('col2') and then df.explode('col3'), each original row results in 9 rows instead of 3.

Original DF

col0    col1        col2        col3
1       aa          [1,2,3]     [1.1,2.2,3.3]
2       bb          [4,5,6]     [4.4,5.5,6.6]
3       cc          [7,8,9]     [7.7,8.8,9.9]
3       cc          [7,8,9]     [7.7,8.8,9.9]

End DataFrame

col0    col1        col2        col3
1       aa          1           1.1
1       aa          2           2.2
1       aa          3           3.3
2       bb          4           4.4
2       bb          5           5.5
2       bb          6           6.6
3       cc          ...         ...
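
For reference, a minimal sketch that rebuilds the sample frame above and reproduces the problem with chained explode() calls. The variable names df and chained are illustrative, and a reasonably recent pandas is assumed.

```python
import pandas as pd

# Sample frame from the question; the last row is deliberately duplicated.
df = pd.DataFrame({
    "col0": [1, 2, 3, 3],
    "col1": ["aa", "bb", "cc", "cc"],
    "col2": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [7, 8, 9]],
    "col3": [[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9], [7.7, 8.8, 9.9]],
})

# Exploding one column at a time pairs every col2 element with every
# col3 element, so each original row becomes 3 * 3 = 9 rows.
chained = df.explode("col2").explode("col3")

print(len(df))       # 4 original rows
print(len(chained))  # 36 rows, not the 12 that are wanted
```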

Update: None of the columns has unique values, so none of them can be used as an index.

Imsa asked Jul 08 '20 18:07


1 Answer

You could set col1 as index and apply pd.Series.explode across the columns:

df.set_index('col1').apply(pd.Series.explode).reset_index()

Or:

df.apply(pd.Series.explode)


   col1 col2 col3
0    aa    1  1.1
1    aa    2  2.2
2    aa    3  3.3
3    bb    4  4.4
4    bb    5  5.5
5    bb    6  6.6
6    cc    7  7.7
7    cc    8  8.8
8    cc    9  9.9
9    cc    7  7.7
10   cc    8  8.8
11   cc    9  9.9
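
As a possible alternative that is not part of the answer above: since pandas 1.3, DataFrame.explode accepts a list of columns and explodes them in lockstep, provided the lists in each row have matching lengths. A minimal sketch, assuming pandas >= 1.3 and a frame shaped like the one in the question (the name out is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["aa", "bb", "cc", "cc"],
    "col2": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [7, 8, 9]],
    "col3": [[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9], [7.7, 8.8, 9.9]],
})

# Assumes pandas >= 1.3: passing a list explodes col2 and col3 together,
# keeping the i-th element of col2 paired with the i-th element of col3.
# ignore_index=True relabels the result 0..n-1.
out = df.explode(["col2", "col3"], ignore_index=True)
print(out)
```

The exploded columns come back with object dtype, so a cast such as out.astype({"col2": "int64", "col3": "float64"}) may be wanted afterwards.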
yatu answered Oct 01 '22 00:10