
Pandas explode multiple columns

Tags: python, pandas, dataframe

I have a DataFrame with multiple columns. Two of the columns (col2 and col3) contain lists, and within each row the two lists have the same length.

My goal is to put each list element on its own row.

I can use df.explode(), but it only accepts one column, and I want the two columns to be 'exploded' together as pairs. If I do df.explode('col2') and then df.explode('col3'), each original row turns into 9 rows instead of 3.

Original DataFrame

col0    col1        col2        col3
1       aa          [1,2,3]     [1.1,2.2,3.3]
2       bb          [4,5,6]     [4.4,5.5,6.6]
3       cc          [7,8,9]     [7.7,8.8,9.9]
3       cc          [7,8,9]     [7.7,8.8,9.9]

Desired DataFrame

col0    col1        col2        col3
1       aa          1           1.1
1       aa          2           2.2
1       aa          3           3.3
2       bb          4           4.4
2       bb          5           5.5
2       bb          6           6.6
3       cc          ...         ...

Update: None of the columns have unique values, so none of them can be used as an index.
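For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the table above and shows what chaining the two explode calls does. The construction code and the variable names are my own; only the values come from the post.

```python
import pandas as pd

# Sample data copied from the "Original DataFrame" table in the question.
df = pd.DataFrame({
    "col0": [1, 2, 3, 3],
    "col1": ["aa", "bb", "cc", "cc"],
    "col2": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [7, 8, 9]],
    "col3": [[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9], [7.7, 8.8, 9.9]],
})

# Exploding col2 first leaves the full col3 list on each of the 3 new rows,
# so the second explode pairs every col2 element with every col3 element.
chained = df.explode("col2").explode("col3")

# Rows per original row: 3 * 3 = 9 instead of the desired 3.
rows_per_original = len(chained) // len(df)
print(len(df), len(chained), rows_per_original)
```

The second explode has no way of knowing which col3 element belongs to which col2 element, which is why the pairing is lost.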

852

asked Jul 08 '20 18:07

Imsa

1 Answer

You could set col1 as the index and apply pd.Series.explode across the remaining columns:

df.set_index('col1').apply(pd.Series.explode).reset_index()

Or:

df.apply(pd.Series.explode)


   col1 col2 col3
0    aa    1  1.1
1    aa    2  2.2
2    aa    3  3.3
3    bb    4  4.4
4    bb    5  5.5
5    bb    6  6.6
6    cc    7  7.7
7    cc    8  8.8
8    cc    9  9.9
9    cc    7  7.7
10   cc    8  8.8
11   cc    9  9.9
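As a possible alternative, assuming pandas 1.3 or newer, DataFrame.explode accepts a list of columns and explodes them together row by row. The sketch below is not part of the answer above; the frame is rebuilt from the question's table and the variable names are illustrative.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "col0": [1, 2, 3, 3],
    "col1": ["aa", "bb", "cc", "cc"],
    "col2": [[1, 2, 3], [4, 5, 6], [7, 8, 9], [7, 8, 9]],
    "col3": [[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9], [7.7, 8.8, 9.9]],
})

# Assumes pandas >= 1.3: passing a list explodes the columns in lockstep,
# so the i-th element of col2 stays paired with the i-th element of col3.
# ignore_index=True gives a fresh 0..n-1 index instead of repeated labels.
paired = df.explode(["col2", "col3"], ignore_index=True)

# Exploded columns come back as object dtype; cast them if numeric types matter.
paired = paired.astype({"col2": "int64", "col3": "float64"})

print(paired)
```

This needs no unique column to use as an index, since the scalar columns (col0, col1) are simply repeated for each exploded element. The lists in a given row must have matching lengths, which is the case here.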

150

answered Oct 01 '22 00:10

yatu
