Pandas Melt several groups of columns into multiple target columns by name

Tags: python, pandas, melt

I would like to melt several groups of columns of a dataframe into multiple target columns. This is similar to the questions "Python Pandas Melt Groups of Initial Columns Into Multiple Target Columns" and "pandas dataframe reshaping/stacking of multiple value variables into seperate columns". However, I need to do this explicitly by column name rather than by index location.

import pandas as pd
df = pd.DataFrame([(101, 'a', 'b', 'c', 1, 2, 3, 'aa', 'bb', 'cc'), (102, 'd', 'e', 'f', 4, 5, 6, 'dd', 'ee', 'ff')],
                  columns=['id', 'a_1', 'a_2', 'a_3', 'b_1', 'b_2', 'b_3', 'c_1', 'c_2', 'c_3'])
df

Original Dataframe:

    id   a_1  a_2  a_3  b_1  b_2  b_3  c_1  c_2  c_3
0   101   a    b    c    1    2    3    aa   bb   cc
1   102   d    e    f    4    5    6    dd   ee   ff

Target Dataframe

     id   a   b   c
0   101   a   1   aa
1   101   b   2   bb
2   101   c   3   cc
3   102   d   4   dd
4   102   e   5   ee
5   102   f   6   ff

Advice is much appreciated on an approach to this.

716

asked Aug 10 '16 01:08

Nick D

2 Answers

There is a more efficient way to handle problems of this type, which involve melting several different sets of columns. pd.wide_to_long is built for exactly this situation.

pd.wide_to_long(df, stubnames=['a', 'b', 'c'], i='id', j='dropme', sep='_')\
  .reset_index()\
  .drop('dropme', axis=1)\
  .sort_values('id')

    id  a  b   c
0  101  a  1  aa
2  101  b  2  bb
4  101  c  3  cc
1  102  d  4  dd
3  102  e  5  ee
5  102  f  6  ff
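For reference, the call above can be run end to end roughly as follows. This is a minimal sketch, assuming the frame includes the `id` column shown in the question's printed output; the helper name `melt_groups` and the temporary column name `suffix` are hypothetical choices for illustration. It sorts on both `id` and the suffix before dropping the suffix, so the row order within each `id` does not depend on the sort algorithm.

```python
import pandas as pd

# Sample frame, assuming the 'id' column from the question's printed output.
df = pd.DataFrame(
    [(101, 'a', 'b', 'c', 1, 2, 3, 'aa', 'bb', 'cc'),
     (102, 'd', 'e', 'f', 4, 5, 6, 'dd', 'ee', 'ff')],
    columns=['id', 'a_1', 'a_2', 'a_3', 'b_1', 'b_2', 'b_3', 'c_1', 'c_2', 'c_3'],
)


def melt_groups(frame, stubs, id_col='id', sep='_'):
    """Melt columns named <stub><sep><number> into one column per stub.

    Hypothetical helper: a thin wrapper around pd.wide_to_long.
    """
    long = pd.wide_to_long(frame, stubnames=stubs, i=id_col, j='suffix', sep=sep)
    return (
        long.reset_index()
            .sort_values([id_col, 'suffix'])  # explicit order: id, then suffix
            .drop(columns='suffix')
            .reset_index(drop=True)           # clean 0..n-1 index
    )


result = melt_groups(df, ['a', 'b', 'c'])
print(result)
```

Note that `pd.wide_to_long` expects numeric suffixes by default; for other suffixes its `suffix` parameter accepts a regular expression.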

120

answered Sep 28 '22 11:09

Ted Petrou

You can convert the column names to a MultiIndex based on the column-name pattern and then stack at a particular level, depending on the result you need:

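import pandas as pd
# Move 'id' into the index so that only the columns to be melted remain
df.set_index('id', inplace=True)
# Split names such as 'a_1' into ('a', '1') to build a two-level column index
df.columns = pd.MultiIndex.from_tuples(tuple(df.columns.str.split("_")))
# Stack the numeric level into rows, then drop it and restore 'id' as a column
df.stack(level = 1).reset_index(level = 1, drop = True).reset_index()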

# id    a   b    c      
#101    a   1   aa
#101    b   2   bb
#101    c   3   cc
#102    d   4   dd
#102    e   5   ee
#102    f   6   ff
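A variant of the same idea that leaves the original frame untouched can be sketched as follows. It assumes the frame includes the `id` column from the question's printed output; the level names `group` and `suffix` are hypothetical labels added for readability. `Index.str.split` with `expand=True` returns a MultiIndex directly, so the explicit `from_tuples` call is not needed here.

```python
import pandas as pd

# Sample frame, assuming the 'id' column from the question's printed output.
df = pd.DataFrame(
    [(101, 'a', 'b', 'c', 1, 2, 3, 'aa', 'bb', 'cc'),
     (102, 'd', 'e', 'f', 4, 5, 6, 'dd', 'ee', 'ff')],
    columns=['id', 'a_1', 'a_2', 'a_3', 'b_1', 'b_2', 'b_3', 'c_1', 'c_2', 'c_3'],
)

# set_index returns a new frame, so df itself is not modified.
wide = df.set_index('id')

# 'a_1' -> ('a', '1'); expand=True on an Index yields a MultiIndex.
wide.columns = wide.columns.str.split('_', expand=True).set_names(['group', 'suffix'])

result = (
    wide.stack(level='suffix')       # move the numeric suffix into the row index
        .droplevel('suffix')         # the suffix itself is not wanted in the output
        .rename_axis(columns=None)   # drop the leftover 'group' label on the columns
        .reset_index()               # restore 'id' as an ordinary column
)
print(result)
```

Because nothing is modified in place, `df` still has its original wide layout afterwards and the snippet can be re-run safely.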

answered Sep 28 '22 10:09

Psidom
