Pandas melt multiple columns to tabulate a dataset

Tags:

pandas

I have a dataset:

import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],
                   'M_start_date_1': [201709, 201709, 201709],
                   'M_end_date_1': [201905, 201905, 201905],
                   'M_start_date_2': [202004, 202004, 202004],
                   'M_end_date_2': [202005, 202005, 202005],
                   'F_start_date_1': [201803, 201803, 201803],
                   'F_end_date_1': [201904, 201904, 201904],
                   'F_start_date_2': [201912, 201912, 201912],
                   'F_end_date_2': [202007, 202007, 202007],
                   })

I need to tabulate it into long format and create a new column based on the prefix of each column name in columns [1:].

I tried to use the pandas.melt function but got stuck with multiple variables. Has anyone used this function with multiple columns, or is there another way to obtain the output?


asked Sep 25 '20 09:09

1 Answer

The main idea is to convert the id column to the index, then split all the other column names by _ to get a MultiIndex and reshape with DataFrame.stack. After that, DataFrame.sort_index gives the correct order, DataFrame.reset_index with drop=True removes the unnecessary levels, DataFrame.rename_axis sets the index names that become the new column names, and a final DataFrame.reset_index converts the index to columns:

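# move id into the index so only the date columns remain
df1 = df.set_index('id')
# 'M_start_date_1' -> ('M', 'start', 'date', '1'), a 4-level MultiIndex
df1.columns = df1.columns.str.split('_', expand=True)
df1 = (df1.stack(level=[0,2,3])                             # leave only start/end as columns
          .sort_index(level=[0,1], ascending=[True, False])  # id ascending, M before F
          .reset_index(level=[2,3], drop=True)               # drop the 'date' and number levels
          .sort_index(axis=1, ascending=False)               # column order: start, end
          .rename_axis(['id','cod'])                         # names for the two index levels
          .reset_index())                                    # index levels become columns
print(df1)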
    id cod   start     end
0    1   M  201709  201905
1    1   M  202004  202005
2    1   F  201803  201904
3    1   F  201912  202007
4    2   M  201709  201905
5    2   M  202004  202005
6    2   F  201803  201904
7    2   F  201912  202007
8    3   M  201709  201905
9    3   M  202004  202005
10   3   F  201803  201904
11   3   F  201912  202007
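
Since the question asked about pandas.melt specifically, here is a rough sketch of a melt-based alternative. It is not the method used in the answer above, only one possible way to reach the same table. The helper column names kind and period are made up for illustration, and the sketch assumes every non-id column name has exactly the four parts prefix_start|end_date_number.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'M_start_date_1': [201709, 201709, 201709],
                   'M_end_date_1': [201905, 201905, 201905],
                   'M_start_date_2': [202004, 202004, 202004],
                   'M_end_date_2': [202005, 202005, 202005],
                   'F_start_date_1': [201803, 201803, 201803],
                   'F_end_date_1': [201904, 201904, 201904],
                   'F_start_date_2': [201912, 201912, 201912],
                   'F_end_date_2': [202007, 202007, 202007]})

# One row per (id, original column name, value).
long = df.melt(id_vars='id')

# Split e.g. 'M_start_date_1' into its parts; part 2 ('date') is not needed.
parts = long['variable'].str.split('_', expand=True)
long['cod'] = parts[0]                  # 'M' or 'F'
long['kind'] = parts[1]                 # 'start' or 'end' (hypothetical helper column)
long['period'] = parts[3].astype(int)   # 1 or 2 (hypothetical helper column)

# Spread 'start' and 'end' back into two columns.
out = (long.set_index(['id', 'cod', 'period', 'kind'])['value']
           .unstack('kind')
           .reset_index())
out.columns.name = None                 # drop the leftover 'kind' axis label

# Same row order as the answer above: id ascending, M before F, period ascending.
out = (out.sort_values(['id', 'cod', 'period'], ascending=[True, False, True])
          [['id', 'cod', 'start', 'end']]
          .reset_index(drop=True))
print(out)
```

Compared with the stack approach, this keeps each step as an ordinary column operation, at the cost of a few more lines.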


answered Nov 15 '22 04:11

jezrael

Vero

People also ask

1 Answers

jezrael

Recent Activity

Donate For Us