I'm trying to figure out if there is a way to rename Pandas columns when you try to reset the index. I see in the documentation that you can use the "name" parameter to set the column name of a reset index if there is only one column, but I'm curious if there is a way to do this for multiple columns.
For example:
import pandas as pd

df1 = pd.DataFrame({
    'A': ['a1', 'a1', 'a2', 'a3'],
    'B': ['b1', 'b2', 'b3', 'b4'],
    'D1': [1, 0, 0, 0],
    'D2': [0, 1, 1, 0],
    'D3': [0, 0, 1, 1],
})
df1.set_index(['B', 'A']).stack().reset_index()
The result leaves you with:
     B   A level_2  0
0   b1  a1      D1  1
1   b1  a1      D2  0
2   b1  a1      D3  0
3   b2  a1      D1  0
4   b2  a1      D2  1
You could do:
df1.set_index(['B','A']).stack().reset_index(name='my_col')
That sets the name of the last column, but I'm wondering if there is a way to use the parameter to set the name of the 'level_2' column as well.
The first thing that came to my mind was to try:
df1.set_index(['B','A']).stack().reset_index(name=['my_col2','my_col'])
However, that did not work, so I'm looking for another way. I realize I could always just rename the columns on the next line, but I was hoping there'd be a cleaner way to do it in one line.
reset_index cannot do this on its own, but you can use the rename_axis and rename methods to name the index levels and the Series before resetting the index. Once the names are set up properly, reset_index automatically uses them as the column names in the result.
Here rename_axis names the index levels, which is roughly equivalent to df.index.names = ... but in a functional, chainable style; rename gives a name to the Series object:
df1.set_index(['B','A']).stack().rename_axis(['B','A','col2']).rename('col').reset_index()
#     B   A col2  col
# 0  b1  a1   D1    1
# 1  b1  a1   D2    0
# 2  b1  a1   D3    0
# 3  b2  a1   D1    0
# 4  b2  a1   D2    1
# ...
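A possible variation on the same idea, offered as a sketch rather than the definitive way to do it: since the index levels B and A are already named by set_index, only the new level and the values column need names. Assuming a pandas version where rename_axis accepts the columns keyword, you can name the columns axis before stacking (stack carries that name over to the new index level) and then use the name parameter from the question for the values. The variable name result is just for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['a1', 'a1', 'a2', 'a3'],
    'B': ['b1', 'b2', 'b3', 'b4'],
    'D1': [1, 0, 0, 0],
    'D2': [0, 1, 1, 0],
    'D3': [0, 0, 1, 1],
})

result = (
    df1.set_index(['B', 'A'])
       # Name the columns axis; stack() reuses it as the new level's name
       .rename_axis(columns='col2')
       .stack()
       # name= labels the values column of the stacked Series
       .reset_index(name='col')
)
print(result.head())
```

This keeps everything in one chain and avoids repeating 'B' and 'A' in rename_axis, at the cost of relying on stack to propagate the columns name.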