I'm trying to figure out if there is a way to rename Pandas columns when you try to reset the index. I see in the documentation that you can use the "name" parameter to set the column name of a reset index if there is only one column, but I'm curious if there is a way to do this for multiple columns.
For example:
import pandas as pd

df1 = pd.DataFrame({
    'A': ['a1', 'a1', 'a2', 'a3'],
    'B': ['b1', 'b2', 'b3', 'b4'],
    'D1': [1, 0, 0, 0],
    'D2': [0, 1, 1, 0],
    'D3': [0, 0, 1, 1],
})
df1.set_index(['B', 'A']).stack().reset_index()
The result leaves you with:
    B   A level_2  0
0  b1  a1      D1  1
1  b1  a1      D2  0
2  b1  a1      D3  0
3  b2  a1      D1  0
4  b2  a1      D2  1
You could do:
df1.set_index(['B','A']).stack().reset_index(name='my_col')
This sets the name of the last column, but I'm wondering if there is a way to use the parameter to set the name of the 'level_2' column as well.
The first thing that came to my mind was to try:
df1.set_index(['B','A']).stack().reset_index(name=['my_col2','my_col'])
However, that did not work, so I'm looking for another way. I realize I could always just rename the columns on the next line, but I was hoping there would be a cleaner way to do it in one line.
reset_index is not smart enough to do this, but you can use the rename_axis and rename methods to name the index levels and the Series before resetting the index. Once the names are set, reset_index automatically uses them as the column names in the result.
Here rename_axis names the index levels, which is roughly equivalent to df.index.names = ... but in a functional (chainable) style; rename gives a name to the Series object:
df1.set_index(['B','A']).stack().rename_axis(['B','A','col2']).rename('col').reset_index()
#     B   A col2  col
# 0  b1  a1   D1    1
# 1  b1  a1   D2    0
# 2  b1  a1   D3    0
# 3  b2  a1   D1    0
# 4  b2  a1   D2    1
# ..
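For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the question's df1, applies the rename_axis / rename chain, and compares it with the "rename on the next line" alternative the question mentions. The variable names stacked, chained and renamed_after are only illustrative, and the column labels 'level_2' and 0 are assumed to be what reset_index produces by default, as in the output shown in the question.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['a1', 'a1', 'a2', 'a3'],
    'B': ['b1', 'b2', 'b3', 'b4'],
    'D1': [1, 0, 0, 0],
    'D2': [0, 1, 1, 0],
    'D3': [0, 0, 1, 1],
})

# stack() turns the D1..D3 columns into an unnamed third index level
# and returns an unnamed Series.
stacked = df1.set_index(['B', 'A']).stack()

# Name the index levels and the Series first, then reset the index.
chained = (
    stacked
    .rename_axis(['B', 'A', 'col2'])
    .rename('col')
    .reset_index()
)

# Alternative: reset first, then rename the default column labels.
renamed_after = stacked.reset_index().rename(
    columns={'level_2': 'col2', 0: 'col'}
)

print(chained.columns.tolist())  # ['B', 'A', 'col2', 'col']
print(chained.values.tolist() == renamed_after.values.tolist())
```

Both routes should give the same frame; the chained version just avoids depending on the auto-generated 'level_2' and 0 labels.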