
Pandas: Is there a way to use something like 'droplevel' and, in the process, rename the other level using the dropped level's labels as a prefix/suffix?

[Screenshot of the groupby query]

Is there a way to easily drop the upper level of the column index and have a single level with labels such as points_prev_amax, points_prev_amin, gf_prev_amax, gf_prev_amin and so on?

asked Sep 09 '16 07:09 by hkhare



1 Answer

Set the new column names by joining the level labels of each column tuple, either with map:

df.columns = df.columns.map('_'.join)

Or with a list comprehension:

df.columns = ['_'.join(col) for col in df.columns]
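
Both forms work because each column label of a MultiIndex is a tuple of strings, one item per level. A minimal sketch to make that visible (the frame and the name `grouped` are only illustrative, and string function names are passed to agg here as an assumption about what suits current pandas versions):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 1],
                   'B': [4, 5, 6, 4],
                   'C': [7, 8, 9, 1]})

# Aggregating with a list of functions produces two-level columns.
grouped = df.groupby('A').agg(['max', 'min'])

# Each column label is a tuple: (original column, function name).
labels = grouped.columns.tolist()
print(labels)

# '_'.join works on each tuple as long as every label is a string.
flat = ['_'.join(col) for col in labels]
print(flat)
```

If a level contains non-string labels (integers, for example), '_'.join raises a TypeError, so the labels would need converting with str first.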

Sample:

import pandas as pd

df = pd.DataFrame({'A':[1,2,2,1],
                   'B':[4,5,6,4],
                   'C':[7,8,9,1],
                   'D':[1,3,5,9]})

print (df)
   A  B  C  D
0  1  4  7  1
1  2  5  8  3
2  2  6  9  5
3  1  4  1  9

# string names avoid the warning newer pandas versions raise for the builtins max and min
df = df.groupby('A').agg(['max', 'min'])

df.columns = df.columns.map('_'.join)
print (df)
   B_max  B_min  C_max  C_min  D_max  D_min
A                                          
1      4      4      7      1      9      1
2      6      5      9      8      5      3

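The list comprehension builds the same names. Apply it to the MultiIndex columns returned by agg, in place of the map call above, not after it (once the columns are flat strings, '_'.join would split them into single characters):

print (['_'.join(col) for col in df.columns])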
['B_max', 'B_min', 'C_max', 'C_min', 'D_max', 'D_min']

df.columns = ['_'.join(col) for col in df.columns]
print (df)
   B_max  B_min  C_max  C_min  D_max  D_min
A                                          
1      4      4      7      1      9      1
2      6      5      9      8      5      3

If you need the function name as a prefix instead, swap the items of each tuple (applied to the original MultiIndex columns):

df.columns = ['_'.join((col[1], col[0])) for col in df.columns]
print (df)
   max_B  min_B  max_C  min_C  max_D  min_D
A                                          
1      4      4      7      1      9      1
2      6      5      9      8      5      3

Another solution, using str.format:

df.columns = ['{}_{}'.format(i[1], i[0]) for i in df.columns]
print (df)
   max_B  min_B  max_C  min_C  max_D  min_D
A                                          
1      4      4      7      1      9      1
2      6      5      9      8      5      3
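
The suffix and prefix variants above can be wrapped in one small helper. This is only a sketch: `flatten_columns`, `sep` and `reverse` are hypothetical names, not part of pandas, and the str() coercion is an added assumption so that non-string level labels also join.

```python
import pandas as pd

def flatten_columns(frame, sep='_', reverse=False):
    """Return a copy of frame with MultiIndex columns joined into strings."""
    out = frame.copy()
    # Leave frames with already-flat columns untouched.
    if not isinstance(out.columns, pd.MultiIndex):
        return out
    new_names = []
    for col in out.columns:
        parts = [str(part) for part in col]  # coerce non-string labels
        if reverse:
            parts = parts[::-1]              # dropped level becomes the prefix
        new_names.append(sep.join(parts))
    out.columns = new_names
    return out

df = pd.DataFrame({'A': [1, 2, 2, 1], 'B': [4, 5, 6, 4], 'C': [7, 8, 9, 1]})
grouped = df.groupby('A').agg(['max', 'min'])

suffixed = flatten_columns(grouped)                # B_max, B_min, ...
prefixed = flatten_columns(grouped, reverse=True)  # max_B, min_B, ...
print(suffixed.columns.tolist())
print(prefixed.columns.tolist())
```

Because the helper returns a copy and skips frames whose columns are already flat, calling it twice does not mangle the names the way repeating '_'.join on flat strings would.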

If the number of columns is large (around 10^6), use to_series and str.join instead:

df.columns = df.columns.to_series().str.join('_')
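
A small sketch of this variant on a toy frame (the frame and variable names are illustrative). Whether it is actually faster than the list comprehension likely depends on the pandas version and the number of columns, so it is worth timing on your own data.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 1], 'B': [4, 5, 6, 4], 'C': [7, 8, 9, 1]})
grouped = df.groupby('A').agg(['max', 'min'])

# to_series() yields a Series whose values are the column tuples;
# .str.join then joins the items of each tuple with the separator.
joined = grouped.columns.to_series().str.join('_')
names = joined.tolist()
print(names)

# Assign plain strings back so the result has single-level columns.
grouped.columns = names
print(grouped)
```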
answered Nov 16 '22 02:11 by jezrael