Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

pandas pivot table rename columns

How to rename columns with multiple levels after pandas pivot operation?

Here's some code to generate test data:

import pandas as pd
df = pd.DataFrame({
    'c0': ['A','A','B','C'],
    'c01': ['A','A1','B','C'],
    'c02': ['b','b','d','c'],
    'v1': [1, 3,4,5],
    'v2': [1, 3,4,5]})

print(df)

gives a test dataframe:

   c0 c01 c02  v1  v2
0  A   A   b   1   1
1  A  A1   b   3   3
2  B   B   d   4   4
3  C   C   c   5   5

applying pivot

df2 = pd.pivot_table(df, index=["c0"], columns=["c01","c02"], values=["v1","v2"])
df2 = df2.reset_index()

gives

output1

how to rename the columns by joining levels? with format <c01 value>_<c02 value>_<v1>

for example first column should look like "A_b_v1"

The order of joining levels isn't really important to me.

like image 459
muon Avatar asked Feb 07 '17 20:02

muon


People also ask

How do I rename a column in a pivot table after pandas?

Sometimes we want to rename columns and indexes in the Pandas DataFrame object. We can use pandas DataFrame rename() function to rename columns and indexes.

How do I rename columns in a pivot table?

PivotTable report Click the field or item that you want to rename. On the Options tab, in the Active Field group, click the Active Field text box. Type a new name. Press ENTER.

How do I rename a column in pandas?

One way of renaming the columns in a Pandas Dataframe is by using the rename() function. This method is quite useful when we need to rename some selected columns because we need to specify information only for the columns which are to be renamed.


1 Answers

If you want to coalesce the multi-index into a single string index without caring about the index level order, you can simply map a join function over the columns, and assign the result list back:

df2.columns = list(map("_".join, df2.columns))

And for your question, you can loop through the columns where each element is a tuple, unpack the tuple and join them back in the order you want:

df2 = pd.pivot_table(df, index=["c0"], columns=["c01","c02"], values=["v1","v2"])

# Use the list comprehension to make a list of new column names and assign it back
# to the DataFrame columns attribute.
df2.columns = ["_".join((j,k,i)) for i,j,k in df2.columns]
df2.reset_index()

enter image description here

like image 99
Psidom Avatar answered Sep 22 '22 11:09

Psidom