For a dataframe like this:
import numpy as np
import pandas as pd

d = {'id': [1,1,1,2,2], 'Month':[1,2,3,1,3],'Value':[12,23,15,45,34], 'Cost':[124,214,1234,1324,234]}
df = pd.DataFrame(d)

   Cost  Month  Value  id
0   124      1     12   1
1   214      2     23   1
2  1234      3     15   1
3  1324      1     45   2
4   234      3     34   2
to which I apply pivot_table
df2 = pd.pivot_table(df, values=['Value','Cost'], index=['id'], columns=['Month'], aggfunc=np.sum, fill_value=0)
to get df2:
       Cost            Value
Month     1    2     3     1   2   3
id
1       124  214  1234    12  23  15
2      1324    0   234    45   0  34
Is there an easy way to format the resulting dataframe's column names like this?
id  Cost1  Cost2  Cost3  Value1  Value2  Value3
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
If I do:
df2.columns =[s1 + str(s2) for (s1,s2) in df2.columns.tolist()]
I get:
    Cost1  Cost2  Cost3  Value1  Value2  Value3
id
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
How do I get rid of the extra level?
Thanks!
You can get the column names from a pandas DataFrame using df.columns.values and pass the result to Python's list() function to get them as a list. Once you have the list, you can print it with print().
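As a small illustration of the point above, here is a sketch that lists the column names of the question's dataframe. The variable name `names` is only chosen for this example; `df.columns.tolist()` is shown as an equivalent alternative.

```python
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3],
     'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

# Column labels as a plain Python list
names = list(df.columns.values)

# Equivalent, and usually a little shorter
names_alt = df.columns.tolist()

print(names)
```

On a frame with two column levels, such as the pivot_table result in the question, each element of the list would be a tuple like ('Cost', 1) rather than a string.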
Pandas DataFrame: pivot() function. The pivot() function is used to reshape a given DataFrame organized by the given index / column values. This function does not support data aggregation; multiple values will result in a MultiIndex in the columns. The index parameter is the column to use to make the new frame's index. If None, the existing index is used.
You need to create a new list of your columns in the desired order, then use df = df[cols] to rearrange the columns in this new order.
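A minimal sketch of that reordering, assuming a small frame that already has the flattened column names from the question. The frame `wide` and the chosen order in `cols` are made up for the example.

```python
import pandas as pd

# Hypothetical frame with already-flattened column names
wide = pd.DataFrame({
    'Cost1': [124, 1324],
    'Value1': [12, 45],
    'id': [1, 2],
})

# List the columns in the order you want, then index with that list
cols = ['id', 'Value1', 'Cost1']
wide = wide[cols]

print(wide)
```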
Using clues from @chrisb's answer, this gave me exactly what I was after:
df2.reset_index(inplace=True)
which gives:
id  Cost1  Cost2  Cost3  Value1  Value2  Value3
1     124    214   1234      12      23      15
2    1324      0    234      45       0      34
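Putting the pieces of this thread together, the whole sequence might look like the sketch below. It uses aggfunc='sum' in place of np.sum (an assumption made to avoid the deprecation warning newer pandas versions emit for NumPy callables) and returns a new frame from reset_index() instead of using inplace=True. The name `flat` is only for this example.

```python
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3],
     'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

# Pivot: one row per id, one (name, month) column pair per value
df2 = pd.pivot_table(df, values=['Value', 'Cost'], index=['id'],
                     columns=['Month'], aggfunc='sum', fill_value=0)

# Flatten the two column levels into single strings such as 'Cost1'
df2.columns = [f"{name}{month}" for name, month in df2.columns]

# Move 'id' from the index into an ordinary column
flat = df2.reset_index()

print(flat)
```

After reset_index() the frame gets a default integer index, and 'id' sits alongside the other columns on a single header row.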
In the case of multiple index columns, this post explains it well. Just to be complete, here is how:
df2.columns = [' '.join(col).strip() for col in df2.columns.values]
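One caveat worth noting: ' '.join(col) only accepts strings, so with the integer Month values from the question it would raise a TypeError. The sketch below is a variant that converts each level to str first. It assumes reset_index() was called before flattening, which is why the 'id' column has an empty second level that strip() cleans up. The name `wide` is hypothetical.

```python
import pandas as pd

d = {'id': [1, 1, 1, 2, 2], 'Month': [1, 2, 3, 1, 3],
     'Value': [12, 23, 15, 45, 34], 'Cost': [124, 214, 1234, 1324, 234]}
df = pd.DataFrame(d)

# Pivot, then reset the index while the columns still have two levels;
# the 'id' column becomes the tuple ('id', '')
wide = pd.pivot_table(df, values=['Value', 'Cost'], index=['id'],
                      columns=['Month'], aggfunc='sum',
                      fill_value=0).reset_index()

# Convert every level to str before joining, then strip the trailing
# space left behind by the empty second level of 'id'
wide.columns = [' '.join(map(str, col)).strip() for col in wide.columns.values]

print(wide.columns.tolist())
```

This yields names with a space separator, such as 'Cost 1'; joining with '' instead would give the 'Cost1' style asked for in the question.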
'id' is the index name, which you can set to None to remove it.
In [35]: df2.index.name = None

In [36]: df2
Out[36]:
   Cost1  Cost2  Cost3  Value1  Value2  Value3
1    124    214   1234      12      23      15
2   1324      0    234      45       0      34
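For reference, a self-contained sketch of the same idea on a small made-up frame. It also shows rename_axis(None), which returns a new frame without the index name instead of modifying it in place; the names `small` and `unnamed` are only for this example.

```python
import pandas as pd

# Hypothetical frame whose index is named 'id'
small = pd.DataFrame({'Cost1': [124, 1324], 'Value1': [12, 45]},
                     index=pd.Index([1, 2], name='id'))

# Non-mutating variant: returns a copy with the index name removed
unnamed = small.rename_axis(None)

# In-place variant, as in the answer above
small.index.name = None

print(small)
```

Either way the index values 1 and 2 stay in the index; only the 'id' label above them disappears from the printed output.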