I want to pivot a pandas dataframe without aggregation, and instead of presenting the pivot index column vertically I want to present it horizontally. I tried with pd.pivot_table
but I'm not getting exactly what I wanted.
data = {'year': [2011, 2011, 2012, 2013, 2013],
'A': [10, 21, 20, 10, 39],
'B': [12, 45, 19, 10, 39]}
df = pd.DataFrame(data)
print df
A B year
0 10 12 2011
1 21 45 2011
2 20 19 2012
3 10 10 2013
4 39 39 2013
But I want to have:
year 2011 2012 2013
cols A B A B A B
0 10 12 20 19 10 10
1 21 45 NaN NaN 39 39
Inside the Pivot Column dialog, select the column with the values that will populate the new columns to be created. In this case "Time" but could be any field type, including text. In the Advanced Options part, select "Don´t Aggregate" so the values will displayed without any modification.
The answer is no: PIVOT requires aggregation.
The process of stacking pivots a level of column labels to the row index. Unstacking performs the opposite, pivoting a level of the row index into the column index.
What is the difference between the pivot_table and the groupby? The groupby method is generally enough for two-dimensional operations, but pivot_table is used for multi-dimensional grouping operations.
groupby('year')
so I can reset_index
to get index values of 0
and 1
. Then do a bunch of clean up.
df.groupby('year')['A', 'B'] \
.apply(lambda df: df.reset_index(drop=True)) \
.unstack(0).swaplevel(0, 1, 1).sort_index(1)
You can first create column for new index by cumcount
, then stack
with unstack
:
df['g'] = df.groupby('year')['year'].cumcount()
df1 = df.set_index(['g','year']).stack().unstack([1,2])
print (df1)
year 2011 2012 2013
A B A B A B
g
0 10.0 12.0 20.0 19.0 10.0 10.0
1 21.0 45.0 NaN NaN 39.0 39.0
If need set columns names use rename_axis
(new in pandas
0.18.0
):
df['g'] = df.groupby('year')['year'].cumcount()
df1 = df.set_index(['g','year'])
.stack()
.unstack([1,2])
.rename_axis(None)
.rename_axis(('year','cols'), axis=1)
print (df1)
year 2011 2012 2013
cols A B A B A B
0 10.0 12.0 20.0 19.0 10.0 10.0
1 21.0 45.0 NaN NaN 39.0 39.0
Another solution with pivot
, but you need swap first and second level of Multiindex
in columns by swaplevel
and then sort it by sort_index
:
df['g'] = df.groupby('year')['year'].cumcount()
df1 = df.pivot(index='g', columns='year')
df1 = df1.swaplevel(0,1, axis=1).sort_index(axis=1)
print (df1)
year 2011 2012 2013
A B A B A B
g
0 10.0 12.0 20.0 19.0 10.0 10.0
1 21.0 45.0 NaN NaN 39.0 39.0
print (df1)
year 2011 2012 2013
A B A B A B
g
0 10.0 12.0 20.0 19.0 10.0 10.0
1 21.0 45.0 NaN NaN 39.0 39.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With