Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

simple pivot table of pandas dataframe

I'm trying to do a seemingly very simple task. Given a dataframe:

daf = pd.DataFrame({'co':['g','r','b','r','g','r','b','g'], 'sh':['c','s','r','r','r','s','c','r']})

    co  sh
0   g   c 
1   r   s 
2   b   r 
3   r   r 
4   g   r 
5   r   s
6   b   c
7   g   r 

I'd like to count the number of records with the unique combination of 'co' and 'sh' values and output as a table with rows ['g','r','b'] and columns ['c','s','r']

    c   s   r
g   1   0   2
r   0   1   1
b   1   0   1

Can it be done using pivot_table?

Thank you,

like image 687
David Makovoz Avatar asked Jan 02 '15 06:01

David Makovoz


1 Answers

It can be done more simply using pandas.crosstab:

>>> pandas.crosstab(d.co, d.sh)
sh  c  r  s
co         
b   1  1  0
g   1  2  0
r   0  1  2

You can do it with pivot_table, but it will give you NaN instead of 0 for missing combos. You need to specify len as the aggregating function:

>>> d.pivot_table(index='co', columns='sh', aggfunc=len)
sh   c  r   s
co           
b    1  1 NaN
g    1  2 NaN
r  NaN  1   2
like image 107
BrenBarn Avatar answered Nov 02 '22 17:11

BrenBarn