Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Printing a double group by pandas dataframe as a 2D array

Tags:

python

pandas

I want to display the result of a single value aggregation with 2 group by's into a table.

Such that

df.groupby(['colA', 'colB']).size

Would yield:

   B1  B2  B3  B4 
A1 s11 s12 s13 ..
A2 s21 s22 s23 ..
A3 s31 s32 s33 ..
A4 ..  ..  ..  s44

What's a quick and easy way of doing this?

EDIT: here's an example. I have the logins of all users, and I want to display the number of logins (=rows) for each user and day

Day,User
1,John
1,John
1,Ben
1,Sarah
2,Ben
2,Sarah
2,Sarah

Should yield:

D\U John  Ben  Sarah
1   2     1    1
2   0     1    2
like image 856
bsuire Avatar asked Nov 18 '16 17:11

bsuire


1 Answers

Use:

df.groupby(['colA', 'colB']).size().unstack()

Example:

df = pd.DataFrame(np.transpose([np.random.choice(['B1','B2','B3'], size=10),
                           np.random.choice(['A1','A2','A3'], size=10)]),
            columns=['A','B'])

df

    A   B
0   B3  A1
1   B1  A2
2   B3  A3
3   B1  A3
4   B2  A2
5   B3  A3
6   B3  A1
7   B2  A1
8   B1  A3
9   B3  A3

Now:

df.groupby(['A','B']).size().unstack()

B   A1  A2  A3
A           
B1  NaN     1.0     2.0
B2  1.0     1.0     NaN
B3  2.0     NaN     3.0

Update now that your post has data:

df.groupby(['Day','User']).size().unstack().fillna(0)

User  Ben  John  Sarah
Day                   
1     1.0   2.0    1.0
2     1.0   0.0    2.0
like image 108
Julien Marrec Avatar answered Sep 21 '22 08:09

Julien Marrec