Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas groupby with bin counts

I have a DataFrame that looks like this:

+----------+---------+-------+ | username | post_id | views | +----------+---------+-------+ | john     |       1 |     3 | | john     |       2 |    23 | | john     |       3 |    44 | | john     |       4 |    82 | | jane     |       7 |     5 | | jane     |       8 |    25 | | jane     |       9 |    46 | | jane     |      10 |    56 | +----------+---------+-------+ 

and I would like to transform it to count views that belong to certain bins like this:

+------+------+-------+-------+--------+ |      | 1-10 | 11-25 | 25-50 | 51-100 | +------+------+-------+-------+--------+ | john |    1 |     1 |     1 |      1 | | jane |    1 |     1 |     1 |      1 | +------+------+-------+-------+--------+ 

I tried:

bins = [1, 10, 25, 50, 100] groups = df.groupby(pd.cut(df.views, bins)) groups.username.count() 

But it only gives aggregate counts and not counts by user. How can I get bin counts by user?

The aggregate counts (using my real data) looks like this:

impressions (2500, 5000]         2332 (5000, 10000]        1118 (10000, 50000]        570 (50000, 10000000]      14 Name: username, dtype: int64 
like image 648
metersk Avatar asked Dec 16 '15 16:12

metersk


1 Answers

You could group by both the bins and username, compute the group sizes and then use unstack():

>>> groups = df.groupby(['username', pd.cut(df.views, bins)]) >>> groups.size().unstack() views     (1, 10]  (10, 25]  (25, 50]  (50, 100] username jane            1         1         1          1 john            1         1         1          1 
like image 156
Alex Riley Avatar answered Sep 22 '22 04:09

Alex Riley