
Group-by count that includes groups where the count is zero in R

Tags: r, aggregate

I use the aggregate function to get counts by group, but it only returns a row for a group if its count is greater than 0. This is what I have:

dt <- data.frame(
n  = c(1,2,3,4,5,6),
id = c('A','A','A','B','B','B'),
group = c("x","x","y","x","x","x")) 

Applying the aggregate function:

my.count <- aggregate(n ~ id+group, dt, length)

Now see the results:

my.count[order(my.count$id),]

I get the following:

id group   n
1  A     x 2
3  A     y 1
2  B     x 3

I need the following (the last row, with a count of zero, is the one that is missing):

id group   n
1  A     x 2
3  A     y 1
2  B     x 3
4  B     y 0

Thanks in advance for your help.

asked Feb 08 '23 06:02 by seakyourpeak


2 Answers

We can create another column 'ind' and then use dcast to reshape from 'long' to 'wide', specifying the fun.aggregate as length and drop=FALSE.

library(reshape2)
dcast(transform(dt, ind='n'), id+group~ind,
           value.var='n', length, drop=FALSE)
#  id group n
#1  A     x 2
#2  A     y 1
#3  B     x 3
#4  B     y 0

Or a base R option, which returns the counts in a column named Freq rather than n:

 as.data.frame(table(dt[-1]))
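
For reference, a minimal sketch of the base R option applied to the `dt` from the question. `table()` tabulates every combination of the values of `id` and `group`, which is why the empty B/y cell is kept as 0. Renaming the count column through `responseName` and sorting by `id` are additions here, made only to match the layout the question asks for.

```r
# Data from the question
dt <- data.frame(
  n     = c(1, 2, 3, 4, 5, 6),
  id    = c("A", "A", "A", "B", "B", "B"),
  group = c("x", "x", "y", "x", "x", "x")
)

# Drop the first column (n) and cross-tabulate id by group;
# responseName names the count column "n" instead of the default "Freq"
counts <- as.data.frame(table(dt[-1]), responseName = "n")

# Sort by id so the rows line up with the desired output
counts <- counts[order(counts$id), ]
print(counts)
```

Note that `id` and `group` come back as factors in `counts`, since `as.data.frame()` on a table converts the dimension names to factors by default.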
answered Feb 16 '23 03:02 by akrun


You can merge your "my.count" object with the complete set of "id" and "group" columns:

merge(my.count, expand.grid(lapply(dt[c("id", "group")], unique)), all = TRUE)
##   id group  n
## 1  A     x  2
## 2  A     y  1
## 3  B     x  3
## 4  B     y NA

There are several questions on SO that show you how to replace NA with 0 if that is required.
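
One way the NA replacement could look, as a sketch on the `dt` and `my.count` from the question. The names `all.combos` and `full` are made up for this example, and `stringsAsFactors = FALSE` is an addition to the call above so that the key columns stay character on both sides of the merge.

```r
# Data and aggregate call from the question
dt <- data.frame(
  n     = c(1, 2, 3, 4, 5, 6),
  id    = c("A", "A", "A", "B", "B", "B"),
  group = c("x", "x", "y", "x", "x", "x")
)
my.count <- aggregate(n ~ id + group, dt, length)

# Every id/group combination, including those absent from dt
all.combos <- expand.grid(lapply(dt[c("id", "group")], unique),
                          stringsAsFactors = FALSE)

# Outer merge: combinations missing from my.count get NA in n
full <- merge(my.count, all.combos, all = TRUE)

# Replace the NA counts with 0 (0L keeps the column integer)
full$n[is.na(full$n)] <- 0L
print(full)
```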

answered Feb 16 '23 03:02 by A5C1D2H2I1M1N2O1R2T1