Count rows in data table with certain values by group

Tags: r, data.table

I have a data table that looks somewhat like this:

Property    Type
1           apartment
1           office
2           office
2           office
3           apartment
3           apartment
3           office

I now want to count offices and apartments by property:

Property    Type       number_of_offices    number_of_apartments
       1    apartment                  1                       1
       1    office                     1                       1
       2    office                     2                       0
       2    office                     2                       0
       3    apartment                  1                       2
       3    apartment                  1                       2
       3    office                     1                       2

I tried

myDT[, .(Type = Type, number_of_offices = nrow(myDT[myDT$Type == "office", ]), number_of_apartments = nrow(myDT[myDT$Type == "apartment", ])), by = "Property"]

However, this only gives me the total counts for the whole data table. Does anyone have a solution?

laser.p asked Mar 12 '20 13:03


2 Answers

You can solve it as follows:

# Add both per-Property counts by reference; sum() of a logical vector counts the TRUEs
cols <- c("number_of_offices", "number_of_apartments")
myDT[, (cols) := .(sum(Type == "office"), sum(Type == "apartment")), by = Property]

#    Property      Type number_of_offices number_of_apartments
# 1:        1 apartment                 1                    1
# 2:        1    office                 1                    1
# 3:        2    office                 2                    0
# 4:        2    office                 2                    0
# 5:        3 apartment                 1                    2
# 6:        3 apartment                 1                    2
# 7:        3    office                 1                    2
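
For reference, here is a minimal self-contained sketch of the same idea, assuming the sample data from the question is stored in a data.table called myDT (the column values are copied from the question; everything else is illustrative). It also shows why the original attempt returned totals: inside `j`, subsetting the whole table again ignores the grouping, whereas the bare column name `Type` refers only to the rows of the current group.

```r
library(data.table)

# Sample data from the question (the name myDT is taken from the question's code).
myDT <- data.table(
  Property = c(1, 1, 2, 2, 3, 3, 3),
  Type = c("apartment", "office", "office", "office",
           "apartment", "apartment", "office")
)

# Inside j, `Type` is already restricted to the current group, so sum() of a
# logical comparison counts the matching rows of that group only.
myDT[, `:=`(
  number_of_offices    = sum(Type == "office"),
  number_of_apartments = sum(Type == "apartment")
), by = Property]

print(myDT)
```

Because `:=` modifies myDT by reference, no reassignment is needed.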
B. Christian Kamgang answered Oct 19 '22 07:10


Is there a particular reason why you want to merge the grouped counts with myDT?

You can try this, which will give you the counts grouped by Property and Type. Then merge with the original myDT:

# Count rows per Property/Type combination
grouped = myDT[, .N, by=c('Property','Type')]
# Left-join the apartment counts, then the office counts, onto the original rows
myDT = merge(myDT, grouped[Type == 'apartment', list(Property,N)], by='Property', all.x=TRUE)
myDT = merge(myDT, grouped[Type == 'office', list(Property,N)], by='Property', all.x=TRUE)
setnames(myDT, c('N.x','N.y'), c('Number of apartments','Number of offices'))
# Properties without a given type have no match in the join, so replace NA with 0
myDT[is.na(myDT)] <- 0

> myDT
   Property      Type Number of apartments Number of offices
1:        1 apartment                    1                 1
2:        1    office                    1                 1
3:        2    office                    0                 2
4:        2    office                    0                 2
5:        3 apartment                    2                 1
6:        3 apartment                    2                 1
7:        3    office                    2                 1
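
If the list of types grows, the two merges can probably be collapsed into one. The following is only a sketch, assuming the same sample data as in the question: it uses `dcast()` from data.table to build one count column per type and then joins that wide table back once. The names `wide` and `result` are made up for illustration.

```r
library(data.table)

# Sample data from the question.
myDT <- data.table(
  Property = c(1, 1, 2, 2, 3, 3, 3),
  Type = c("apartment", "office", "office", "office",
           "apartment", "apartment", "office")
)

# One row per Property and one count column per Type. With
# fun.aggregate = length, Property/Type combinations that never occur get 0.
wide <- dcast(myDT, Property ~ Type, fun.aggregate = length, value.var = "Type")
setnames(wide, c("apartment", "office"),
         c("number_of_apartments", "number_of_offices"))

# Attach the per-property counts to every original row.
result <- merge(myDT, wide, by = "Property", all.x = TRUE)
print(result)
```

With this shape no NA replacement step is needed, since the zero counts come out of `dcast()` directly.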
Arturo Sbr answered Oct 19 '22 09:10