
filter rows in data.table with `by`

Tags: r, data.table

I would like to keep only the groups that satisfy a given criterion, but data.table gives unexpected results.

Input data

library(data.table)
library(dplyr)

dt <- data.table(
    logic = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
    group = c("A" , "A",  "A"  , "B" , "B" , "B")
)

I would like to keep only the groups where all values of the logic field are TRUE.

Expected behavior (by dplyr)

As you can see, dplyr works as expected and returns only the rows with group = B.

dt %>% 
  group_by(group) %>% 
  filter(all(logic))
# Source: local data table [3 x 2]
# Groups: group

#   logic group
# 1  TRUE     B
# 2  TRUE     B
# 3  TRUE     B

Unexpected behavior by data.table

data.table doesn't really filter the rows; it returns either every group or nothing.

dt[all(logic), group, by = group]
# Empty data.table (0 rows) of 2 cols: group,group

dt[all(.SD$logic), group,by = group]
#    group group
# 1:     A     A
# 2:     B     B
Cron Merdek asked Dec 21 '15 10:12


2 Answers

You could use `[` on `.SD` within each group, as in

dt[, .SD[all(logic)], by = group]
#   group logic
#1:     B  TRUE
#2:     B  TRUE
#3:     B  TRUE
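
For context, here is a minimal sketch of why the first attempt in the question comes back empty while this one works. It assumes the same `dt` as in the question; the names `whole_table`, `per_group` and `result` are only illustrative. The expression in `i` is evaluated once against the whole table before any grouping happens, whereas anything in `j` is evaluated once per group.

```r
library(data.table)

dt <- data.table(
    logic = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
    group = c("A", "A", "A", "B", "B", "B")
)

# Evaluated on the full table, all(logic) is a single FALSE,
# which is why dt[all(logic), ...] selects no rows at all.
whole_table <- dt[, all(logic)]

# Moving the test into j lets by = group evaluate it once per group.
per_group <- dt[, .(all_true = all(logic)), by = group]

# .SD[TRUE] keeps every row of the group, .SD[FALSE] keeps none,
# so only groups passing the test survive.
result <- dt[, .SD[all(logic)], by = group]
```

Inspecting `per_group` before filtering is a quick way to check which groups pass the test.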
talat answered Sep 26 '22 00:09


We can use `if` in `j`, so that `.SD` is returned only for groups where all values of logic are TRUE:

dt[, if(all(logic)) .SD, by = group]
#    group logic
#1:     B  TRUE
#2:     B  TRUE
#3:     B  TRUE
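
A related idiom, shown here only as a sketch and not benchmarked, collects the row numbers of the qualifying groups with `.I` and then subsets the original table once. It assumes the same `dt` as in the question; `idx`, `result_if` and `result_I` are illustrative names. Unlike the `if` form, it keeps the original column order (logic, then group).

```r
library(data.table)

dt <- data.table(
    logic = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
    group = c("A", "A", "A", "B", "B", "B")
)

# The if form: a group for which j returns NULL is dropped from the result.
result_if <- dt[, if (all(logic)) .SD, by = group]

# The .I form: .I holds the row numbers of the current group in dt.
# .I[TRUE] keeps them all, .I[FALSE] keeps none; the unnamed j result
# is stored in the default column V1.
idx <- dt[, .I[all(logic)], by = group]$V1

# Subset the original table once with the collected row numbers.
result_I <- dt[idx]
```

Both forms keep the same rows; which one reads better is largely a matter of taste.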
akrun answered Sep 24 '22 00:09