Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

data.table group by with a filter condition [duplicate]

Tags:

r

data.table

Given a data.table like the one below, I would like to create a new column which is the value summed by region, and where period == 0.

   region period value
1:     US      0    10
2:     US      1    11
3:  Japan      0    12
4:  Japan      1    13

The typical data.frame approach to this would be to create a separate frame, filter by period, sum by region, and then merge back with the original frame with region as the key. I'm wondering if there is a one-line approach in data.table using the data.table groupby synytax. The closest I've come is the two-line approach below.

x1 <- data.table(
  region=c("US","US","Japan","Japan"),
  period=c(0,1,0,1),
  value=10:13)

x1[period==0,value0:=sum(value),by=region]
x1[,value0:=min(value0,na.rm=T),by=region]


   region period value value0
1:     US      0    10     10
2:     US      1    11     10
3:  Japan      0    12     12
4:  Japan      1    13     12
like image 308
Abiel Avatar asked Feb 14 '23 15:02

Abiel


1 Answers

I'm not sure if this is the "data.table" way, but you can try something like this:

x1[, value0 := sum(value[period == 0]), by = region]
x1
#    region period value value0
# 1:     US      0    10     10
# 2:     US      1    11     10
# 3:  Japan      0    12     12
# 4:  Japan      1    13     12
like image 76
A5C1D2H2I1M1N2O1R2T1 Avatar answered Mar 03 '23 15:03

A5C1D2H2I1M1N2O1R2T1