
average by group, removing current row

Tags: r, data.table

For each row, I want to compute the group mean of a variable while excluding that row's own value (the focal respondent):

library(data.table)

set.seed(1)
dat <- data.table(id = 1:30, y = runif(30), grp = rep(1:3, each = 10))

The first record (respondent) should get the mean of group 1 without row 1, the second the mean of group 1 without row 2, and so on:

mean(dat[grp == 1, y][-1])
mean(dat[grp == 1, y][-2])
mean(dat[grp == 1, y][-3])

For the second group the same:

mean(dat[grp == 2, y][-1])
mean(dat[grp == 2, y][-2])
mean(dat[grp == 2, y][-3])

I tried this, but it didn't work:

dat[, avg := mean(dat[, y][-.I]), by = grp]

Any ideas?

sdaza asked Feb 10 '23 23:02


2 Answers

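You can loop over the positions within each group and drop one element at a time. Here it is on a smaller example that also contains an NA:

set.seed(1)
dat <- data.table(id = 1:9, y = c(NA, runif(8)), grp = rep(1:3, each = 3))

# for each position i within a group, average the group's y without element i
dat[, avg2 := sapply(seq_along(y), function(i) mean(y[-i], na.rm = TRUE)), by = grp]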

dat
#    id         y grp      avg2
# 1:  1        NA   1 0.3188163
# 2:  2 0.2655087   1 0.3721239
# 3:  3 0.3721239   1 0.2655087
# 4:  4 0.5728534   2 0.5549449
# 5:  5 0.9082078   2 0.3872676
# 6:  6 0.2016819   2 0.7405306
# 7:  7 0.8983897   3 0.8027365
# 8:  8 0.9446753   3 0.7795937
# 9:  9 0.6607978   3 0.9215325
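
As for why the attempt in the question returns the wrong values: `.I` holds row numbers in the full table, not positions within the group, and `dat[, y]` is the full column rather than the group's slice. Dropping `-.I` from the full column therefore removes the whole group and averages everything else. The sketch below illustrates this on the question's data; the column names `avg_wrong` and `avg` are only illustrative.

```r
library(data.table)

set.seed(1)
dat <- data.table(id = 1:30, y = runif(30), grp = rep(1:3, each = 10))

# .I is 1:10, 11:20, 21:30 for the three groups, and dat$y (the same vector
# as dat[, y]) is the full column, so this averages the *other* groups.
dat[, avg_wrong := mean(dat$y[-.I]), by = grp]

# Indexing the group's own y by position within the group gives the intended result.
dat[, avg := sapply(seq_along(y), function(i) mean(y[-i])), by = grp]

dat[grp == 1, unique(avg_wrong)]  # a single value: the mean of y over groups 2 and 3
dat[id == 1, avg]                 # matches mean(dat[grp == 1, y][-1])
```

Inside `j` with `by`, the bare column name `y` already refers to the current group's values, which is why `seq_along(y)` gives the within-group positions that `.I` does not.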

Marat Talipov answered Feb 19 '23 03:02


Seems like you're most of the way there and just need to account for NAs:

# (group sum minus own value) / (number of other non-NA values in the group)
dat[, avg := (sum(y, na.rm = TRUE) - ifelse(is.na(y), 0, y)) / (sum(!is.na(y)) + is.na(y) - 1)
    , by = grp]

No double loops or extra memory required.
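
A quick way to convince yourself that the closed form and the per-row loop agree is to compute both on the same data and compare them. This is a minimal sketch on the 9-row example with one NA; the column names `avg_loop` and `avg_fast` are only illustrative.

```r
library(data.table)

set.seed(1)
dat <- data.table(id = 1:9, y = c(NA, runif(8)), grp = rep(1:3, each = 3))

# Loop version: recompute the mean once per row, dropping that row.
dat[, avg_loop := sapply(seq_along(y), function(i) mean(y[-i], na.rm = TRUE)), by = grp]

# Closed form: group total minus the row's own value (0 if it is NA),
# divided by the number of other non-NA values in the group.
dat[, avg_fast := (sum(y, na.rm = TRUE) - ifelse(is.na(y), 0, y)) /
                  (sum(!is.na(y)) + is.na(y) - 1), by = grp]

# Compare with a tolerance, since the two routes round differently.
agree <- isTRUE(all.equal(dat$avg_loop, dat$avg_fast))
print(agree)
```

The denominator is the part that handles NAs: a row whose own `y` is NA removes nothing from the sum, so the `+ is.na(y)` term restores the count that the `- 1` would otherwise take away.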

eddi answered Feb 19 '23 05:02