
Can I use variables newly created in `j` in the same `j` argument?

Tags: r, data.table

In the j argument in data.table, is there syntax allowing me to reference previously created variables while in the same j statement? I'm thinking of something like Lisp's let* construct.

library(data.table)
set.seed(22)
DT <- data.table(a = rep(1:5, each = 10),
                 b = sample(c(0, 1), 50, replace = TRUE))

DT[ ,
   list(attempts = .N,
        successes = sum(b),
        rate = successes / attempts),
   by = a]

This results in

# Error in `[.data.table`(DT, , list(attempts = .N, successes = sum(b),  : 
#  object 'successes' not found

I understand why, but is there a different way to accomplish this in the same j?

Erik Iverson asked May 16 '13 16:05




1 Answer

This will do the trick:

DT[ , {
    list(attempts = attempts <- .N,
         successes = successes <- sum(b),
         rate = successes/attempts)
    },  by = a]
#    a attempts successes rate
# 1: 1       10         5  0.5
# 2: 2       10         6  0.6
# 3: 3       10         3  0.3
# 4: 4       10         5  0.5
# 5: 5       10         5  0.5
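The inline assignments work because `j` is evaluated as an ordinary R expression, so anything assigned earlier in it is visible later. An equivalent form, sketched here as one possible variation rather than the canonical approach, does the assignments as separate statements inside the braces and returns the list last, which some readers may find easier to follow:

```r
library(data.table)
set.seed(22)
DT <- data.table(a = rep(1:5, each = 10),
                 b = sample(c(0, 1), 50, replace = TRUE))

# Compute the intermediate values first, then return them as the last
# expression in the braces; the list's names become the column names.
res <- DT[, {
  attempts  <- .N
  successes <- sum(b)
  list(attempts  = attempts,
       successes = successes,
       rate      = successes / attempts)
}, by = a]

print(res)
```

The name `res` is only for illustration. The exact counts depend on the R version's sampling algorithm, so the printed values may differ from the table above.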

FWIW, a closely related data.table feature request would make possible, more or less, the syntax used in your question. Quoting from that feature request:

Summary:

Iterative RHS of := (and `:=`(...)), and multiple := inside j = {...} syntax

Detailed description

e.g. DT[, `:=`( m1 = mean(a), m2 = sd(a), s = m1/m2 ), by = group]

where s can use previous lhs names (using the word 'iterative' tries to convey that).
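Until something like that exists, the same effect for `:=` can be approximated with the braces idea from above. The following is a minimal sketch, assuming a made-up table `DT2` with columns `group` and `a` (both hypothetical, chosen to mirror the quoted example): the intermediate values are computed inside `{...}` on the right-hand side and assigned to several columns at once.

```r
library(data.table)

# Hypothetical example data; not from the original question.
DT2 <- data.table(group = rep(c("x", "y"), each = 4),
                  a     = c(1, 2, 3, 4, 10, 20, 30, 40))

# The character vector on the left names the new columns; the braces on
# the right compute m1 and m2 first, so that s can reuse them.
DT2[, c("m1", "m2", "s") := {
  m1 <- mean(a)
  m2 <- sd(a)
  list(m1, m2, m1 / m2)
}, by = group]

print(DT2)
```

This adds the columns by reference, one value per group repeated across that group's rows.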

Josh O'Brien answered Sep 27 '22 00:09