Use by = each row for data table

Question

I have a data table and I am trying to create a new variable that is a function of all the other columns. A simplified example would be if I simply wanted to sum or take an average across all the columns, row by row. For example:

library(data.table)
dt <- data.table(a = 1:9, b = seq(10, 90, 10), c = 11:19, d = seq(100, 900, 100))

I want to create a vector/column that is simply the average of all the columns in each row. The syntax that I think of would look something like this:

dt[, average := mean(.SD)]

However, this sums the whole thing. I know I can also do:

dt[, average := lapply(.SD, mean)]

But this gives a single row result. I'm essentially looking for the equivalent of:

dt[, average := lapply(.SD, mean), by = all]

such that it simply calculates this for all the rows, without having to create an "id" column and doing all of my calculating by that column. Is this possible?

lmo · Accepted Answer

The following data.table code worked for me.

dt[, average := rowMeans(.SD)]
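One thing to watch with the line above is that .SD holds every column present when the call runs, so a second run would also average the new average column. A minimal sketch of pinning the inputs with .SDcols follows. The name value_cols is made up for illustration, and c is written as 11:19 on the assumption that those were the intended values (seq(11:19) evaluates to 1:9).

```r
library(data.table)

# Example table from the question; c = 11:19 is an assumption about
# the intended values.
dt <- data.table(a = 1:9, b = seq(10, 90, 10), c = 11:19, d = seq(100, 900, 100))

# value_cols is a hypothetical helper name: the columns to average.
# Naming them explicitly keeps later additions (such as average itself)
# out of the calculation.
value_cols <- c("a", "b", "c", "d")

# rowMeans() is vectorised over rows, so no by = is needed.
dt[, average := rowMeans(.SD), .SDcols = value_cols]

# Running the same line again gives the same result, because .SD is
# restricted to value_cols.
dt[, average := rowMeans(.SD), .SDcols = value_cols]
```

For a plain mean the extra column would not change the answer, but for other row-wise functions (a sum, for instance) it would.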

As pointed out by @jangorecki, it is possible to construct your own function to run by row, as long as you remember that within each single-row group .SD is a one-row data.table, which is a list object:

# my function, must unlist the argument
myMean <- function(i, ...) mean(unlist(i), ...)

Using by = seq_len(nrow(dt)):

dt[, averageNew := myMean(.SD), by = seq_len(nrow(dt))]

Using by = row.names(dt):

dt[, averageOther := myMean(.SD), by = row.names(dt)]
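To check that the by-row approach agrees with rowMeans, the two can be compared on the same columns. This is only a sketch: value_cols is a hypothetical name, c = 11:19 is an assumption about the intended data, and .SDcols is added so that neither result column feeds into the other.

```r
library(data.table)

dt <- data.table(a = 1:9, b = seq(10, 90, 10), c = 11:19, d = seq(100, 900, 100))
value_cols <- c("a", "b", "c", "d")  # hypothetical helper name

# Inside each one-row group, .SD is a one-row data.table (a list of
# columns), so unlist() flattens it to a plain vector before mean().
myMean <- function(i, ...) mean(unlist(i), ...)

# Row-by-row: one group per row number.
dt[, averageNew := myMean(.SD), by = seq_len(nrow(dt)), .SDcols = value_cols]

# Vectorised reference result.
dt[, average := rowMeans(.SD), .SDcols = value_cols]

# Compare the two columns, ignoring attributes such as names.
all.equal(dt$average, dt$averageNew, check.attributes = FALSE)
```

Grouping by row calls the function once per row, so it is likely to be noticeably slower than rowMeans on large tables. It is mainly useful when no vectorised row-wise function exists for the calculation.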

Tags: r, data.table

Brandon
