I find my question difficult to formulate (and therefore difficult to search for in the archives), but the code below should make it clear.
Why is the last column of the output of the second command not named "nb_obs"? c(lapply(.SD, mean), nb_obs = .N)
should produce a named list of 5 elements (the four means plus nb_obs), each of which should become a column in the final result.
Curiously, c(lapply(.SD[,1:4], mean), nb_obs = .N)
(third command) gives the intended result. If I remove the by
argument (last command), I also get the expected column name (with a warning for the factor column "Species").
Code run with data.table_1.10.4, R version 3.4.1 for Ubuntu 16.04.3 LTS (I can provide more if needed)
library(data.table)
iris <- data.table(iris) # 1st command
iris[, c(lapply(.SD, mean), nb_obs = .N), by = Species] # 2nd command
#       Species Sepal.Length Sepal.Width Petal.Length Petal.Width  N
# 1:     setosa        5.006       3.428        1.462       0.246 50
# 2: versicolor        5.936       2.770        4.260       1.326 50
# 3:  virginica        6.588       2.974        5.552       2.026 50
iris[, c(lapply(.SD[,1:4], mean), nb_obs = .N), by = Species] # 3rd command
#       Species Sepal.Length Sepal.Width Petal.Length Petal.Width nb_obs
# 1:     setosa        5.006       3.428        1.462       0.246     50
# 2: versicolor        5.936       2.770        4.260       1.326     50
# 3:  virginica        6.588       2.974        5.552       2.026     50
iris[, c(lapply(.SD, mean), nb_obs = .N)] # 4th command
#    Sepal.Length Sepal.Width Petal.Length Petal.Width Species nb_obs
# 1:     5.843333    3.057333        3.758    1.199333      NA    150
This is an issue in data.table itself. An entry was raised in the
project backlog based on this question.
This answer is based on @Frank's comment above.
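Until the issue is resolved, one possible workaround is to rename the column after the aggregation. The following is only a minimal sketch: the names dt and res are chosen here for illustration, and the check on "N" assumes the count column is returned under that name on affected versions, as in the output of the 2nd command above.

```r
library(data.table)

dt <- as.data.table(iris)

# Same grouped aggregation as the 2nd command in the question.
res <- dt[, c(lapply(.SD, mean), nb_obs = .N), by = Species]

# On affected versions the count column comes back as "N".
# Rename it only in that case, so the code also runs where the name is kept.
if ("N" %in% names(res)) setnames(res, "N", "nb_obs")

names(res)
```

The conditional rename keeps the snippet harmless on versions where nb_obs is already preserved. The .SD[,1:4] form from the 3rd command is another option, but it subsets .SD for every group.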