If I group-by using the by
keyword in data.table it always returns the by
column as the first column. Is there a flag/option to tell it not to do this? Or a clever way of getting rid of it?
In particular I want to group and then rbindlist
to my original table, so infact the issue could also be said as - "how to stop it reordering the columns"
For example:
DT = data.table(I = as.numeric(1:6), N = rnorm(6), L = rep(c("a", "b", "c"), 2))
DT[, list(I = mean(I), N = mean(N)), by= L]
DT
gives:
> DT[, list(I = mean(I), N = mean(N)), by= L]
L I N
1: a 2.5 0.4291802
2: b 3.5 0.6669517
3: c 4.5 -0.6471886
> DT
I N L
1: 1 1.8460998 a
2: 2 0.7093438 b
3: 3 -1.7991193 c
4: 4 -0.9877394 a
5: 5 0.6245596 b
6: 6 0.5047421 c
As far as the rbindlist
request goes, it would be nice to be able to do this:
DT = rbindlist(list(DT, DT[, list(I = mean(I), N = mean(N)), by= L]))
or perhaps
DT = rbindlist(list(DT, DT[, list(I = mean(I), N = mean(N), L), by= L]))
or something similar (neither of which work)
I don't particularly like this auto-column-reordering as well. The "trick" I usually do is to use setcolorder
after obtaining the output as follows:
DT <- data.table(I = 1:6, N = rnorm(6), L = rep(c("a", "b", "c"), 2))
DT.out <- DT[, list(I = mean(I), N = mean(N)), by= L]
Here, setcolorder
as:
setcolorder(DT.out, names(DT))
# I N L
# 1: 2.5 0.772719306 a
# 2: 3.5 -0.008921738 b
# 3: 4.5 -0.770807996 c
Of course this works if the names of DT
are the same as DT.out
. Else, you'll have to specify the column order explicitly as:
setcolorder(DT.out, c("I", "N", "L"))
Edit: Since you'd like to bind them by rows immediately, yes, it would be nice to not have this as an intermediate result. And since rbindlist
seems to bind by position, you could use rbind
which binds by column names and data.table
tells this as a warning and suggests to use use.names=F
if you want to instead bind by position. You can safely ignore this warning.
dt1 <- data.table(x=1:5, y=6:10)
dt2 <- data.table(y=1:5, x=6:10)
rbind(dt1, dt2) # or do.call(rbind, list(dt1, dt2))
# x y
# 1: 1 6
# 2: 2 7
# 3: 3 8
# 4: 4 9
# 5: 5 10
# 6: 6 1
# 7: 7 2
# 8: 8 3
# 9: 9 4
# 10: 10 5
# Warning message:
# In .rbind.data.table(...) :
# Argument 2 has names in a different order. Columns will be bound by name for
# consistency with base. Alternatively, you can drop names (by using an unnamed
# list) and the columns will then be joined by position. Or, set use.names=FALSE.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With