Can I group by all columns except one using data.table
? I have a lot of columns, so I'd rather avoid writing out all the colnames
.
The reason being I'd like to collapse duplicates in a table, where I know one column has no relevance.
library(data.table)
DT <- structure(list(N = c(1, 2, 2), val = c(50, 60, 60), collapse = c("A",
"B", "C")), .Names = c("N", "val", "collapse"), row.names = c(NA,
-3L), class = c("data.table", "data.frame"))
> DT
N val collapse
1: 1 50 A
2: 2 60 B
3: 2 60 C
That is, given DT
, is there something like like DT[, print(.SD), by = !collapse]
which gives:
> DT[, print(.SD), .(N, val)]
collapse
1: A
collapse
1: B
2: C
without actually having to specify .(N, val)
? I realise I can do this by copy and pasting the column names, but I thought there might be some elegant way to do this too.
To group by all columns except one, you can use:
by = setdiff(names(DT), "collapse")
Explanation: setdiff
takes the general form of setdiff(x, y)
which returns all values of x
that are not in y
. In this case it means that all columnnames are returned except the collapse
-column.
Two alternatives:
# with '%in%'
names(dt1)[!names(dt1) %in% 'colB']
# with 'is.element'
names(dt1)[!is.element(names(dt1), 'colB')]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With