How do I get a data.table in R to return just the distinct values of a grouping column when I am not applying any aggregate function? Say I have:
test<-data.table(x=c(rep("a",2),rep("b",3)),y=1:5)
And I just want to return:
a
b
When I use:
test[,,by=x]
I get back:
x y
1: a 1
2: a 2
3: b 3
4: b 4
5: b 5
And when I do:
test[,x,by=x]
I get back:
x x
1: a a
2: b b
I know I can use:
test[,.(unique(x))]
But that doesn't seem like the right way to do it, and besides, what if I wanted to return two grouped columns?
I'd accomplish this by applying unique() to a data.table containing just the subset of grouping columns in which I was interested. Handing a data.table to unique(), as below, will trigger a call to unique.data.table(), which works just as well for two or more columns as for one:
unique(test[, .(x)]) ## .() is data.table shorthand for list()
# x
# 1: a
# 2: b
## Add another column to see that unique.data.table() works fine in that case as well
test[, z:=c(1,1,1,2,2)]
unique(test[, .(x,z)])
# x z
# 1: a 1
# 2: b 1
# 3: b 2
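If the grouping columns are held in a character vector rather than typed out, the same idea can be sketched as follows. This is only a rough illustration: the names grp_cols, combos, combos2 and counts are made up for the example, and the `..` prefix assumes a reasonably recent data.table version. The last line shows that .N gives the distinct groups plus a row count, in case a count turns out to be useful after all.

```r
library(data.table)

# Rebuild the example table from above, including the extra z column
test <- data.table(x = c(rep("a", 2), rep("b", 3)), y = 1:5)
test[, z := c(1, 1, 1, 2, 2)]

# grp_cols is a hypothetical name for the grouping columns of interest
grp_cols <- c("x", "z")

# Select the columns by name (the .. prefix looks grp_cols up outside the table),
# then drop duplicate rows
combos <- unique(test[, ..grp_cols])

# Alternative: keep the first row of each group, then keep only the grouping columns
combos2 <- unique(test, by = grp_cols)[, ..grp_cols]

# If a per-group row count is acceptable, .N returns the groups plus a count column N
counts <- test[, .N, by = c("x", "z")]

print(combos)
print(counts)
```

Both combos and combos2 should hold the same three rows as the unique(test[, .(x,z)]) call above; which form reads better is mostly a matter of taste.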