
Get rows of unique values by group

Tags: r, data.table

I have a data.table and want to pick the rows where the value of a variable x appears for the first time within each group of another variable y, so that every (y, x) combination is kept only once.

It's possible to get the unique values of x, grouped by y, in a separate dataset, like this:

dt[,unique(x),by=y]

But I want to pick the rows in the original dataset where this is the case. I don't want a new data.table because I also need the other variables.

So, what do I have to add to my code to get the rows in dt for which the above is true?

library(data.table)
dt <- data.table(y = rep(letters[1:2], each = 3), x = c(1, 2, 2, 3, 2, 1), z = 1:6)

   y x z
1: a 1 1
2: a 2 2
3: a 2 3
4: b 3 4
5: b 2 5
6: b 1 6

What I want:

   y x z
1: a 1 1
2: a 2 2
3: b 3 4
4: b 2 5
5: b 1 6

beginneR asked Aug 28 '13 07:08

1 Answer

The idiomatic data.table way is:

require(data.table)
unique(dt, by = c("y", "x"))
#    y x z
# 1: a 1 1
# 2: a 2 2
# 3: b 3 4
# 4: b 2 5
# 5: b 1 6
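
For comparison, here is a minimal sketch of two other ways to keep the first row of each (y, x) combination while retaining all columns. It assumes the same example dt as in the question; the names res_unique, res_dup, idx and res_idx are made up for illustration. Both variants should select the same rows as unique(dt, by = c("y", "x")).

```r
library(data.table)

# Example data from the question
dt <- data.table(y = rep(letters[1:2], each = 3),
                 x = c(1, 2, 2, 3, 2, 1),
                 z = 1:6)

# Reference: the approach shown above
res_unique <- unique(dt, by = c("y", "x"))

# Variant 1: logical filter; duplicated() for data.tables accepts `by`
# and flags every repeat of a (y, x) combination after its first row
res_dup <- dt[!duplicated(dt, by = c("y", "x"))]

# Variant 2: collect row numbers (.I) of the first occurrence of each x
# within each y group, then subset the original table with them
idx <- dt[, .I[!duplicated(x)], by = y]$V1
res_idx <- dt[idx]
```

The row-index variant is handy when the selected row numbers are needed for something else as well, for example to flag the rows in place instead of subsetting.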

Arun answered Sep 19 '22 22:09