How to keep rows with the same values in two variables in R?

Question

I have a dataset with several variables, and I want to keep only the rows whose values are consistent across two columns. Here is an example of what I want to do:

a <- c(rep('A',3), rep('B', 3), rep('C',3))
b <- c(1,1,2,4,4,4,5,5,5)
df <- data.frame(a,b)

  a b
1 A 1
2 A 1
3 A 2
4 B 4
5 B 4
6 B 4
7 C 5
8 C 5
9 C 5

I know that if I use the duplicated function I can get:

df[!duplicated(df),]

  a b
1 A 1
3 A 2
4 B 4
7 C 5

But since the level 'A' in column a has more than one distinct value in b, I want to drop both of its observations to get a new data.frame like this:

  a b
4 B 4
7 C 5

I don't mind having repeated values across b, as long as every level of a has a single value in b.

Is there a way to do this? Thanks!

989 · Accepted Answer

This one maybe?

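# Collect the distinct b values for each level of a
ag <- aggregate(b~a, df, unique)
# Keep only the levels of a that have exactly one distinct b
ag[lengths(ag$b)==1,]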

#  a b
#2 B 4
#3 C 5
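
If the original row names matter (the desired output in the question shows rows 4 and 7), a base R variant with ave() might also work. This is only a sketch, not part of the answer above: the names n_distinct_b, consistent and result are made up for illustration, and it assumes the same df as in the question.

```r
# Same example data as in the question
a <- c(rep('A', 3), rep('B', 3), rep('C', 3))
b <- c(1, 1, 2, 4, 4, 4, 5, 5, 5)
df <- data.frame(a, b)

# For every row, count the distinct b values within its level of a
# (n_distinct_b is a hypothetical helper name)
n_distinct_b <- ave(df$b, df$a, FUN = function(x) length(unique(x)))

# Keep rows whose level of a has exactly one distinct b,
# then drop the repeated rows as in the question
consistent <- df[n_distinct_b == 1, ]
result <- consistent[!duplicated(consistent), ]
print(result)
```

Because this filters df directly instead of building an aggregated table, the surviving rows keep their original row names.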

Tags:

r

Ulises
