
How to keep rows with the same values in two variables in R?

Tags:

r

I have a dataset with several variables, and I want to keep only the rows where every occurrence of a value in one column is paired with the same value in a second column. Here is an example of what I want to do:

a <- c(rep('A',3), rep('B', 3), rep('C',3))
b <- c(1,1,2,4,4,4,5,5,5)
df <- data.frame(a,b)

  a b
1 A 1
2 A 1
3 A 2
4 B 4
5 B 4
6 B 4
7 C 5
8 C 5
9 C 5

I know that if I use the duplicated function I can get:

df[!duplicated(df),]

  a b
1 A 1
3 A 2
4 B 4
7 C 5

But since level 'A' in column a does not have a unique value in b, I want to drop both of its observations and get a new data.frame like this:

  a b
4 B 4
7 C 5

I don't mind having repeated values across b, as long as every row with the same level of a has the same value in b.

Is there a way to do this? Thanks!

Ulises asked Jan 30 '26 01:01


1 Answer

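This one maybe? Collect the unique values of b for each level of a, then keep only the levels that have exactly one: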

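# unique values of b within each level of a (a list column when the counts differ)
ag <- aggregate(b~a, df, unique)
# keep the levels of a that map to a single value of b
ag[lengths(ag$b)==1,]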

#  a b
#2 B 4
#3 C 5
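
If you would rather keep the original rows (and their row names 4 and 7, as in the question) instead of an aggregated table, something along these lines might also work. It is only a base-R sketch: ave() counts the distinct values of b within each level of a, and the names n_distinct_b, consistent and result are made up for illustration.

```r
# Example data from the question
a <- c(rep('A', 3), rep('B', 3), rep('C', 3))
b <- c(1, 1, 2, 4, 4, 4, 5, 5, 5)
df <- data.frame(a, b)

# For every row, the number of distinct b values within its level of a
# (n_distinct_b is an illustrative name, not from the question)
n_distinct_b <- ave(df$b, df$a, FUN = function(x) length(unique(x)))

# Keep rows whose level of a has exactly one b value, then drop the repeats
consistent <- df[n_distinct_b == 1, ]
result <- consistent[!duplicated(consistent), ]
result
```

Because the count is computed per row, this does not depend on how aggregate() simplifies its result, and skipping the duplicated() step leaves all six B and C rows in place.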

989 answered Jan 31 '26 22:01


