
R: show ALL rows with duplicated elements in a column [duplicate]

Tags: r, dplyr

Does a function like this exist in any package?

isdup <- function (x) duplicated (x) | duplicated (x, fromLast = TRUE)

My intention is to use it with dplyr to display all rows with duplicated values in a given column. I need the first occurrence of the duplicated element to be shown as well.
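To illustrate why the helper combines two calls, here is a small sketch on a made-up vector (`x` is purely illustrative and not part of my real data). As far as I understand, `duplicated(x)` flags every occurrence after the first, `duplicated(x, fromLast = TRUE)` flags every occurrence before the last, and OR-ing the two flags every element whose value appears more than once:

```r
# Illustrative vector, not the data from the question
x <- c("A", "A", "B", "C", "A")

# TRUE for every occurrence after the first one
fwd <- duplicated(x)

# TRUE for every occurrence before the last one
bwd <- duplicated(x, fromLast = TRUE)

# Either direction: TRUE for every element whose value is repeated
isdup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)
all_dups <- isdup(x)
```

Neither call on its own marks all three "A" values, which is why both directions are needed.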

In this data.frame, for instance:

dat <- as.data.frame (list (l = c ("A", "A", "B", "C"), n = 1:4))
dat

> dat
  l n
1 A 1
2 A 2
3 B 3
4 C 4

I would like to display the rows where column l is duplicated, i.e. those with an A value, by doing:

library (dplyr)
dat %>% filter (isdup (l))

returns

  l n
1 A 1
2 A 2
asked May 20 '16 at 17:05 by dmontaner



1 Answer

dat %>% group_by(l) %>% filter(n() > 1)

I don't know if it exists in any package, but since you can implement it easily, I'd say just go ahead and implement it yourself.
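As a rough check, the sketch below runs the grouped filter and the question's `isdup` helper on the example data and keeps both results for comparison. It assumes dplyr is installed. The `ungroup()` call is my addition so the result is not left grouped, and the base R line is just an alternative that needs no package:

```r
library(dplyr)

# Example data from the question
dat <- data.frame(l = c("A", "A", "B", "C"), n = 1:4)

# Helper from the question
isdup <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

# Grouped approach: keep only groups of l with more than one row
by_group <- dat %>% group_by(l) %>% filter(n() > 1) %>% ungroup()

# Helper approach: same rows, no grouping involved
by_helper <- dat %>% filter(isdup(l))

# Base R equivalent using the same helper
by_base <- dat[isdup(dat$l), ]
```

On this data all three should select the two "A" rows. The grouped version returns a tibble, while the other two return plain data frames.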

answered Sep 20 '22 at 13:09 by Nick Larsen