
Multiple duplicates (2 times, 3 times,...) in R

After searching for a while, I could not find an existing answer to this question. Assume that I have the following vector:

v <- c("a", "b", "b", "c","c","c", "d", "d", "d", "d")

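How do I find the values that have more than 1 duplicate, i.e. that occur more than 2 times

(should be "c","c","c", "d", "d", "d", "d")

and those that have more than 2 duplicates, i.e. that occur more than 3 times

(should be "d", "d", "d", "d")?

The function duplicated(v) only flags elements that repeat an earlier element; it does not tell me how many times each value occurs.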

Duy Bui asked Apr 30 '15 16:04



1 Answer

You can generate a table() of counts and then check which elements of v belong to the relevant subset of that table, e.g.

R> v <- c("a", "b", "b", "c","c","c", "d", "d", "d", "d")
R> tab <- table(v)
R> tab
v
a b c d 
1 2 3 4 
R> v[v %in% names(tab[tab > 2])]
[1] "c" "c" "c" "d" "d" "d" "d"
R> v[v %in% names(tab[tab > 3])]
[1] "d" "d" "d" "d"
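
If the threshold changes often, the same idea can be wrapped in a small function. The following is only a sketch: keep_more_than() is a hypothetical helper name that does not appear in the answer above, and the ave() variant is an alternative technique that may or may not suit your data.

```r
v <- c("a", "b", "b", "c", "c", "c", "d", "d", "d", "d")

# Hypothetical helper (not part of the original answer): keep the elements
# of x whose value occurs more than n times in x.
keep_more_than <- function(x, n) {
  counts <- table(x)                     # occurrences of each distinct value
  x[x %in% names(counts[counts > n])]    # keep values whose count exceeds n
}

more_than_2 <- keep_more_than(v, 2)      # values occurring more than 2 times
more_than_3 <- keep_more_than(v, 3)      # values occurring more than 3 times

# Alternative sketch: give every element the size of its own group with ave(),
# then subset on that per-element count.
n_per_element <- ave(seq_along(v), v, FUN = length)
same_as_more_than_2 <- v[n_per_element > 2]
```

Both variants keep the original order of v. Note that table() ignores NA by default, so NA values would never be selected by keep_more_than().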
Achim Zeileis answered Sep 21 '22 17:09