I'm trying to identify all rows that are represented twice or more in a matrix, including the first occurrence of each.
For example:
m <- matrix(c(1,2,1,3,1,4,1,2,2,3,2,3,1,2,5), ncol = 3)
m
duplicated(m[,1])
Outputs:
     [,1] [,2] [,3]
[1,]    1    4    2
[2,]    2    1    3
[3,]    1    2    1
[4,]    3    2    2
[5,]    1    3    5
[1] FALSE FALSE  TRUE FALSE  TRUE
However, I do not want that output. I want:
[1] TRUE FALSE TRUE FALSE TRUE
since the value in m[1,1] appears 3 times in column 1 of m.
When I saw this question I asked myself, "What would Jim Holtman or Bill Dunlap advise on R-help?" I haven't looked in the archives, but I think they might have advised using two "parallel" applications of duplicated, one with the defaults and one with the fromLast parameter, and conjoining the results with a vector OR (|) operator.
duplicated(m[,1]) | duplicated(m[,1], fromLast=TRUE)
[1] TRUE FALSE TRUE FALSE TRUE
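Since the question talks about rows, it may be worth noting that duplicated also has a matrix method that compares whole rows by default, so the same two-pass idea should carry over. Below is a minimal sketch; the helper name all_duplicated and the small matrix m2 are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (name is an assumption): flags every element of a
# vector, or every row of a matrix, that occurs more than once,
# including the first occurrence.
all_duplicated <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

# Column 1 of the matrix from the question
m <- matrix(c(1,2,1,3,1,4,1,2,2,3,2,3,1,2,5), ncol = 3)
col1_flags <- all_duplicated(m[, 1])

# Illustrative matrix in which rows 1 and 3 are identical
m2 <- rbind(c(1, 2), c(3, 4), c(1, 2))
row_flags <- all_duplicated(m2)
```

Here col1_flags holds the flags for the values in column 1, while row_flags marks rows of m2 that are repeated in full.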
Here's one approach of many:
m <- matrix(c(1,2,1,3,1,4,1,2,2,3,2,3,1,2,5), ncol = 3)
x <- table(m[,1])
as.character(m[,1]) %in% names(x)[x > 1]
## > as.character(m[,1]) %in% names(x)[x > 1]
## [1] TRUE FALSE TRUE FALSE TRUE
# or wrap it up as a function:
FUN <- function(vec) {
  # count how often each value occurs
  x <- table(vec)
  # flag every element whose value occurs more than once
  as.character(vec) %in% names(x)[x > 1]
}
FUN(m[, 1])
## > FUN(m[, 1])
## [1] TRUE FALSE TRUE FALSE TRUE
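If the round trip through character strings feels indirect, a variant that compares the values themselves might look like the sketch below. The variable names vec, dup_values and flags are my own, and this is just one more option rather than a better one.

```r
m <- matrix(c(1,2,1,3,1,4,1,2,2,3,2,3,1,2,5), ncol = 3)
vec <- m[, 1]

# Values that occur at least twice (duplicated() marks the 2nd, 3rd, ... copies)
dup_values <- unique(vec[duplicated(vec)])

# Flag every element whose value is among those, first occurrence included
flags <- vec %in% dup_values
```

This avoids as.character, so matching is done on the original values rather than on their printed names.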