Given a vector, e.g.:
a = c(1, 2, 2, 4, 5, 3, 5, 3, 2, 1, 5, 3)
Using a[a %in% a[duplicated(a)]]
I can remove the values that are not duplicated. However, this only removes values that are present exactly once.
How would I go about removing all values that appear fewer than three times (or some other threshold, in other situations)?
The expected result would be:
2 2 5 3 5 3 2 5 3
with 1 and 4 removed, as they are present only twice and once, respectively.
You can do this in one line with the ave function:
a[ave(a, a, FUN=length) >= 3]
# [1] 2 2 5 3 5 3 2 5 3
The call to ave(a, a, FUN=length) returns, for each element a[i] in vector a, the total number of times a[i] appears within a. Then you can subset a, limiting to the indices where that count is 3 or more.
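To make the intermediate step visible, here is a small sketch that splits the one-liner into its parts. The names counts and keep are only illustrative and are not part of the original answer.

```r
a <- c(1, 2, 2, 4, 5, 3, 5, 3, 2, 1, 5, 3)

# For each position i, how many times does a[i] occur in a?
counts <- ave(a, a, FUN = length)
counts
# [1] 2 3 3 1 3 3 3 3 3 2 3 3

# Logical mask: TRUE where the value occurs at least 3 times
keep <- counts >= 3

# Subsetting with the mask preserves the original order
a[keep]
# [1] 2 2 5 3 5 3 2 5 3
```

Changing the 3 in counts >= 3 adjusts the threshold for other situations.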
Reasonably straightforward (longer than using ave but possibly more comprehensible):
x <- c(1,2,2,4,5,3,5,3,2,1,5,3)
tt <- table(x) ## tabulate
## find relevant values
ttr <- as.numeric(names(tt)[tt>=3])
x[x %in% ttr] ## subset
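Since the question mentions other thresholds, the table approach could be wrapped in a small function. This is only a sketch: keep_frequent and its parameter n are hypothetical names, not an existing R function. It compares as.character(x) with the table names instead of converting the names back with as.numeric, which should also let it work on character vectors.

```r
# Hypothetical helper: keep only the values of x that occur at least n times
keep_frequent <- function(x, n = 3) {
  counts <- table(x)                       # tabulate occurrences of each value
  frequent <- names(counts)[counts >= n]   # values meeting the threshold, as strings
  x[as.character(x) %in% frequent]         # subset, preserving the original order
}

x <- c(1, 2, 2, 4, 5, 3, 5, 3, 2, 1, 5, 3)
keep_frequent(x, 3)
# [1] 2 2 5 3 5 3 2 5 3
keep_frequent(x, 2)
# [1] 1 2 2 5 3 5 3 2 1 5 3
```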