I have an input vector v0 with boolean values. I want to take a random sample of size n from the positions where the value is true, so that the final vector vf has these properties:
The lengths of the vectors are equal
length(vf) == length(v0)
vf has n true values
n==sum(vf)
vf cannot have more true values than v0
n <= sum(v0)
All the true values in vf are also true in v0
The vectors represent a selection of rows in a data frame, and this implements a stratified sample. So far I have figured out how to use which() to get the row numbers and sample() to draw a random sample from them, but the last part, recreating the boolean vector, feels clumsy. Is there a more elegant way?
For example:
n <- 1
v0 <- c(TRUE, TRUE, FALSE, FALSE)
vf <- c(TRUE, FALSE, FALSE, FALSE)
Here's one solution:
# Make up some vector v0 and choose n
v0 <- rep(c(FALSE, TRUE, FALSE), 5)
n <- 3
# The actual code
x <- which(v0)
vf <- logical(length(v0))
vf[x[sample.int(length(x), n)]] <- TRUE
# Finally validate the result
identical(length(vf), length(v0)) # TRUE
all(v0[vf]) # TRUE
sum(vf) == n # TRUE
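If this is needed more than once (for example, once per stratum), the same steps could be wrapped in a small function. The sketch below is only an illustration: the name sample_true is made up here and is not part of the original answer, and the stopifnot() check is an added assumption that enforces the n <= sum(v0) condition from the question. It keeps sample.int() on the index positions rather than calling sample(x, n) directly, because sample() treats a single number x as the range 1:x, which would give wrong results when v0 has exactly one TRUE.

```r
# Hypothetical helper (name chosen for illustration): keep exactly n of the
# TRUE positions in v0, chosen uniformly at random, and set the rest to FALSE.
sample_true <- function(v0, n) {
  x <- which(v0)                      # indices where v0 is TRUE
  stopifnot(n >= 0, n <= length(x))   # cannot keep more TRUEs than v0 has
  vf <- logical(length(v0))           # all FALSE, same length as v0
  # Sample positions within x, then map back to positions within v0.
  vf[x[sample.int(length(x), n)]] <- TRUE
  vf
}

set.seed(1)  # only to make the example reproducible
v0 <- rep(c(FALSE, TRUE, FALSE), 5)
vf <- sample_true(v0, 3)

# Edge case: a single TRUE position must map back to itself
single <- sample_true(c(FALSE, FALSE, FALSE, TRUE), 1)
```

Here vf has the same length as v0 and exactly three TRUE values, all at positions where v0 is TRUE, and single has its only TRUE in the fourth position.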