
R repeat function until condition met

I am trying to generate a random sample that excludes certain "bad data." I do not know whether the data is "bad" until after I sample it. Thus, I need to make a random draw from the population and then test it. If the data is "good" then keep it. If the data is "bad" then randomly draw another and test it. I would like to do this until my sample size reaches 25. Below is a simplified example of my attempt to write a function that does this. Can anyone please tell me what I am missing?

df <- data.frame(NAME=c(rep('Frank',10),rep('Mary',10)), SCORE=rnorm(20))
df

random.sample <- function(x) {
  x <- df[sample(nrow(df), 1), ]
  if (x$SCORE > 0) return(x)
 #if (x$SCORE <= 0) run the function again
}

random.sample(df)
user1491868 asked Dec 10 '13 23:12



4 Answers

Here is a general use of a while loop:

random.sample <- function(df) {
  success <- FALSE
  while (!success) {
    # draw one row at random
    i <- sample(nrow(df), 1)
    x <- df[i, ]
    # check for success
    success <- x$SCORE > 0
  }
  return(x)
}

An alternative is to use repeat (syntactic sugar for while(TRUE)) and break:

random.sample <- function(df) {
  repeat {
    # draw one row at random
    i <- sample(nrow(df), 1)
    x <- df[i, ]
    # exit if the condition is met
    if (x$SCORE > 0) break
  }
  return(x)
}

Here break exits the repeat block. Alternatively, you could write if (x$SCORE > 0) return(x) to exit the function directly.
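The loops above return a single good row, while the question asks for 25. A minimal sketch of one way to extend the idea follows. The names sample.good, is_good and max_tries are hypothetical and not part of the original answer, and the sketch assumes that drawing the same row more than once is acceptable (25 exceeds the 20 rows in the example data).

```r
# Hypothetical helper: draw rows one at a time until n "good" rows are kept.
# is_good and max_tries are illustrative names, not from the original answer.
sample.good <- function(df, n, is_good = function(row) row$SCORE > 0,
                        max_tries = 10000) {
  kept <- vector("list", n)
  count <- 0
  tries <- 0
  while (count < n) {
    tries <- tries + 1
    # guard against looping forever when no row can pass the test
    if (tries > max_tries) stop("gave up: too few good rows were drawn")
    row <- df[sample(nrow(df), 1), ]
    if (is_good(row)) {
      count <- count + 1
      kept[[count]] <- row
    }
  }
  # bind the kept one-row data frames into a single data frame
  do.call(rbind, kept)
}

set.seed(1)
df <- data.frame(NAME = c(rep("Frank", 10), rep("Mary", 10)), SCORE = rnorm(20))
good <- sample.good(df, 25)
```

The max_tries guard is a design choice rather than a requirement: without it, a population containing no good rows would keep the loop running indefinitely.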

flodel answered Oct 14 '22 09:10


Use this after drawing your first sample (x holds the rows drawn so far):

while (any(bad <- (x$SCORE <= 0)))
   x[bad, ] <- df[sample(nrow(df), sum(bad)), ]
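For context, here is a self-contained sketch of the whole procedure, including the first draw that the snippet above takes for granted. It assumes sampling with replacement (replace = TRUE), which the original snippet does not use, because the requested 25 rows exceed the 20 rows available; n is an illustrative name.

```r
set.seed(42)
df <- data.frame(NAME = c(rep("Frank", 10), rep("Mary", 10)), SCORE = rnorm(20))

n <- 25
# First draw: with replacement, since n exceeds the 20 rows available.
x <- df[sample(nrow(df), n, replace = TRUE), ]

# Redraw only the rows that failed the test, until none are left.
while (any(bad <- (x$SCORE <= 0))) {
  x[bad, ] <- df[sample(nrow(df), sum(bad), replace = TRUE), ]
}
```

Note that the row names of x still come from the first draw, so after a redraw they no longer identify the source rows in df.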
Ricardo Saporta answered Oct 14 '22 08:10


You can select the rows to sample from directly, like so (here just 5 rows):

> df <- data.frame(NAME=c(rep('Frank',10),rep('Mary',10)), SCORE=rnorm(20))
> df[sample(which(df$SCORE>0), 5),]
    NAME     SCORE
14  Mary 1.0858854
10 Frank 0.7037989
16  Mary 0.7688913
5  Frank 0.2067499
17  Mary 0.4391216

This samples without replacement; for a bootstrap, add replace=TRUE.
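One caveat with passing which(...) straight to sample(): when the vector has length one, say the single number k, sample() draws from 1:k instead of returning k. The sketch below is one way around this, indexing into the vector of good row numbers; good, picks and boot are illustrative names, and drawing 25 rows with replacement is an assumption taken from the question, not from this answer.

```r
set.seed(7)
df <- data.frame(NAME = c(rep("Frank", 10), rep("Mary", 10)), SCORE = rnorm(20))

good <- which(df$SCORE > 0)
stopifnot(length(good) > 0)  # nothing to sample from otherwise

# Index into good rather than passing it to sample() directly:
# sample(k, ...) with a single number k draws from 1:k, not from {k}.
picks <- good[sample.int(length(good), 25, replace = TRUE)]
boot <- df[picks, ]
```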

Stephen Henderson answered Oct 14 '22 10:10


random.sample <- function(x) {
  x <- df[sample(nrow(df), 1), ]
  if (x$SCORE > 0) return(x)
  Recall(x)  # run the function again
}

random.sample(df)
#   NAME    SCORE
#14 Mary 1.252566

It seems to me that this should work as well (it returns only the score, not the whole row):

df$SCORE[ df$SCORE > 0 ][ sample(1:sum(df$SCORE > 0), 1) ]
#[1] 0.6579631
IRTFM answered Oct 14 '22 09:10