This question led to a new R package: wrswoR
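(For reference, a minimal usage sketch; it assumes the package exports sample_int_rej(n, size, prob) for rejection-based weighted sampling without replacement — check the package documentation for the current interface.)

# Hedged sketch: assumes wrswoR provides sample_int_rej(n, size, prob),
# drawing `size` distinct indices from 1:n with weights `prob`.
# install.packages("wrswoR")
library(wrswoR)
set.seed(42)
p <- runif(10)
sample_int_rej(10, 5, p)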
R's default sampling without replacement using sample.int seems to require quadratic run time, e.g. when using weights drawn from a uniform distribution. This is slow for large sample sizes. Does anybody know a faster implementation that would be usable from within R? Two options are "rejection sampling with replacement" (see this question on stats.sx) and the algorithm by Wong and Easton (1980) (with a Python implementation in a StackOverflow answer).

Thanks to Ben Bolker for hinting at the C function that is called internally when sample.int is called with replace=F and non-uniform weights: ProbSampleNoReplace. Indeed, the code shows two nested for loops (line 420 ff. of random.c).
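For intuition, here is a rough pure-R sketch of that nested-loop structure (an illustration only, not the actual C code): every draw rescans the remaining probability mass, so drawing k out of n items costs on the order of n * k operations.

# Pure-R illustration of the ProbSampleNoReplace-style nested loops (sketch)
prob_sample_no_replace <- function(n, k, prob) {
  p <- prob / sum(prob)
  idx <- seq_len(n)
  out <- integer(k)
  for (i in seq_len(k)) {              # outer loop: one pass per draw
    u <- runif(1) * sum(p)
    acc <- p[1]
    j <- 1
    while (acc < u && j < length(p)) { # inner loop: linear scan over remaining mass
      j <- j + 1
      acc <- acc + p[j]
    }
    out[i] <- idx[j]
    idx <- idx[-j]                     # drop the selected unit and its weight
    p <- p[-j]
  }
  out
}

prob_sample_no_replace(10, 5, runif(10))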
Here's the code to analyze the run time empirically:
library(plyr)

sample.int.test <- function(n, p) {
  sample.int(2 * n, n, replace=F, prob=p); NULL
}

times <- ldply(
  1:7,
  function(i) {
    n <- 1024 * (2 ** i)
    p <- runif(2 * n)
    data.frame(
      n=n,
      user=system.time(sample.int.test(n, p), gcFirst=T)['user.self'])
  },
  .progress='text'
)

times

library(ggplot2)
ggplot(times, aes(x=n, y=user/n)) + geom_point() + scale_x_log10() +
  ylab('Time per unit (s)')

# Output:
       n   user
1   2048  0.008
2   4096  0.028
3   8192  0.100
4  16384  0.408
5  32768  1.645
6  65536  6.604
7 131072 26.558
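To make the "quadratic" claim a bit more concrete, one can fit a power law to these timings; the log-log slope comes out close to 2 (a quick sketch reusing the times data frame from above):

# Sketch: estimate the scaling exponent; a log(n) coefficient near 2
# indicates roughly quadratic growth of the run time
fit <- lm(log(user) ~ log(n), data = times)
coef(fit)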
EDIT: Thanks to Arun for pointing out that unweighted sampling doesn't seem to have this performance penalty.
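A quick sketch to verify this on your own machine: the same call is near-instant without prob, and slow only on the weighted code path.

# Sketch: unweighted sampling without replacement is fast,
# the weighted call is the slow one (tens of seconds per the table above)
n <- 131072
system.time(sample.int(2 * n, n, replace = FALSE))                       # fast
system.time(sample.int(2 * n, n, replace = FALSE, prob = runif(2 * n)))  # slow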
Update:

An Rcpp implementation of the Efraimidis & Spirakis algorithm (thanks to @Hemmo, @Dinrem, @krlmlr and @rtlgrmpf):
library(inline)
library(Rcpp)

src <- '
int num = as<int>(size), x = as<int>(n);
Rcpp::NumericVector vx = Rcpp::clone<Rcpp::NumericVector>(x);
Rcpp::NumericVector pr = Rcpp::clone<Rcpp::NumericVector>(prob);
Rcpp::NumericVector rnd = rexp(x) / pr;
for(int i = 0; i < vx.size(); ++i) vx[i] = i;
std::partial_sort(vx.begin(), vx.begin() + num, vx.end(), Comp(rnd));
vx = vx[seq(0, num - 1)] + 1;
return vx;
'

incl <- '
struct Comp{
  Comp(const Rcpp::NumericVector& v ) : _v(v) {}
  bool operator ()(int a, int b) { return _v[a] < _v[b]; }
  const Rcpp::NumericVector& _v;
};
'

funFast <- cxxfunction(signature(n = "Numeric", size = "integer", prob = "numeric"),
                       src, plugin = "Rcpp", include = incl)

# See the bottom of the answer for comparison
p <- c(995/1000, rep(1/1000, 5))
n <- 100000
system.time(print(table(replicate(funFast(6, 3, p), n = n)) / n))

      1       2       3       4       5       6
1.00000 0.39996 0.39969 0.39973 0.40180 0.39882
   user  system elapsed
   3.93    0.00    3.96

# In case of:
# Rcpp::IntegerVector vx = Rcpp::clone<Rcpp::IntegerVector>(x);
# i.e. instead of NumericVector

      1       2       3       4       5       6
1.00000 0.40150 0.39888 0.39925 0.40057 0.39980
   user  system elapsed
   1.93    0.00    2.03
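The benchmark code further below also calls weighted_Random_Sample (the "Reservoir" entry), which is defined in another answer to this question and not reproduced here. For reference, a minimal pure-R sketch of the same Efraimidis & Spirakis idea, using exponential keys; this is an illustration, not the actual weighted_Random_Sample code:

# Minimal pure-R sketch of Efraimidis & Spirakis weighted sampling
# without replacement: one exponential key per item, keep the smallest keys
es_sample <- function(x, prob, size) {
  key <- rexp(length(x)) / prob        # smaller key = higher priority
  x[order(key)[seq_len(size)]]
}

set.seed(1)
es_sample(1:10, runif(10), 5)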
Old version:
Let us try a few possible approaches:
Simple rejection sampling with replacement. This is a much simpler function than the sample.int.rej offered by @krlmlr, i.e. the sample size of each proposal is always equal to n. As we will see, it is still really fast assuming a uniform distribution for the weights, but extremely slow in the other situation.
fastSampleReject <- function(all, n, w){
  out <- numeric(0)
  while(length(out) < n)
    out <- unique(c(out, sample(all, n, replace = TRUE, prob = w)))
  out[1:n]
}
The algorithm by Wong and Easton (1980). Here is an R implementation of this Python version. It is stable, and I might be missing something, but it is much slower than the other functions.
fastSample1980 <- function(all, n, w){
  tws <- w
  for(i in (length(tws) - 1):0)
    tws[1 + i] <- sum(tws[1 + i], tws[1 + 2 * i + 1], tws[1 + 2 * i + 2],
                      na.rm = TRUE)
  out <- numeric(n)
  for(i in 1:n){
    gas <- tws[1] * runif(1)
    k <- 0
    while(gas > w[1 + k]){
      gas <- gas - w[1 + k]
      k <- 2 * k + 1
      if(gas > tws[1 + k]){
        gas <- gas - tws[1 + k]
        k <- k + 1
      }
    }
    wgh <- w[1 + k]
    out[i] <- all[1 + k]
    w[1 + k] <- 0
    while(1 + k >= 1){
      tws[1 + k] <- tws[1 + k] - wgh
      k <- floor((k - 1) / 2)
    }
  }
  out
}
An Rcpp implementation of the algorithm by Wong and Easton. It can possibly be optimized even more, since this is my first usable Rcpp function, but it already works well.
library(inline)
library(Rcpp)

src <- '
Rcpp::NumericVector weights = Rcpp::clone<Rcpp::NumericVector>(w);
Rcpp::NumericVector tws = Rcpp::clone<Rcpp::NumericVector>(w);
Rcpp::NumericVector x = Rcpp::NumericVector(all);
int k, num = as<int>(n);
Rcpp::NumericVector out(num);
double gas, wgh;

if((weights.size() - 1) % 2 == 0){
  tws[((weights.size()-1)/2)] += tws[weights.size()-1] + tws[weights.size()-2];
}
else
{
  tws[floor((weights.size() - 1)/2)] += tws[weights.size() - 1];
}

for (int i = (floor((weights.size() - 1)/2) - 1); i >= 0; i--){
  tws[i] += (tws[2 * i + 1]) + (tws[2 * i + 2]);
}

for(int i = 0; i < num; i++){
  gas = as<double>(runif(1)) * tws[0];
  k = 0;
  while(gas > weights[k]){
    gas -= weights[k];
    k = 2 * k + 1;
    if(gas > tws[k]){
      gas -= tws[k];
      k += 1;
    }
  }
  wgh = weights[k];
  out[i] = x[k];
  weights[k] = 0;
  while(k > 0){
    tws[k] -= wgh;
    k = floor((k - 1) / 2);
  }
  tws[0] -= wgh;
}
return out;
'

fun <- cxxfunction(signature(all = "numeric", n = "integer", w = "numeric"),
                   src, plugin = "Rcpp")
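A quick sanity check of the compiled function (a sketch, assuming the compilation above succeeded):

# Sketch: draw 5 distinct values from 1:10 with random weights
set.seed(1)
fun(1:10, 5L, runif(10))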
Now some results:
times1 <- ldply(
  1:6,
  function(i) {
    n <- 1024 * (2 ** i)
    p <- runif(2 * n) # Uniform distribution
    p <- p/sum(p)
    data.frame(
      n=n,
      user=c(system.time(sample.int.test(n, p), gcFirst=T)['user.self'],
             system.time(weighted_Random_Sample(1:(2*n), p, n), gcFirst=T)['user.self'],
             system.time(fun(1:(2*n), n, p), gcFirst=T)['user.self'],
             system.time(sample.int.rej(2*n, n, p), gcFirst=T)['user.self'],
             system.time(fastSampleReject(1:(2*n), n, p), gcFirst=T)['user.self'],
             system.time(fastSample1980(1:(2*n), n, p), gcFirst=T)['user.self']),
      id=c("Base", "Reservoir", "Rcpp", "Rejection", "Rejection simple", "1980"))
  },
  .progress='text'
)

times2 <- ldply(
  1:6,
  function(i) {
    n <- 1024 * (2 ** i)
    p <- runif(2 * n - 1)
    p <- p/sum(p)
    p <- c(0.999, 0.001 * p) # Special case
    data.frame(
      n=n,
      user=c(system.time(sample.int.test(n, p), gcFirst=T)['user.self'],
             system.time(weighted_Random_Sample(1:(2*n), p, n), gcFirst=T)['user.self'],
             system.time(fun(1:(2*n), n, p), gcFirst=T)['user.self'],
             system.time(sample.int.rej(2*n, n, p), gcFirst=T)['user.self'],
             system.time(fastSampleReject(1:(2*n), n, p), gcFirst=T)['user.self'],
             system.time(fastSample1980(1:(2*n), n, p), gcFirst=T)['user.self']),
      id=c("Base", "Reservoir", "Rcpp", "Rejection", "Rejection simple", "1980"))
  },
  .progress='text'
)
arrange(times1, id)
       n  user               id
1   2048  0.53             1980
2   4096  0.94             1980
3   8192  2.00             1980
4  16384  4.32             1980
5  32768  9.10             1980
6  65536 21.32             1980
7   2048  0.02             Base
8   4096  0.05             Base
9   8192  0.18             Base
10 16384  0.75             Base
11 32768  2.99             Base
12 65536 12.23             Base
13  2048  0.00             Rcpp
14  4096  0.01             Rcpp
15  8192  0.03             Rcpp
16 16384  0.07             Rcpp
17 32768  0.14             Rcpp
18 65536  0.31             Rcpp
19  2048  0.00        Rejection
20  4096  0.00        Rejection
21  8192  0.00        Rejection
22 16384  0.02        Rejection
23 32768  0.02        Rejection
24 65536  0.03        Rejection
25  2048  0.00 Rejection simple
26  4096  0.01 Rejection simple
27  8192  0.00 Rejection simple
28 16384  0.01 Rejection simple
29 32768  0.00 Rejection simple
30 65536  0.05 Rejection simple
31  2048  0.00        Reservoir
32  4096  0.00        Reservoir
33  8192  0.00        Reservoir
34 16384  0.02        Reservoir
35 32768  0.03        Reservoir
36 65536  0.05        Reservoir

arrange(times2, id)
       n  user               id
1   2048  0.43             1980
2   4096  0.93             1980
3   8192  2.00             1980
4  16384  4.36             1980
5  32768  9.08             1980
6  65536 19.34             1980
7   2048  0.01             Base
8   4096  0.04             Base
9   8192  0.18             Base
10 16384  0.75             Base
11 32768  3.11             Base
12 65536 12.04             Base
13  2048  0.01             Rcpp
14  4096  0.02             Rcpp
15  8192  0.03             Rcpp
16 16384  0.08             Rcpp
17 32768  0.15             Rcpp
18 65536  0.33             Rcpp
19  2048  0.00        Rejection
20  4096  0.00        Rejection
21  8192  0.02        Rejection
22 16384  0.02        Rejection
23 32768  0.05        Rejection
24 65536  0.08        Rejection
25  2048  1.43 Rejection simple
26  4096  2.87 Rejection simple
27  8192  6.17 Rejection simple
28 16384 13.68 Rejection simple
29 32768 29.74 Rejection simple
30 65536 73.32 Rejection simple
31  2048  0.00        Reservoir
32  4096  0.00        Reservoir
33  8192  0.02        Reservoir
34 16384  0.02        Reservoir
35 32768  0.02        Reservoir
36 65536  0.04        Reservoir
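To visualize these tables, a sketch that reuses the ggplot2 approach from the question (the case column is added here only for faceting):

# Sketch: plot both benchmark runs side by side
library(ggplot2)
times1$case <- "Uniform weights"
times2$case <- "One dominant weight"
ggplot(rbind(times1, times2), aes(x = n, y = user, colour = id)) +
  geom_line() + geom_point() +
  scale_x_log10() +
  facet_wrap(~ case) +
  ylab("User time (s)")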
Obviously we can reject function 1980 because it is slower than Base in both cases. Rejection simple gets into trouble too when there is a single probability of 0.999 in the second case.
So there remain Rejection, Rcpp and Reservoir. The last step is checking whether the values themselves are correct. To be sure about them, we will use sample as a benchmark (and also to eliminate confusion about the probabilities, which do not have to coincide with p because of sampling without replacement).
p <- c(995/1000, rep(1/1000, 5))
n <- 100000

system.time(print(table(replicate(sample(1:6, 3, repl = FALSE, prob = p), n = n))/n))

      1       2       3       4       5       6
1.00000 0.39992 0.39886 0.40088 0.39711 0.40323   # Benchmark
   user  system elapsed
   1.90    0.00    2.03

system.time(print(table(replicate(sample.int.rej(2*3, 3, p), n = n))/n))

      1       2       3       4       5       6
1.00000 0.40007 0.40099 0.39962 0.40153 0.39779
   user  system elapsed
  76.02    0.03   77.49                           # Slow

system.time(print(table(replicate(weighted_Random_Sample(1:6, p, 3), n = n))/n))

      1       2       3       4       5       6
1.00000 0.49535 0.41484 0.36432 0.36338 0.36211   # Incorrect
   user  system elapsed
   3.64    0.01    3.67

system.time(print(table(replicate(fun(1:6, 3, p), n = n))/n))

      1       2       3       4       5       6
1.00000 0.39876 0.40031 0.40219 0.40039 0.39835
   user  system elapsed
   4.41    0.02    4.47
Notice a few things here. For some reason weighted_Random_Sample returns incorrect values (I have not looked into it at all, but it works correctly assuming a uniform distribution). sample.int.rej is very slow for repeated sampling.
In conclusion, it seems that Rcpp is the optimal choice for repeated sampling, while sample.int.rej is a bit faster otherwise and also easier to use.