Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simulate data from (non-standard) density function

I want to simulate data from a non-standard density function. I already found the following link (How do I best simulate an arbitrary univariate random variate using its probability function?). However, this gives weird results. Somehow, this cumulative density function ( cdf() ) does not work well. From some values, it gives very strange results. For example, take a look at the following code:

density=function(x)(25*200.7341^25/x^26*exp(-(200.7341/x)^25))
cdf<-function(x) integrate(density,1,x)[[1]]

cdf(9701)
[1] 1

cdf(9702)
[1] 6.33897e-05

So my question, how can I create a "good" CDF function? Or more directly, how can I simulate data from a PDF?

like image 595
Sjoerd Glaser Avatar asked Jan 13 '23 14:01

Sjoerd Glaser


1 Answers

As pointed out by @pjs we can use Rejection sampling (check the wiki for details).

Here is one implementation of this approach.

The most important step is to find a distribution g from which we can sample and from which it exists M such that M * g > f for all point

f <- function(x) (25 * 200.7341^25 / x^26 * exp(-(200.7341/x)^25))
g <- function(x) dnorm(x, mean = 200.7341, sd = 40)
M <- 5
curve(f, 0, 500)
curve(M * g(x), 0, 500, add = TRUE, lty = "dashed")

enter image description here

Now, we can execute the algorithm

set.seed(42)
k <- 1
count <- 0
res <- vector(mode = "numeric", length = 1000)
while(k < 1001) {
          z <- rnorm(n = 1, mean = 200.7341, sd = 40)
          R <- f(z) / (M * g(z))
          if (R > runif(1)) {
              res[k] <- z
              k  <- k + 1
          }
          count <- count + 1
    }

(accept_rate <- (k / count) * 100)
## [1] 19.7086

require(MASS) ## for truehist
truehist(res)
curve(f, 0, 250, add = TRUE)

enter image description here

The acceptance rate is not great. You can try do find a better envelope function or use a Metropolis Hasting algorithm.

like image 70
dickoa Avatar answered Jan 19 '23 10:01

dickoa