Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Draw random numbers from restricted Pareto distribution

Tags:

random

r

Am a newcomer to R and need advice on how to draw random numbers from a limited area of a Pareto Distribution with parameters s & beta. (System: Windows 7, R 2.15.2.)

(1) I have data in a vector data$t; each single data point I'll call data&tx

For these data the parameters s & beta of a Pareto distribution are estimated following https://stats.stackexchange.com/questions/27426/how-do-i-fit-a-set-of-data-to-a-pareto-distribution-in-r

pareto.MLE <- function(X)
{
n <- length(X)
m <- min(X)
a <- n/sum(log(X)-log(m))
return( c(m,a) ) 
}

(2) Now I need to draw as many random numbers (RndNew) von this Pareto distribution (s, beta, see (1)) as there are observations (= data points: data$tx) . For the draw the area from which random numbers are drawn must be limited to the area where RndNewx >= data$tx; in other words: RndNewx must never be smaller than the corresponding data$tx.

Question: how to tell R to restrict the area of a Pareto distribution from which to draw a random number to be RndNewx >= data$tx?

Thanks a million for any help!

like image 554
user2006697 Avatar asked Jan 24 '13 08:01

user2006697


People also ask

Which R function generates random numbers using the Pareto distribution?

r = gprnd(k,sigma,theta) returns an array of random numbers chosen from the generalized Pareto (GP) distribution with tail index (shape) parameter k , scale parameter sigma , and threshold (location) parameter, theta .


1 Answers

The standard approach to sampling from a truncated distribution has three steps. Here's an example with the normal distribution so you can get the idea.

n <- 1000
lower_bound <- -1
upper_bound <- 1

Apply the CDF to your lower and upper bounds to find the quantiles of the edges of your distribution.

(quantiles <- pnorm(c(lower_bound, upper_bound)))
# [1] 0.1586553 0.8413447

Sample from a uniform distribution between those quantiles.

uniform_random_numbers <- runif(n, quantiles[1], quantiles[2])

Apply the inverse CDF.

truncated_normal_random_numbers <- qnorm(uniform_random_numbers)

truncated normal random numbers


The CDF for the pareto distribution is

ppareto <- function(x, scale, shape)
{
  ifelse(x > scale, 1 - (scale / x) ^ shape, 0)
}

And the inverse is

qpareto <- function(y, scale, shape)
{
  ifelse(
    y >= 0 & y <= 1,
    scale * ((1 - y) ^ (-1 / shape)),
    NaN
  )
}

We can rework the above example to use these Pareto functions.

n <- 1000
scale <- 1
shape <- 1
lower_bound <- 2
upper_bound <- 10

(quantiles <- ppareto(c(lower_bound, upper_bound), scale, shape))
uniform_random_numbers <- runif(n, quantiles[1], quantiles[2])    
truncated_pareto_random_numbers <- qpareto(uniform_random_numbers, scale, shape)

truncated Pareto distrbution


To make it easier to reuse this code, we can wrap it into a function. The lower and upper bounds have default values that match the range of the distribution, so if you don't pass values in, then you'll get a non-truncated Pareto distribution.

rpareto <- function(n, scale, shape, lower_bound = scale, upper_bound = Inf)
{
  quantiles <- ppareto(c(lower_bound, upper_bound), scale, shape)
  uniform_random_numbers <- runif(n, quantiles[1], quantiles[2])    
  qpareto(uniform_random_numbers, scale, shape)
}
like image 83
Richie Cotton Avatar answered Oct 21 '22 06:10

Richie Cotton