 

Producing RNG vectors in R that have pre-defined sum of pdf or sum of cdf

Tags: random, r

I am a new R user and I am trying to produce vectors with numbers randomly generated based on a specific distribution (with the rnorm command for example) with the vectors having a pre-defined sum of probability densities or sum of cumulative distributions.

For example, when generating vectors x1, x2 … xn I want them to obey either

sum(pnorm(x1)) = sum(pnorm(x2)) = … = sum(pnorm(xn))

or

sum(pnorm(xi)) = "fixed value"

or do the same but with dnorm. In other words, is there a possibility to set such parameters when using rnorm or any other RNG in R?

Tips and suggestions for strategies instead of complete solutions would also be greatly appreciated.

Many thanks in advance for your time.

asked Mar 03 '13 20:03 by Nicholas


1 Answer

1. In the case of a Gaussian distribution, sampling from (X1,...,Xn) under the condition that X1+...+Xn=s is just sampling from a conditional Gaussian distribution.

If X1,...,Xn are independent standard Gaussian variables, the vector (X1,X2,...,Xn,X1+...+Xn) has a (degenerate) Gaussian distribution, with zero mean and covariance matrix

1 0 0 ... 0 1
0 1 0 ... 0 1
0 0 1 ... 0 1
...
0 0 0 ... 1 1
1 1 1 ... 1 n.

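We can therefore sample the first n coordinates conditionally on the last one being equal to s, using the standard formulas for the conditional mean and covariance of a Gaussian vector, as follows.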

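s <- 1   # Desired sum
n <- 10  # Number of coordinates
# Blocks of the mean and covariance of (X, sum(X))
mu1 <- rep(0, n)              # Mean of X
mu2 <- 0                      # Mean of sum(X)
V11 <- diag(n)                # Var(X)
V12 <- as.matrix(rep(1, n))   # Cov(X, sum(X))
V21 <- t(V12)
V22 <- as.matrix(n)           # Var(sum(X))
# Conditional mean and covariance of X given sum(X) = s
mu <- as.vector(mu1 + V12 %*% solve(V22, s - mu2))
V  <- V11 - V12 %*% solve(V22, V21)
library(mvtnorm)
# Random vectors (in each row); V is singular (rank n-1) but positive semi-definite
x <- rmvnorm(100, mu, V)
# Check the sum and the distribution
rowSums(x)   # All equal to s, up to rounding error
hist(x[,1])
qqnorm(x[,1])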
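
As a cross-check on the conditional Gaussian approach above, the same distribution can probably be obtained without mvtnorm: centring a vector of independent standard Gaussians and shifting it by s/n gives mean s/n and covariance I - (1/n) 11', which match the mu and V computed above. The following is only a sketch under that assumption (independent standard Gaussian coordinates); the helper name sample_gaussian_fixed_sum is made up for illustration.

```r
# Sketch: standard Gaussian vectors conditioned on their sum being s.
# sample_gaussian_fixed_sum is a hypothetical helper, not part of any package.
sample_gaussian_fixed_sum <- function(m, n, s) {
  z <- matrix(rnorm(m * n), nrow = m, ncol = n)  # m unconstrained draws, one per row
  z - rowMeans(z) + s / n                        # centre each row, then shift so it sums to s
}

set.seed(1)
x <- sample_gaussian_fixed_sum(100, 10, 1)
range(rowSums(x))  # both ends equal s, up to floating-point error
```

This avoids building the n x n covariance matrix, but it only applies to the independent standard Gaussian case.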

For an arbitrary distribution, this approach would require you to compute the conditional distribution, which may not be easy.

2. There is another easy special case: a uniform distribution.

To uniformly sample n (positive) numbers that sum up to 1, you can take n-1 numbers, uniformly in [0,1], and sort them: they define n intervals, whose lengths sum up to 1 and are uniformly distributed on the set of positive vectors with sum 1 (the simplex).
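
The sorting construction just described can be sketched in a few lines of R. This is a minimal illustration; the value n = 10 and the seed are arbitrary choices.

```r
# Sketch: n positive numbers summing to 1, obtained from sorted uniform cut points.
set.seed(1)
n <- 10
u <- sort(runif(n - 1))   # n-1 cut points in [0,1], in increasing order
x <- diff(c(0, u, 1))     # lengths of the n intervals defined by the cut points
sum(x)                    # 1, up to floating-point error
```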

Since those points are distributed like the arrival times of a Poisson process, you can also generate the interval lengths from an exponential distribution and divide them by their sum.

x <- rexp(n)
x <- x / sum(x)  # Sums to 1; the vector is uniform on the simplex (a single coordinate is not uniform on [0,1])

This idea is explained (with a lot of pictures) in the following article: Portfolio Optimization for VaR, CVaR, Omega and Utility with General Return Distributions, (W.T. Shaw, 2011), pages 6 to 8.

3. (EDIT) I had initially misread the question, which was asking about sum(pnorm(x)), not sum(x). This turns out to be easier.

If X has a standard Gaussian distribution, then pnorm(X) has a uniform distribution on [0,1]: the problem is then to sample from a uniform distribution with a prescribed sum, as in case 2.

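n <- 10
s <- 1  # Desired sum; needs 0 < s <= 1, otherwise some p can exceed 1 and qnorm returns NaN
p <- rexp(n)
p <- p / sum(p) * s  # Uniform on the simplex, rescaled to sum to s
x <- qnorm(p)        # Gaussian quantiles; pnorm(x) sums to s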
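
To get several vectors x1, …, xm that all share the same value of sum(pnorm(xi)), as in the question, the snippet above could be wrapped in a function that fills one vector per row. This is a sketch only: the name rnorm_fixed_psum is hypothetical, and it assumes 0 < s <= 1 and n >= 2 so that every p stays strictly inside (0,1).

```r
# Sketch: m vectors of length n whose pnorm values each sum to s.
# rnorm_fixed_psum is a hypothetical name; assumes 0 < s <= 1 and n >= 2.
rnorm_fixed_psum <- function(m, n, s) {
  stopifnot(s > 0, s <= 1, n >= 2)
  p <- matrix(rexp(m * n), nrow = m, ncol = n)  # one vector of exponentials per row
  p <- p / rowSums(p) * s                       # each row now sums to s
  qnorm(p)                                      # map probabilities back to Gaussian quantiles
}

set.seed(1)
x <- rnorm_fixed_psum(5, 10, 1)
rowSums(pnorm(x))   # every row sums to s, up to floating-point error
```

For s > 1 the rescaled p can exceed 1, so a different way of sampling uniforms with a prescribed sum would be needed.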

answered Oct 20 '22 01:10 by Vincent Zoonekynd