Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate correlated random variables that follow beta distributions

I need to generate random values for two beta-distributed variables that are correlated using SAS. The two variables of interest are characterized as follows:


X1 has mean = 0.896 and variance = 0.001.

X2 has mean = 0.206 and variance = 0.004.

For X1 and X2, p = 0.5, where p is the correlation coefficient.


Using SAS, I understand how to generate a random number specifying a beta distribution using the function X = RAND('BETA', a, b), where a and b are the two shape parameters for a variable X that can be calculated from the mean and variance. However, I want to generate values for both X1 and X2 simultaneously while specifying that they are correlated at p = 0.5.

like image 871
Gavin M. Jones Avatar asked Jul 16 '15 16:07

Gavin M. Jones


People also ask

How to generate a random number specifying a beta distribution?

Using SAS, I understand how to generate a random number specifying a beta distribution using the function X = RAND ('BETA', a, b), where a and b are the two shape parameters for a variable X that can be calculated from the mean and variance.

How do you create 2 correlated random variables with two variables?

If we have 2 normal, uncorrelated random variables $X_1, X_2$ then we can create 2 correlated random variables with the formula $Y=ho X_1+ \sqrt{1-ho^2} X_2$ and then $Y$ will have a correlation $ho$ with $X_1$.

How can I generate correlated normally distributed random samples?

To generate correlated normally distributed random samples, one can first generate uncorrelated samples, and then multiply them by a matrix C such that CCT = R, where R is the desired covariance matrix. C can be created, for example, by using the Cholesky decomposition of R, or from the eigenvalues and eigenvectors of R. In :

How to adjust for correlation between MVN and beta variables?

for the MVN data in order to have the correlation ot the beta variables be 0.5. For Spearman (rank) correlations, no adjustment is necessary.


1 Answers

This solution is based on modified methods used from Chapter 9 of Simulating Data with SAS by Rick Wicklin.

In this particular example, I first have to define variable means, variances, and shape-parameters (alpha, beta) that are associated with the beta distribution:

data beta_corr_vars;
    input x1 var1 x2 var2;  *mean1, variance1, mean2, variance2;
    *calculate shape parameters alpha and beta from means and variances;
    alpha1 = ((1 - x1) / var1 - 1/ x1) * x1**2;   
    alpha2 = ((1 - x2) / var2 - 1/ x2) * x2**2; 
    beta1 = alpha1 * (1 / x1 - 1);
    beta2 = alpha2 * (1 / x2 - 1);
    *here are the means and variances referred to in the original question;
    datalines; 
0.896 0.001 0.206 0.004
;
run;
proc print data = beta_corr_vars;
run;

Once these variables are defined:

proc iml;
  use beta_corr_vars; read all; 
  call randseed(12345);
      N = 10000;                  *number of random variable sets to generate;
      *simulate bivariate normal data with a specified correlation (here, rho = 0.5);
      Z = RandNormal(N, {0, 0}, {1 0.5, 0.5 1});   *RandNormal(N, Mean, Cov);
      *transform the normal variates into uniform variates;
      U = cdf("Normal", Z);      

      *From here, we can obtain beta variates for each column of U by; 
      *applying the inverse beta CDF;
      x1_beta = quantile("Beta", U[,1], alpha1, beta1);        
      x2_beta = quantile("Beta", U[,2], alpha2, beta2); 
      X = x1_beta || x2_beta; 

  *check adequacy of rho values--they approach the desired values with more sims (N);
  rhoZ = corr(Z)[1,2];                
  rhoX = corr(X)[1,2];

print X;
print rhoZ rhoX;

Thank you to all users who contributed to this answer.

like image 163
Gavin M. Jones Avatar answered Sep 20 '22 14:09

Gavin M. Jones