Generate correlated random variables that follow beta distributions

Tags: correlation, sas, sas-iml

I need to use SAS to generate random values for two correlated, beta-distributed variables. The two variables of interest are characterized as follows:

X1 has mean = 0.896 and variance = 0.001.

X2 has mean = 0.206 and variance = 0.004.

The correlation coefficient between X1 and X2 is ρ = 0.5.

I understand how to generate a single beta-distributed random number in SAS with X = RAND('BETA', a, b), where a and b are the two shape parameters of X, which can be calculated from its mean and variance. However, I want to generate values for X1 and X2 simultaneously while specifying that they are correlated at ρ = 0.5.


asked Jul 16 '15 16:07

Gavin M. Jones

1 Answer

This solution adapts the methods in Chapter 9 of Simulating Data with SAS by Rick Wicklin.

For this example, I first define the means and variances of the two variables and compute the shape parameters (alpha, beta) of the corresponding beta distributions:

data beta_corr_vars;
    input x1 var1 x2 var2;  *mean1, variance1, mean2, variance2;
    *calculate shape parameters alpha and beta from means and variances;
    alpha1 = ((1 - x1) / var1 - 1 / x1) * x1**2;
    alpha2 = ((1 - x2) / var2 - 1 / x2) * x2**2;
    beta1 = alpha1 * (1 / x1 - 1);
    beta2 = alpha2 * (1 / x2 - 1);
    *here are the means and variances referred to in the original question;
    datalines;
0.896 0.001 0.206 0.004
;
run;

proc print data = beta_corr_vars;
run;
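
As a sanity check on the DATA step above, the shape parameters can be worked out by hand from the usual method-of-moments relations for a beta distribution with mean μ and variance σ². The rounded numbers below are my own arithmetic for the values in the question, not output copied from SAS, so treat them as approximate:

```latex
% Method-of-moments shape parameters (valid only when \sigma^2 < \mu(1-\mu))
\alpha = \left(\frac{1-\mu}{\sigma^{2}} - \frac{1}{\mu}\right)\mu^{2}
       = \mu\left(\frac{\mu(1-\mu)}{\sigma^{2}} - 1\right),
\qquad
\beta = \alpha\left(\frac{1}{\mu} - 1\right)

% X1: \mu = 0.896, \sigma^2 = 0.001
\alpha_1 = 0.896\left(\frac{0.896 \times 0.104}{0.001} - 1\right)
         = 0.896 \times 92.184 \approx 82.597,
\qquad
\beta_1 = 0.104 \times 92.184 \approx 9.587

% X2: \mu = 0.206, \sigma^2 = 0.004
\alpha_2 = 0.206\left(\frac{0.206 \times 0.794}{0.004} - 1\right)
         = 0.206 \times 39.891 \approx 8.218,
\qquad
\beta_2 = 0.794 \times 39.891 \approx 31.673
```

Both variances satisfy σ² < μ(1 − μ) (0.001 < 0.0932 and 0.004 < 0.1636), so all four shape parameters are positive and the beta distributions are well defined.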

Once these variables are defined, simulate the correlated beta variates in PROC IML:

proc iml;
    use beta_corr_vars; read all; close beta_corr_vars;   *read every variable into a vector;
    call randseed(12345);
    N = 10000;                  *number of random variable sets to generate;

    *simulate bivariate normal data with a specified correlation (here, rho = 0.5);
    Z = RandNormal(N, {0 0}, {1 0.5, 0.5 1});   *RandNormal(N, Mean, Cov);

    *transform the normal variates into uniform variates;
    U = cdf("Normal", Z);

    *obtain beta variates for each column of U by applying the inverse beta CDF;
    x1_beta = quantile("Beta", U[,1], alpha1, beta1);
    x2_beta = quantile("Beta", U[,2], alpha2, beta2);
    X = x1_beta || x2_beta;

    *check the correlations: rhoZ approaches 0.5 as N grows, while rhoX is usually;
    *slightly smaller in magnitude because the beta transform is nonlinear;
    rhoZ = corr(Z)[1,2];
    rhoX = corr(X)[1,2];

    print X;
    print rhoZ rhoX;
quit;
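
For readers without a SAS/IML license, the same normal-to-uniform-to-beta transform can be sketched in a plain DATA step. This is only an illustration under a few assumptions: the data set name beta_corr_sim and the variable names z1, z2, x1, and x2 are my own, and the second normal variate is built by hand as rho*z1 + sqrt(1 - rho**2)*e instead of calling RandNormal. It is not the code from the book.

```sas
/* Sketch only: DATA step version of the normal-copula transform.        */
/* Data set and variable names here are illustrative assumptions.        */
data beta_corr_sim;
    call streaminit(12345);            /* seed for the RAND function      */
    rho = 0.5;                         /* correlation of the normals      */

    /* shape parameters from the means and variances in the question */
    alpha1 = ((1 - 0.896) / 0.001 - 1 / 0.896) * 0.896**2;
    beta1  = alpha1 * (1 / 0.896 - 1);
    alpha2 = ((1 - 0.206) / 0.004 - 1 / 0.206) * 0.206**2;
    beta2  = alpha2 * (1 / 0.206 - 1);

    do i = 1 to 10000;
        /* two standard normals with correlation rho */
        z1 = rand('NORMAL');
        z2 = rho * z1 + sqrt(1 - rho**2) * rand('NORMAL');

        /* normal CDF gives uniforms; the inverse beta CDF gives beta variates */
        x1 = quantile('BETA', cdf('NORMAL', z1), alpha1, beta1);
        x2 = quantile('BETA', cdf('NORMAL', z2), alpha2, beta2);
        output;
    end;
    keep x1 x2;
run;

/* compare the sample means and variances with the targets */
proc means data = beta_corr_sim mean var;
    var x1 x2;
run;

/* sample Pearson correlation between the two beta variates */
proc corr data = beta_corr_sim;
    var x1 x2;
run;
```

PROC MEANS should report means near 0.896 and 0.206 and variances near 0.001 and 0.004. The correlation from PROC CORR should land close to 0.5 but not exactly on it, because rho is specified for the underlying normal variates and not for the beta variates themselves.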

Thank you to all users who contributed to this answer.


answered Sep 20 '22 14:09

Gavin M. Jones
