scipy - generate random variables with correlations

I'm working on a basic Monte Carlo simulator in Python for some project management risk modeling (basically Crystal Ball / @Risk, but in Python).

I have a set of n random variables (all scipy.stats instances). I know that I can use rv.rvs(size=k) to generate k independent observations from each of these n variables.
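For example (the two distributions below are just placeholders):

    from scipy import stats

    # two example variables; replace with your own scipy.stats instances
    cost = stats.triang(c=0.5, loc=100, scale=50)
    duration = stats.norm(loc=30, scale=5)

    k = 1000
    cost_draws = cost.rvs(size=k)          # k independent observations
    duration_draws = duration.rvs(size=k)  # k independent observations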

I'd like to introduce correlations among the variables by specifying an n x n positive semi-definite correlation matrix.

Is there a clean way to do this in scipy?

What I've Tried

This answer and this answer seem to indicate that copulas are the answer, but I don't see any reference to them in scipy.

This link seems to implement what I'm looking for, but I'm not sure if scipy has this functionality implemented already. I'd also like it to work for non-normal variables.

It seems that the method described in the Iman and Conover paper is the standard approach.

asked Jan 01 '15 by MikeRand



2 Answers

If you just want correlation through a Gaussian copula (*), then it can be done in a few steps with numpy and scipy.

  • create multivariate normal random variables with the desired covariance using numpy.random.multivariate_normal, giving a (nobs, k_variables) array

  • apply scipy.stats.norm.cdf to each column/variable to transform the normal samples into uniform random variables, giving uniform marginal distributions

  • apply dist.ppf to transform the uniform marginals to the desired distribution, where dist can be any of the distributions in scipy.stats (see the sketch below)

(*) The Gaussian copula is only one choice, and it is not the best when we are interested in tail behavior, but it is the easiest to work with; see for example http://archive.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all
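A minimal sketch of the three steps (the two variables, the 0.7 correlation, and the gamma/lognormal marginals below are just placeholder assumptions; substitute your own scipy.stats distributions):

    import numpy as np
    from scipy import stats

    nobs = 10_000
    corr = np.array([[1.0, 0.7],
                     [0.7, 1.0]])

    # 1. correlated standard normals, shape (nobs, k_variables)
    z = np.random.multivariate_normal(mean=np.zeros(2), cov=corr, size=nobs)

    # 2. transform each column to uniform marginals via the normal cdf
    u = stats.norm.cdf(z)

    # 3. transform the uniforms to the desired marginals via each distribution's ppf
    x0 = stats.gamma(a=2.0, scale=1.0).ppf(u[:, 0])
    x1 = stats.lognorm(s=0.5).ppf(u[:, 1])

    samples = np.column_stack([x0, x1])
    print(np.corrcoef(samples, rowvar=False))

Note that the Pearson correlation of the transformed samples will be close to, but not exactly, the 0.7 specified for the normals; what the Gaussian copula preserves is the rank (Spearman) correlation, since the cdf/ppf transforms are monotone.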

Two references:

https://stats.stackexchange.com/questions/37424/how-to-simulate-from-a-gaussian-copula

http://www.mathworks.com/products/demos/statistics/copulademo.html

(I might have done this a while ago in Python, but I don't have any scripts or functions right now.)

answered Oct 21 '22 by Josef


It seems like a rejection-based sampling method such as the Metropolis-Hastings algorithm is what you want. Scipy can implement such methods with its scipy.optimize.basinhopping function.

Rejection-based sampling methods allow you to draw samples from any given probability distribution. The idea is that you draw random samples from another "proposal" pdf that is easy to sample from (such as a uniform or Gaussian distribution) and then use a random test to decide whether this sample from the proposal distribution should be "accepted" as representing a sample of the desired distribution.

The remaining tricks will then be:

  1. Figure out the form of the joint N-dimensional probability density function that has marginals of the form you want along each dimension, but with the correlation matrix that you want. This is easy to do for the Gaussian distribution, where the desired correlation matrix and mean vector are all you need to define the distribution. If your marginals have a simple expression, you can probably find this pdf with some straightforward-but-tedious algebra. This paper cites several others which do what you are talking about, and I'm certain that there are many more.

  2. Formulate a function for basinhopping to minimize such that its accepted "minima" amount to samples of the pdf you have defined.

Given the results of (1), (2) should be straightforward.
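For illustration, here is a minimal, self-contained Metropolis-Hastings sketch (it samples a toy correlated bivariate normal standing in for the joint pdf from step 1, and uses a hand-rolled accept/reject loop rather than basinhopping):

    import numpy as np

    def log_pdf(x, corr=0.7):
        # toy target from step 1: standard bivariate normal with correlation corr
        cov = np.array([[1.0, corr], [corr, 1.0]])
        return -0.5 * x @ np.linalg.inv(cov) @ x

    def metropolis_hastings(log_pdf, x0, n_samples=10_000, step=0.5, seed=None):
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        samples = np.empty((n_samples, x.size))
        for i in range(n_samples):
            # symmetric Gaussian proposal around the current point
            proposal = x + rng.normal(scale=step, size=x.size)
            # accept with probability min(1, p(proposal) / p(x))
            if np.log(rng.uniform()) < log_pdf(proposal) - log_pdf(x):
                x = proposal
            samples[i] = x
        return samples

    samples = metropolis_hastings(log_pdf, x0=np.zeros(2))
    print(np.corrcoef(samples, rowvar=False))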

answered Oct 21 '22 by stochastic