
"Reverse" statistics: generating data based on mean and standard deviation

Having a dataset and calculating statistics from it is easy. How about the other way around?

Let's say I know some variable has mean X and standard deviation Y, and I assume it has a normal (Gaussian) distribution. What would be the best way to generate a "random" dataset (of arbitrary size) that fits this distribution?

EDIT: This follows on from this question; I could build something based on that method, but I am wondering if there's a more efficient way to do it.

quantumSoup asked Jul 08 '10 21:07



People also ask

How do you generate data from mean and standard deviation?

If you already have a sample, standardize it first: subtract its mean and then divide by its standard deviation. You are then free to transform the standardized sample to a normal distribution with the given parameters: multiply by the target standard deviation and then add the target mean.
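The standardize-then-rescale step can be sketched in Python as follows. This is only an illustration: the function name `rescale` and its parameters are hypothetical, and it uses the sample standard deviation (n - 1 in the denominator), which is an assumption about which convention is wanted.

```python
import statistics

def rescale(sample, target_mean, target_std):
    """Shift and scale `sample` so it has exactly the target mean and
    (sample) standard deviation. Hypothetical helper for illustration."""
    m = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1)
    # Standardize each value, then map it onto the target parameters.
    return [target_mean + target_std * (x - m) / s for x in sample]
```

Note that this only fixes the first two moments. The shape of the result is whatever the shape of the input sample was, so the output is normal only if the input was.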

Can two distributions have the same mean and standard deviation?

Nevertheless, comparing means and standard deviations does not guarantee that the distributions are similar: two distributions can have the same mean and standard deviation and still differ in, for example, skewness or kurtosis.

Can normal distributions have different standard deviations?

All normal distributions, like the standard normal distribution, are unimodal and symmetrically distributed with a bell-shaped curve. However, a normal distribution can have any real value as its mean and any positive value as its standard deviation. In the standard normal distribution, the mean and standard deviation are fixed at 0 and 1.

What shows how the data deviates from the mean?

The standard deviation is calculated as the square root of variance by determining each data point's deviation relative to the mean. If the data points are further from the mean, there is a higher deviation within the data set; thus, the more spread out the data, the higher the standard deviation.


1 Answer

You can generate standard normal random variables with the Box-Muller method. Then, to transform them to have mean mu and standard deviation sigma, multiply your samples by sigma and add mu. That is, for each z from the standard normal, return mu + sigma*z.
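A minimal Python sketch of this approach, assuming the basic (trigonometric) form of the Box-Muller transform. The function names `box_muller` and `generate_normal` and the `seed` parameter are illustrative choices, not part of the answer above.

```python
import math
import random

def box_muller(rng=random):
    """Return a pair of independent standard normal samples."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

def generate_normal(mu, sigma, n, seed=None):
    """Return a list of n samples with mean mu and standard deviation sigma."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        z0, z1 = box_muller(rng)
        # Scale and shift each standard normal value: mu + sigma * z.
        samples.append(mu + sigma * z0)
        samples.append(mu + sigma * z1)
    return samples[:n]  # drop the extra value when n is odd
```

For example, `generate_normal(10.0, 2.0, 1000)` returns 1000 values whose sample mean and standard deviation will be close to, but not exactly, 10 and 2. In practice Python's standard library already offers `random.gauss(mu, sigma)`, which avoids hand-rolling the transform.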

John D. Cook answered Oct 20 '22 15:10
