Having a dataset and calculating statistics from it is easy. How about the other way around?
Let's say I know some variable has an average X and a standard deviation Y, and assume it has a normal (Gaussian) distribution. What would be the best way to generate a "random" dataset (of arbitrary size) that fits that distribution?
EDIT: This kind of develops from this question; I could make something based on that method, but I am wondering if there's a more efficient way to do it.
You need to standardize it - subtract the mean and then divide by the standard deviation. Then you are free to transform this sample to a normal distribution with the given parameters: multiply by the standard deviation and then add the mean.
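A minimal sketch of that rescaling, assuming NumPy and a placeholder `sample` standing in for whatever dataset your existing method produces (the target mean and standard deviation values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng()

# Placeholder sample from whatever method you already have;
# here it is simply drawn from a standard normal.
sample = rng.standard_normal(10_000)

target_mean = 5.0   # the desired mean (X)
target_std = 2.0    # the desired standard deviation (Y)

# Standardize: subtract the sample mean, divide by the sample standard deviation.
standardized = (sample - sample.mean()) / sample.std()

# Transform to the target parameters: multiply by the std deviation, add the mean.
rescaled = target_mean + target_std * standardized

print(rescaled.mean(), rescaled.std())  # approximately 5.0 and 2.0
```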
Nevertheless, matching means and standard deviations does not guarantee that the distributions are similar -- two distributions can have the same mean and standard deviation yet differ in, e.g., skewness and/or kurtosis.
All normal distributions, like the standard normal distribution, are unimodal and symmetric with a bell-shaped curve. However, a normal distribution can have any mean and any positive standard deviation, whereas in the standard normal distribution the mean is fixed at 0 and the standard deviation at 1.
The standard deviation is the square root of the variance, which is computed from each data point's deviation relative to the mean. The further the data points lie from the mean, the more spread out the data and the higher the standard deviation.
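A short illustration of that calculation with hypothetical values, using the population formula (dividing by the number of points):

```python
import math

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # example values

mean = sum(data) / len(data)                                # 5.0
variance = sum((x - mean) ** 2 for x in data) / len(data)   # 4.0
std_dev = math.sqrt(variance)                               # 2.0

print(mean, variance, std_dev)
```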
You can generate standard normal random variables with the Box-Muller method. Then, to transform them to have mean mu and standard deviation sigma, multiply your samples by sigma and add mu. I.e. for each z from the standard normal, return mu + sigma*z.
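A minimal sketch of this, assuming plain Python and the standard library; `mu` and `sigma` are the target parameters, and the function names are just for illustration:

```python
import math
import random

def box_muller_pair():
    """Return two independent standard normal samples from two uniform samples."""
    u1 = 1.0 - random.random()  # in (0, 1], avoids log(0)
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def normal_samples(n, mu, sigma):
    """Generate n samples with mean mu and standard deviation sigma."""
    samples = []
    while len(samples) < n:
        for z in box_muller_pair():
            samples.append(mu + sigma * z)  # scale and shift each standard normal z
    return samples[:n]

data = normal_samples(10_000, mu=5.0, sigma=2.0)
```

In practice a library routine (e.g. numpy.random.Generator.normal(mu, sigma, n)) does the same job; the sketch above just makes the Box-Muller step and the mu + sigma*z transform explicit.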