Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to generate a gaussian distribution using mysql user-defined function

I like to use MySQL to do quantitative analysis and statistics. I would like to make a MySQL user-defined function of the form: sample_gaussian(mean, stdev) that returns a single randomized value sampled from a gaussian distribution having mean and standard deviation of the user-entered arguments. MySQL already has a function rand() that returns a random number, so I just need to know some pseudocode for constraining/transforming that value so that it falls into the right distribution. Any suggestions?

BTW- This is my first stackoverflow question, so please forgive me if this question is asking too much of users on this site.

like image 294
ted.strauss Avatar asked Mar 02 '10 00:03

ted.strauss


People also ask

How do you define a Gaussian distribution?

Normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean. In graphical form, the normal distribution appears as a "bell curve".

What is PDF of Gaussian distribution?

The PDF of a Gaussian random variable, z, is given by. where μ is the mean of the average value of z and σ is its standard deviation. The standard deviation squared, σ2, is called the variance of z.

Where does Gaussian distribution come from?

The term “Gaussian distribution” refers to the German mathematician Carl Friedrich Gauss, who first developed a two-parameter exponential function in 1809 in connection with studies of astronomical observation errors.


1 Answers

In answer to my own question, here is a MySQL user-defined function that returns a single random value sampled from a Gaussian distribution with a given mean and standard deviation.

DROP FUNCTION IF EXISTS gauss;
DELIMITER //
CREATE FUNCTION gauss(mean float, stdev float) RETURNS float
BEGIN
set @x=rand(), @y=rand();
set @gaus = ((sqrt(-2*log(@x))*cos(2*pi()*@y))*stdev)+mean;
return @gaus;
END
//
DELIMITER ;

To verify that this is in fact returning a Gaussian distribution, you can generate a series of these, then plot a histogram:

create temporary table temp (id int, rando float);
insert into temp (rando) select gauss(2,1); # repeat this operation 500 times
insert into temp (rando) select gauss(2,1) from any_table_with_500+_entries limit 500;
select round(temp,1), count(*) from temp group by round(temp,1) # creates a histogram

If you plot that histogram in excel or graphing tool of choice, you'll see the bell shaped normal curve.

like image 143
ted.strauss Avatar answered Sep 29 '22 01:09

ted.strauss