Python difference between randn and normal

I'm using the randn and normal functions from Python's numpy.random module. The functions are pretty similar from what I've read in the http://docs.scipy.org manual (they both concern the Gaussian distribution), but are there any subtler differences that I should be aware of? If so, in what situations would I be better off using a specific function?

asked Feb 12 '14 20:02

Medulla Oblongata

2 Answers

Description

Looking at the docs that you linked in your question, I'll highlight some of the key differences:

normal:

numpy.random.normal(loc=0.0, scale=1.0, size=None)
# Draw random samples from a normal (Gaussian) distribution.
# Parameters:
#   loc : float -- Mean (“centre”) of the distribution.
#   scale : float -- Standard deviation (spread or “width”) of the distribution.
#   size : tuple of ints -- Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn.

So in this case, you're drawing samples from a GENERIC normal distribution, with whatever mean and standard deviation you choose (more details on what that means later).

randn:

numpy.random.randn(d0, d1, ..., dn)
# Return a sample (or samples) from the “standard normal” distribution.
# Parameters:
#   d0, d1, ..., dn : int, optional -- The dimensions of the returned array, should be all positive. If no argument is given a single Python float is returned.
# Returns:
#   Z : ndarray or float -- A (d0, d1, ..., dn)-shaped array of floating-point samples from the standard normal distribution, or a single such float if no parameters were supplied.

In this case, you're drawing samples from one SPECIFIC normal distribution, the standard normal distribution.
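To make the difference in call style concrete, here is a minimal sketch. The shape (2, 3) and the loc/scale values are arbitrary choices for illustration, not anything taken from the docs.

```python
import numpy as np

# randn takes each dimension as a separate positional argument
# and has no way to set the mean or standard deviation.
a = np.random.randn(2, 3)

# normal takes the shape as a single `size` argument and lets you
# choose the mean (loc) and standard deviation (scale).
b = np.random.normal(loc=5.0, scale=2.0, size=(2, 3))

# Called with no arguments, both return a single value instead of an array.
x = np.random.randn()
y = np.random.normal()

print(a.shape, b.shape)
```

Both calls return arrays of the same shape; only the way the shape is passed, and the extra loc/scale control, differs.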

(Brief) Math

Now some of the math, which is really needed to get at the heart of your question:

A normal distribution is a distribution where the values are more likely to occur near the mean value. There are a bunch of cases of this in nature. E.g., the average high temperature in Dallas in June is, let's say, 95 F. It might reach 100, or even 105 average in one year, but it more typically will be near 95 or 97. Similarly, it might reach as low as 80, but 85 or 90 is more likely.

So, it is fundamentally different from, say, a uniform distribution (rolling an honest 6-sided die).

A standard normal distribution is just a normal distribution where the mean is 0 and the variance (a mathematical measure of the spread) is 1.

So,

numpy.random.normal(size= (10, 10))

draws from exactly the same distribution as

numpy.random.randn(10, 10)

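because the default values (loc=0, scale=1) for numpy.random.normal are exactly the parameters of the standard normal distribution. The only difference is how the shape is passed: normal takes a single size tuple, while randn takes each dimension as a separate argument.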
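A quick way to convince yourself is to draw a large sample from each and compare the summary statistics. This is only a sketch: the seed 42 and the (1000, 100) shape are arbitrary, and I use np.random.RandomState just to make the run repeatable.

```python
import numpy as np

rng = np.random.RandomState(42)  # fixed seed, chosen arbitrarily

a = rng.normal(size=(1000, 100))  # defaults: loc=0.0, scale=1.0
b = rng.randn(1000, 100)          # always the standard normal

# Same shape, and both samples should have mean close to 0
# and standard deviation close to 1.
print(a.shape, b.shape)
print(a.mean(), a.std())
print(b.mean(), b.std())
```

The individual numbers in a and b differ because they are separate draws, but both samples come from the same distribution.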

History

To make matters more confusing, as the numpy random documentation states:

sigma * np.random.randn(...) + mu

is the same as

np.random.normal(loc= mu, scale= sigma, ...)
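As a rough check of that equivalence, the sketch below draws samples both ways and compares their mean and standard deviation. The values mu = 95 and sigma = 5 are made up for illustration (loosely echoing the temperature example), and the seed is arbitrary.

```python
import numpy as np

mu, sigma = 95.0, 5.0  # illustrative values only
n = 100_000

rng = np.random.RandomState(0)  # fixed seed, chosen arbitrarily

# Shift and scale standard normal samples by hand...
scaled = sigma * rng.randn(n) + mu

# ...or let normal do the same thing through loc and scale.
direct = rng.normal(loc=mu, scale=sigma, size=n)

print(scaled.mean(), scaled.std())
print(direct.mean(), direct.std())
```

Both samples should come out with a mean near mu and a standard deviation near sigma.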

The problem is really specialization: in statistics, Gaussian distributions are so common that terminology cropped up to enable discussions:

Many distributions are Gaussian, so many that the Gaussian came to be considered the normal distribution.
Good modeling, especially linear modeling, requires that all elements are "of the same size" (similar mean and variance). So it became standard practice to rescale distributions to mean=0 and variance=1.

*Final note: I used the term variance to mathematically describe variation. Some folks say standard deviation. Variance simply equals the square of the standard deviation. Since the variance = 1 for the standard normal distribution, in that one case variance == standard deviation. Keep in mind that the scale parameter of numpy.random.normal is the standard deviation, not the variance.
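The distinction only matters once scale is not 1. A small sketch, with scale=2.0 and the seed picked arbitrarily:

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, chosen arbitrarily

# scale is the standard deviation, so scale=2.0 implies a variance of 4.0.
samples = rng.normal(loc=0.0, scale=2.0, size=200_000)

sample_std = samples.std()
sample_var = samples.var()

print(sample_std, sample_var)
```

The sample standard deviation should land near 2 and the sample variance near 4, so if you have a variance in hand, pass its square root as scale.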

answered Sep 21 '22 22:09

Mike Williamson

randn seems to draw samples from the standard normal distribution (mean 0 and variance 1). normal takes more parameters for more control. So randn seems to simply be a convenience function.

answered Sep 20 '22 22:09

M4rtini
