
Differences between numpy.random.rand vs numpy.random.randn in Python

What are the differences between numpy.random.rand and numpy.random.randn?

From the documentation, I know the only difference between them is the probability distribution each number is drawn from; the overall structure (dimensions) and the data type used (float) are the same. I am having a hard time debugging a neural network because of this.

Specifically, I am trying to re-implement the neural network provided in the book Neural Networks and Deep Learning by Michael Nielsen. The original code can be found here. My implementation is the same as the original, except that I defined and initialized the weights and biases with numpy.random.rand in the init function, rather than with numpy.random.randn as in the original.

However, my code that uses random.rand to initialize weights and biases does not work. The network won't learn and the weights and biases will not change.

What difference(s) between the two random functions cause this weirdness?

Phúc Lê asked Nov 11 '17 16:11



2 Answers

First, as you can see from the documentation, numpy.random.randn generates samples from the standard normal distribution (mean 0, variance 1), while numpy.random.rand draws samples from a uniform distribution over [0, 1).
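A quick way to confirm this is to draw a batch from each function and compare. The snippet below is only a sketch; the array shape and the fixed seed are arbitrary choices made for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed, only so the run is repeatable

u = np.random.rand(3, 1000)   # uniform on [0, 1)
n = np.random.randn(3, 1000)  # standard normal: mean 0, variance 1

# Same structure and data type ...
print(u.shape == n.shape, u.dtype == n.dtype)

# ... but the values live in very different ranges.
print("rand : min %.3f  max %.3f  mean %.3f" % (u.min(), u.max(), u.mean()))
print("randn: min %.3f  max %.3f  mean %.3f" % (n.min(), n.max(), n.mean()))
```

Every value from rand is non-negative and the mean sits near 0.5, whereas randn is centred on 0 and produces both signs.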

Second, why did the uniform distribution not work? The main reason is the activation function, especially in your case where you use the sigmoid function. The plot of the sigmoid looks like the following:

[Image: plot of the sigmoid function]

So you can see that if the input is far from 0, the slope of the function decreases quickly, and as a result you get a tiny gradient and a tiny weight update. If you have many layers, those gradients get multiplied many times in the backward pass, so even "proper" gradients become small after the multiplications and stop having any influence. If a lot of your weights push the inputs into those flat regions, your network is hardly trainable. With numpy.random.rand every weight and bias is positive, so for non-negative inputs the weighted sums can only grow as more inputs are added, which pushes the sigmoid into its flat region.

That is why it is usual practice to initialize network variables around zero. Doing so keeps the neurons near the point where the sigmoid's slope is largest (it peaks at 0.25 for an input of 0), so the gradients are large enough to train the net.
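The effect can be checked numerically. The following is a rough sketch, not the book's code: the layer sizes (784 inputs, 30 neurons) and the random non-negative input vector are assumptions made for illustration. It computes the weighted inputs z of one sigmoid layer under each initialization and the mean slope of the sigmoid at those points.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Derivative of the sigmoid; its maximum is 0.25 at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)

np.random.seed(0)             # fixed seed, only so the run is repeatable
n_in, n_out = 784, 30         # hypothetical layer sizes
x = np.random.rand(n_in, 1)   # stand-in for a non-negative input vector

results = {}
for name, init in (("rand", np.random.rand), ("randn", np.random.randn)):
    w = init(n_out, n_in)     # weights drawn with the chosen function
    b = init(n_out, 1)        # biases drawn with the chosen function
    z = w @ x + b             # weighted input to each neuron
    results[name] = (z.mean(), sigmoid_prime(z).mean())
    print("%-5s mean z = %8.2f   mean sigmoid'(z) = %.3e" % ((name,) + results[name]))
```

With rand, every neuron ends up with a large positive z, so the slope is effectively zero for the whole layer. With randn, the weighted inputs are spread around zero, so at least some neurons keep a usable gradient.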

However, a uniform distribution is not inherently unsuitable; you just need to make the range smaller and center it on zero. One good practice is Xavier initialization. In this approach you can initialize your weights with either:

  1. Normal distribution with mean 0 and standard deviation sqrt(2. / (in + out)), i.e. variance 2. / (in + out), where in is the number of inputs to the neurons and out is the number of outputs.

  2. Uniform distribution in range [-sqrt(6. / (in + out)), +sqrt(6. / (in + out))]
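Both variants can be built from the two functions in the question by rescaling their output. This is a minimal sketch: the helper names xavier_normal and xavier_uniform and the layer sizes are made up for illustration, and sqrt(2 / (in + out)) is taken as the standard deviation of the normal variant (variance 2 / (in + out)), which is the usual Xavier convention.

```python
import numpy as np

def xavier_normal(n_in, n_out):
    # Hypothetical helper: N(0, std^2) with std = sqrt(2 / (n_in + n_out)).
    std = np.sqrt(2.0 / (n_in + n_out))
    return std * np.random.randn(n_out, n_in)

def xavier_uniform(n_in, n_out):
    # Hypothetical helper: uniform on [-limit, limit) with
    # limit = sqrt(6 / (n_in + n_out)); 2 * rand - 1 maps [0, 1) to [-1, 1).
    limit = np.sqrt(6.0 / (n_in + n_out))
    return limit * (2.0 * np.random.rand(n_out, n_in) - 1.0)

np.random.seed(0)            # fixed seed, only so the run is repeatable
n_in, n_out = 784, 30        # hypothetical layer sizes
w_n = xavier_normal(n_in, n_out)
w_u = xavier_uniform(n_in, n_out)

print("target std :", np.sqrt(2.0 / (n_in + n_out)))
print("normal  std:", w_n.std())
print("uniform std:", w_u.std())
```

The two variants have the same target variance, because a uniform distribution on [-a, a] has variance a^2 / 3, and (6 / (in + out)) / 3 = 2 / (in + out).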

asakryukin answered Sep 22 '22 18:09


  • np.random.rand is for Uniform distribution (in the half-open interval [0.0, 1.0))
  • np.random.randn is for Standard Normal (aka. Gaussian) distribution (mean 0 and variance 1)

You can visually explore the differences between these two very easily:

    import numpy as np
    import matplotlib.pyplot as plt

    sample_size = 100000
    uniform = np.random.rand(sample_size)   # samples from [0, 1)
    normal = np.random.randn(sample_size)   # samples from the standard normal

    # Histogram of the uniform samples
    pdf, bins, patches = plt.hist(uniform, bins=20, range=(0, 1), density=True)
    plt.title('rand: uniform')
    plt.show()

    # Histogram of the normal samples
    pdf, bins, patches = plt.hist(normal, bins=20, range=(-4, 4), density=True)
    plt.title('randn: normal')
    plt.show()

Which produces:

[Image: histogram of the rand samples, titled "rand: uniform"]

and

[Image: histogram of the randn samples, titled "randn: normal"]

YaOzI answered Sep 21 '22 18:09