Here is a Python representation of a Neural Network Neuron that I'm trying to understand
import numpy as np

class Network(object):
    def __init__(self, sizes):
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]
Here is my current understanding:
self.num_layers = len(sizes): stores the number of items in sizes.
self.sizes = sizes: assigns the function parameter sizes to the instance attribute self.sizes.
self.biases = [np.random.randn(y, 1) for y in sizes[1:]]: generates a list of arrays whose elements are drawn from the standard normal distribution (indicated by np.random.randn(y, 1)).
What is the following line computing?
self.weights = [np.random.randn(y, x)
                for x, y in zip(sizes[:-1], sizes[1:])]
I'm new to Python. Can this code be used within a Python shell, so I can gain a better understanding by invoking each line separately?
The zip() function pairs up elements from each iterable; zip('foo', 'bar'), for example, would produce [('f', 'b'), ('o', 'a'), ('o', 'r')] (in Python 3, wrap the call in list() to see this list, because zip() returns an iterator there); the elements of the two strings have been paired up into three new tuples.
zip(sizes[:-1], sizes[1:]), then, pairs each element in the sequence sizes with the next element, because you pair up all elements except the last (sizes[:-1]) with all elements except the first (sizes[1:]). This pairs up the first and second elements, then the second and third, and so on, all the way to the last two elements.
For each such pair a random sample is produced, using a list comprehension. So for each x, y pair, a new 2-dimensional numpy matrix is produced, filled with random values, with y rows and x columns.
Note that the biases value only uses sizes[1:], all but the first element, to produce a y-by-1 matrix for each such size.
Quick demo of these concepts:
>>> list(zip('foo', 'bar'))  # in Python 3, zip() returns an iterator, so wrap it in list()
[('f', 'b'), ('o', 'a'), ('o', 'r')]
>>> list(zip('foo', 'bar', 'baz'))  # you can add more sequences
[('f', 'b', 'b'), ('o', 'a', 'a'), ('o', 'r', 'z')]
>>> sizes = [5, 12, 18, 23, 42]
>>> list(zip(sizes[:-1], sizes[1:]))  # a sliding window of pairs
[(5, 12), (12, 18), (18, 23), (23, 42)]
#  0, 1 ..  1, 2  ..  2, 3  ..  3, 4   element indices into sizes
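To see the resulting shapes concretely, here is a minimal sketch that builds the same two lists outside the class. The sizes value [2, 3, 1] is an arbitrary choice made for illustration and does not come from the original code.

```python
import numpy as np

sizes = [2, 3, 1]  # hypothetical layer sizes, chosen only for illustration

# One (y, 1) bias column for every layer after the first
biases = [np.random.randn(y, 1) for y in sizes[1:]]
# One (y, x) weight matrix for every pair of adjacent layers
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

bias_shapes = [b.shape for b in biases]
weight_shapes = [w.shape for w in weights]
print(bias_shapes)    # [(3, 1), (1, 1)]
print(weight_shapes)  # [(3, 2), (1, 3)]
```

The values themselves differ on every run because they are random; only the shapes are fixed by sizes.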
self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] calls the randn function once for each pair x, y produced by zip(sizes[:-1], sizes[1:]), passing the arguments in the order (y, x).
Consider a list l = [1, 2, 3, 4]: l[:-1] returns [1, 2, 3] and l[1:] gives [2, 3, 4].
The zip operation on l[:-1] and l[1:] makes the pairs [(1, 2), (2, 3), (3, 4)]. Each pair is then passed to the randn function.
Of course, you can always type the code into a Python shell; it will give you a better understanding ;)
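The slicing and pairing steps can be checked in a shell with something like the sketch below. The names head, tail and pairs are made up for this example; note that the second slice has to be l[1:], since l[1] is a single element.

```python
l = [1, 2, 3, 4]

head = l[:-1]  # everything except the last element
tail = l[1:]   # everything except the first element
pairs = list(zip(head, tail))  # list() is needed in Python 3 to see the pairs

print(head)   # [1, 2, 3]
print(tail)   # [2, 3, 4]
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
print(l[1])   # 2, a single element rather than a slice
```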
That is what is called a list comprehension. You can create the same effect with a normal for loop:
self.weights = []
for x, y in zip(sizes[:-1], sizes[1:]):
    self.weights.append(np.random.randn(y, x))
Now with that loop, you can see that self.weights is really just a list of np.random.randn(y, x) results, where y and x are taken from each x, y pair in zip(sizes[:-1], sizes[1:]). You can just say that to yourself as you read the list comprehension: self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]. The word order finally makes sense. In case you didn't know, zip is a function that returns a list of tuples of the corresponding elements of its arguments. For example, zip([1, 2, 3, 4], [4, 3, 2, 1]) would return [(1, 4), (2, 3), (3, 2), (4, 1)]. (In Python 3 it actually returns an iterator of tuples rather than a list.)
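One way to convince yourself that the two forms are equivalent is to reseed the random generator before each one and compare the results. This is only a sketch: the sizes value and the variable names weights_comp and weights_loop are assumptions made for the example.

```python
import numpy as np

sizes = [4, 3, 2]  # hypothetical layer sizes, chosen only for illustration

# List comprehension version
np.random.seed(0)
weights_comp = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

# Explicit loop version, reseeded so it draws the same random numbers
np.random.seed(0)
weights_loop = []
for x, y in zip(sizes[:-1], sizes[1:]):
    weights_loop.append(np.random.randn(y, x))

# Both lists hold the same matrices in the same order
same = all(np.array_equal(a, b) for a, b in zip(weights_comp, weights_loop))
print(same)  # True
```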
If you know C++, here is a conversion of self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] to C++ that I made. It uses the Eigen C++ library instead of the NumPy Python library. You call it by typing Weights(weights, sizes); in main(). The parameters of the function Weights are a list of matrices passed by reference (weights) and a vector (sizes). Pass by reference, marked by the '&', means that changes made to weights inside the function are also visible in main(). This differs from pass by value, where the function works on its own copy and the weights in main() stays unchanged. If you are trying to replicate this completely, you will need #include <list>, #include <Eigen/Dense>, using namespace std; and using namespace Eigen;.
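void Weights(list<MatrixXd> &weights, VectorXi sizes){
    int x, y;
    for(int i = 0; i < sizes.rows() - 1; i++){
        y = sizes[i+1]; // sizes[1:]
        x = sizes[i];   // sizes[:-1]
        // Stands in for np.random.randn(y, x); note that Random() samples
        // uniformly from [-1, 1], not from the standard normal distribution
        weights.push_back(MatrixXd::Random(y, x));
    }
}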