Python beginner, understanding some code

Question

Here is a Python representation of a neural network that I'm trying to understand:

import numpy as np

class Network(object):

    def __init__(self, sizes):
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

Here is my current understanding:

self.num_layers = len(sizes): store the number of items in sizes
self.sizes = sizes: assign the function parameter sizes to the instance attribute self.sizes
self.biases = [np.random.randn(y, 1) for y in sizes[1:]]: generate a list of arrays of elements drawn from the standard normal distribution (indicated by np.random.randn(y, 1))

What is the following line computing?

self.weights = [np.random.randn(y, x)
    for x, y in zip(sizes[:-1], sizes[1:])]

I'm new to Python. Can this code be run in a Python shell, so that I can gain a better understanding by invoking each line separately?

Martijn Pieters · Accepted Answer

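The zip() function pairs up elements from each iterable; list(zip('foo', 'bar')), for example, produces [('f', 'b'), ('o', 'a'), ('o', 'r')]: the elements of the two strings have been paired up into three new tuples. (In Python 2, zip() returns this list directly; in Python 3 it returns an iterator, hence the list() call.)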

zip(sizes[:-1], sizes[1:]) then, creates pairs of elements in the sequence sizes with the next element, because you pair up all elements except the last (sizes[:-1]) with all elements except the first (sizes[1:]). This pairs up the first and second element together, then the second and third, etc. all the way to the last two elements.

For each such pair a random sample is produced, using a list comprehension. So for each x, y pair, a new 2-dimensional NumPy array with y rows and x columns is produced, filled with values drawn from the standard normal distribution.

Note that the biases value only uses sizes[1:], all but the first, to produce y-by-1 matrices for each such size.
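To make the shapes concrete, here is a minimal sketch that can be pasted into a Python shell. The sizes list [2, 3, 1] is a made-up example, not a value from the question, and the two list comprehensions are lifted out of the class so they can be run on their own.

```python
import numpy as np

# Hypothetical layer sizes: 2 inputs, 3 hidden neurons, 1 output.
sizes = [2, 3, 1]

# Same expressions as in Network.__init__, without the self. prefix.
biases = [np.random.randn(y, 1) for y in sizes[1:]]
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

bias_shapes = [b.shape for b in biases]
weight_shapes = [w.shape for w in weights]

print(bias_shapes)    # [(3, 1), (1, 1)]
print(weight_shapes)  # [(3, 2), (1, 3)]
```

Each weight array has as many rows as the layer it feeds into (y) and as many columns as the layer it comes from (x); there is one bias array per layer after the first.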

Quick demo of these concepts:

>>> list(zip('foo', 'bar'))  # list() is needed on Python 3, where zip() returns an iterator
[('f', 'b'), ('o', 'a'), ('o', 'r')]
>>> list(zip('foo', 'bar', 'baz'))  # you can add more sequences
[('f', 'b', 'b'), ('o', 'a', 'a'), ('o', 'r', 'z')]
>>> sizes = [5, 12, 18, 23, 42]
>>> list(zip(sizes[:-1], sizes[1:]))  # a sliding window of pairs
[(5, 12), (12, 18), (18, 23), (23, 42)]
# 0, 1 ..  1,  2 ..  2,  3 ..  3,  4   element indices into sizes
>>>
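
One thing worth knowing when trying this on Python 3: zip() returns a lazy iterator rather than a list, so it can only be walked once. A small sketch of what that means in practice, reusing the sizes list from the demo (the variable names are only for illustration):

```python
sizes = [5, 12, 18, 23, 42]

pairs = zip(sizes[:-1], sizes[1:])  # Python 3: an iterator, nothing is computed yet
first_pass = list(pairs)            # walking the iterator consumes it
second_pass = list(pairs)           # nothing is left the second time

print(first_pass)   # [(5, 12), (12, 18), (18, 23), (23, 42)]
print(second_pass)  # []
```

This does not affect the list comprehension in the question, which iterates over the zip() result exactly once.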

Alexis Clarembeau · Answer

self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] calls np.random.randn(y, x) once for each pair x, y produced by zip(sizes[:-1], sizes[1:]).

If we consider a list l = [1, 2, 3, 4], then l[:-1] returns [1, 2, 3] and l[1:] returns [2, 3, 4]. The zip operation on l[:-1] and l[1:] makes the pairs [(1, 2), (2, 3), (3, 4)]. Each pair is then passed to the randn function.

Of course, you can always type code into a Python shell; it will give you a better understanding ;)
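
For instance, the slicing and pairing steps can be tried one at a time. This is just a sketch using the same example list; the intermediate variable names are made up for readability.

```python
l = [1, 2, 3, 4]

all_but_last = l[:-1]   # every element except the last one
all_but_first = l[1:]   # every element except the first one

# list() makes the result printable on Python 3, where zip() is lazy.
pairs = list(zip(all_but_last, all_but_first))

print(all_but_last)   # [1, 2, 3]
print(all_but_first)  # [2, 3, 4]
print(pairs)          # [(1, 2), (2, 3), (3, 4)]
```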

zondo · Answer

That is what is called a list comprehension. You can create the same effect with a normal for loop:

self.weights = []
for x, y in zip(sizes[:-1], sizes[1:]):
    self.weights.append(np.random.randn(y, x))

Now with that loop, you can see that self.weights is really just a list of np.random.randn(y, x) results, where x and y are taken from each pair in zip(sizes[:-1], sizes[1:]). You can say exactly that to yourself as you read the list comprehension: self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]. The word order finally makes sense.

In case you didn't know, zip is a function that pairs up the corresponding elements of its arguments into tuples. For example, zip([1, 2, 3, 4], [4, 3, 2, 1]) yields (1, 4), (2, 3), (3, 2), (4, 1). In Python 2 it returns these as a list; in Python 3 it returns an iterator of tuples, so wrap it in list() if you want to see them all at once.
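
If you want to convince yourself that the two forms are equivalent, here is a small sketch. The sizes list and the fixed seed are assumptions made only so that both runs draw the same random numbers.

```python
import numpy as np

sizes = [2, 3, 1]  # hypothetical layer sizes

# List comprehension form.
np.random.seed(0)  # fixed seed so both versions see the same random stream
weights_comprehension = [np.random.randn(y, x)
                         for x, y in zip(sizes[:-1], sizes[1:])]

# Explicit for-loop form.
np.random.seed(0)
weights_loop = []
for x, y in zip(sizes[:-1], sizes[1:]):
    weights_loop.append(np.random.randn(y, x))

same = all(np.array_equal(a, b)
           for a, b in zip(weights_comprehension, weights_loop))
print(same)  # True
```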

Zac Schulwolf · Answer

If you know C++, here is a conversion of self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] to C++ that I made. It uses the Eigen C++ library instead of the NumPy Python library. You call it by writing Weights(weights, sizes); in main(). The function Weights takes a list of matrices (weights), passed by reference, and a vector (sizes). Pass by reference, marked by the '&', means that changes made to weights inside the function are also visible to the caller in main(). With pass by value, the function would only modify its own copy of weights. If you want to reproduce this completely, you will need #include <list>, #include <Eigen/Dense>, using namespace std; and using namespace Eigen;.

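void Weights(list<MatrixXd> &weights, VectorXi sizes){
    int x,y;
    for(int i=0; i < sizes.rows()-1;i++){
        y=sizes[i+1]; //sizes[1:]
        x=sizes[i]; //sizes[:-1]
        // Stands in for np.random.randn(y,x). Note that MatrixXd::Random draws
        // uniformly from [-1, 1], whereas randn draws from a standard normal.
        weights.push_back(MatrixXd::Random(y,x));
    }
}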
