
Theano broadcasting different to numpy's

Consider the following example of numpy broadcasting:

import numpy as np
import theano
from theano import tensor as T

xval = np.array([[1, 2, 3], [4, 5, 6]])
bval = np.array([[10, 20, 30]])
print(xval + bval)

As expected, the row vector bval is added to each row of the matrix xval and the output is:

[[11 22 33]
 [14 25 36]]

When I try to replicate the same behaviour in the git version of Theano:

x = T.dmatrix('x')
b = theano.shared(bval)
z = x + b
f = theano.function([x], z)

print(f(xval))

I get the following error:

ValueError: Input dimension mis-match. (input[0].shape[0] = 2, input[1].shape[0] = 1)
Apply node that caused the error: Elemwise{add,no_inplace}(x, <TensorType(int64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(int64, matrix)]
Inputs shapes: [(2, 3), (1, 3)]
Inputs strides: [(24, 8), (24, 8)]
Inputs scalar values: ['not scalar', 'not scalar']

I understand Tensor objects such as x have a broadcastable attribute, but I can't find a way to 1) set this correctly for the shared object or 2) have it correctly inferred. How can I re-implement numpy's behaviour in theano?

asked Oct 26 '14 14:10 by mbatchkarov




1 Answer

Theano needs every broadcastable dimension to be declared in the graph before compilation, whereas NumPy uses the shape information available at run time.

By default, no dimension of a shared variable is broadcastable, because its shape could change.

To create the shared variable with the broadcastable dimension that you need in your example:

b = theano.shared(bval, broadcastable=(True,False))

I'll add this information to the documentation.
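To make the difference concrete, here is a minimal pure-Python sketch of the two rules applied to shape tuples only. It is an illustration under stated assumptions, not Theano's or NumPy's actual implementation: the helper names runtime_shape and declared_shape are hypothetical, and both operands are assumed to have the same number of dimensions.

```python
def runtime_shape(a_shape, b_shape):
    """NumPy-like rule (equal ranks assumed): any length-1 axis may stretch."""
    out = []
    for a, b in zip(a_shape, b_shape):
        if a == b or b == 1:
            out.append(a)
        elif a == 1:
            out.append(b)
        else:
            raise ValueError("shapes do not broadcast: %r vs %r" % (a_shape, b_shape))
    return tuple(out)


def declared_shape(a_shape, a_bcast, b_shape, b_bcast):
    """Declared-pattern rule: only axes flagged broadcastable may stretch.

    Hypothetical model of the behaviour described above; the flags play the
    role of the `broadcastable` pattern fixed before compilation.
    """
    out = []
    for a, a_flag, b, b_flag in zip(a_shape, a_bcast, b_shape, b_bcast):
        if (a_flag and a != 1) or (b_flag and b != 1):
            raise ValueError("an axis declared broadcastable must have length 1")
        if a_flag and not b_flag:
            out.append(b)
        elif b_flag and not a_flag:
            out.append(a)
        elif a == b:
            out.append(a)
        else:
            # A length-1 axis that was not declared broadcastable is an error.
            raise ValueError("Input dimension mis-match: %d vs %d" % (a, b))
    return tuple(out)


# Run-time rule: (2, 3) + (1, 3) is accepted.
print(runtime_shape((2, 3), (1, 3)))

# Declared rule with the default pattern (nothing broadcastable): rejected.
try:
    declared_shape((2, 3), (False, False), (1, 3), (False, False))
except ValueError as err:
    print("error:", err)

# Declared rule with the first axis of b flagged, as in broadcastable=(True, False).
print(declared_shape((2, 3), (False, False), (1, 3), (True, False)))
```

The first and last calls both yield the shape (2, 3); the middle call fails in the same way as the question's example, because the length-1 axis of b was never declared broadcastable.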

answered Sep 27 '22 17:09 by nouiz