I have two vectors v and w and I want to make a matrix m out of them such that:
m[i, j] = v[i] * w[j]
In other words, I want to calculate their outer product. I can do it either with theano.tensor.outer or by adding new axes to v and w and using the dot product:
m = T.dot(v[:,numpy.newaxis], w[numpy.newaxis,:])
Now I am trying to solve a slightly more general problem. Instead of two vectors v and w, I have two matrices (I call them v and w again), and I would like to calculate the outer product of each row of matrix v with the corresponding row of matrix w (the i-th row of the first matrix should be multiplied with the i-th row of the second matrix). So I would like to do something like this:
m1 = T.tensordot(v[:,:, numpy.newaxis], w[:,:,numpy.newaxis], axes = [[2],[2]])
m[i, j, k] = m1[i, k, j, k]
In other words, m[:,:,k] is the matrix corresponding to the outer product of the k-th row of matrix v and the k-th row of matrix w.
I see two problems with the above "solution". First, it is not really a solution, since the second line of the code is not valid Theano code. So my first question is how to do this kind of "advanced slicing" that forces some indices to be equal, for example m[i, k] = a[i, k, i, i, k]. Second, I do not like that I first create a 4D tensor (m1) from two 2D tensors and then reduce it back to a 3D tensor. This can consume a lot of memory, and I suspect it can be avoided.
We need to introduce broadcastable dimensions into the two input matrices with dimshuffle and then let broadcasting take care of the elementwise multiplication, which yields the outer product between their corresponding rows.
Thus, with V and W as the Theano matrices, simply do:
V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)
In NumPy, we have np.newaxis to extend dimensions and np.transpose() to permute dimensions. In Theano, dimshuffle does both of these tasks, taking a mix of dimension IDs and 'x' entries that introduce new broadcastable axes.
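As a rough NumPy analogue of the dimshuffle expression (a sketch for illustration only, not Theano code; the array names, the seed, and the shapes are arbitrary assumptions), the same broadcasting pattern can be written with np.newaxis and checked against an explicit loop of np.outer:

```python
import numpy as np

# Hypothetical inputs: 3 rows each, with 4 and 5 columns respectively.
rng = np.random.default_rng(0)
v = rng.integers(11, 99, size=(3, 4))
w = rng.integers(11, 99, size=(3, 5))

# v[:, :, np.newaxis] has shape (3, 4, 1), like V.dimshuffle(0, 1, 'x').
# w[:, np.newaxis, :] has shape (3, 1, 5), like W.dimshuffle(0, 'x', 1).
# Broadcasting expands both to (3, 4, 5), so out[i, j, k] = v[i, j] * w[i, k].
out = v[:, :, np.newaxis] * w[:, np.newaxis, :]

# Reference: one np.outer call per pair of corresponding rows.
expected = np.stack([np.outer(v[i], w[i]) for i in range(v.shape[0])])

print(out.shape)                      # (3, 4, 5)
print(np.array_equal(out, expected))  # True
```

The product is formed directly as a 3D array, so no 4D intermediate of the kind described in the question is materialized.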
Sample run
1) Inputs :
# NumPy arrays (assumes `import numpy as np`)
In [121]: v = np.random.randint(11,99,(3,4))
...: w = np.random.randint(11,99,(3,5))
...:
# Perform outer product on corresponding rows in inputs
In [122]: for i in range(v.shape[0]):
     ...:     print(np.outer(v[i],w[i]))
     ...:
[[2726 1972 1740 2117 1972]
[8178 5916 5220 6351 5916]
[7520 5440 4800 5840 5440]
[8648 6256 5520 6716 6256]]
[[8554 3458 8918 4186 4277]
[1786 722 1862 874 893]
[8084 3268 8428 3956 4042]
[2444 988 2548 1196 1222]]
[[2945 2232 1209 372 682]
[2565 1944 1053 324 594]
[7125 5400 2925 900 1650]
[6840 5184 2808 864 1584]]
2) Theano part :
# Get the Theano matrix versions (assumes `import theano.tensor as T` and `from theano import function`)
In [123]: V = T.matrix('v')
...: W = T.matrix('w')
...:
# Use proposed code
In [124]: OUT = V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)
# Create a function out of it and then use on input NumPy arrays
In [125]: f = function([V,W], OUT)
3) Verify results :
In [126]: f(v,w) # Verify results against the earlier loopy results
Out[126]:
array([[[ 2726., 1972., 1740., 2117., 1972.],
[ 8178., 5916., 5220., 6351., 5916.],
[ 7520., 5440., 4800., 5840., 5440.],
[ 8648., 6256., 5520., 6716., 6256.]],
[[ 8554., 3458., 8918., 4186., 4277.],
[ 1786., 722., 1862., 874., 893.],
[ 8084., 3268., 8428., 3956., 4042.],
[ 2444., 988., 2548., 1196., 1222.]],
[[ 2945., 2232., 1209., 372., 682.],
[ 2565., 1944., 1053., 324., 594.],
[ 7125., 5400., 2925., 900., 1650.],
[ 6840., 5184., 2808., 864., 1584.]]])
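Note that this result is indexed as OUT[i, j, k] = v[i, j] * w[i, k], with the row index first, whereas the question asks for the row index last (m[:, :, k]). Below is a minimal NumPy sketch of moving that axis, using small made-up inputs; in Theano the analogous permutation should be OUT.dimshuffle(1, 2, 0), though that is an assumption not verified in the run above.

```python
import numpy as np

# Small made-up inputs: 2 rows each, with 3 and 4 columns respectively.
v = np.arange(6).reshape(2, 3)
w = np.arange(8).reshape(2, 4) + 1

# Row-wise outer products with the row index first: shape (2, 3, 4).
out = v[:, :, np.newaxis] * w[:, np.newaxis, :]

# Move the row index to the last axis: shape (3, 4, 2).
m = np.transpose(out, (1, 2, 0))

# m[:, :, k] is now the outer product of the k-th rows of v and w.
for k in range(v.shape[0]):
    assert np.array_equal(m[:, :, k], np.outer(v[k], w[k]))

print(m.shape)  # (3, 4, 2)
```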