
How to apply outer product for tensors without unnecessary increase of dimensions?

I have two vectors v and w and I want to make a matrix m out of them such that:

m[i, j] = v[i] * w[j]

In other words, I want to calculate their outer product. I can do it either by using theano.tensor.outer or by adding new axes to v and w and using the dot product.

m = T.dot(v[:,numpy.newaxis], w[numpy.newaxis,:])
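For reference, the two formulations agree in plain NumPy; a quick sketch:

```python
import numpy as np

v = np.array([1., 2., 3.])
w = np.array([4., 5.])

# Outer product directly
m_outer = np.outer(v, w)

# Same result via the newaxis + dot formulation:
# (3, 1) dot (1, 2) -> (3, 2)
m_dot = np.dot(v[:, np.newaxis], w[np.newaxis, :])

assert np.array_equal(m_outer, m_dot)
```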

Now, I am trying to solve a slightly more general problem. Instead of two vectors v and w, I have two matrices (again called v and w), and I would like to calculate the outer product of each row of matrix v with the corresponding row of matrix w (the i_th row of the first matrix should be multiplied with the i_th row of the second matrix). So, I would like to do something like this:

m1 = T.tensordot(v[:,:, numpy.newaxis], w[:,:,numpy.newaxis], axes = [[2],[2]])
m[i, j, k] = m1[i, k, j, k]

In other words, m[:,:,k] is the matrix corresponding to outer product of k_th row from the matrix v and k_th row of the matrix w.

I see two problems with the above "solution". First, it is not really a solution, since the second line is not valid Theano code. So, my first question is how to do this "advanced slicing" by forcing some indexes to be equal, for example m[i, k] = a[i, k, i, i, k]. Second, I do not like the fact that I first create a 4D tensor (m1) from two 2D tensors and then reduce it back to a 3D tensor. That can be very memory consuming, and I guess one can avoid it.
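As a point of comparison, in plain NumPy the row-wise outer products can be computed without any 4D intermediate, e.g. with einsum (note the result here is indexed out[k, i, j] rather than m[i, j, k], i.e. the batch axis comes first):

```python
import numpy as np

v = np.random.randint(11, 99, (3, 4))
w = np.random.randint(11, 99, (3, 5))

# Row-wise outer products: out[k] == np.outer(v[k], w[k]); shape (3, 4, 5)
out = np.einsum('ki,kj->kij', v, w)

# Check against an explicit loop over rows
for k in range(v.shape[0]):
    assert np.array_equal(out[k], np.outer(v[k], w[k]))
```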

asked Feb 07 '17 16:02 by Roman
1 Answer

We need to introduce broadcastable dimensions into the two input matrices with dimshuffle and then let broadcasting take care of the elementwise multiplication, resulting in an outer product between corresponding rows.

Thus, with V and W as the Theano matrices, simply do:

V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)

In NumPy, we have np.newaxis to extend dimensions and np.transpose() to permute them. In Theano, dimshuffle does both of these tasks with a mix of dimension IDs and 'x's for introducing new broadcastable axes.
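The NumPy equivalent of the dimshuffle expression above is plain broadcasting with np.newaxis:

```python
import numpy as np

V = np.random.rand(3, 4)
W = np.random.rand(3, 5)

# V.dimshuffle(0, 1, 'x') corresponds to V[:, :, np.newaxis] -> shape (3, 4, 1)
# W.dimshuffle(0, 'x', 1) corresponds to W[:, np.newaxis, :] -> shape (3, 1, 5)
out = V[:, :, np.newaxis] * W[:, np.newaxis, :]   # shape (3, 4, 5)

# out[k] is the outer product of row k of V with row k of W
assert np.allclose(out[0], np.outer(V[0], W[0]))
```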

Sample run

1) Inputs:

# Numpy arrays
In [121]: v = np.random.randint(11,99,(3,4))
     ...: w = np.random.randint(11,99,(3,5))
     ...: 

# Perform outer product on corresponding rows in inputs
In [122]: for i in range(v.shape[0]):
     ...:     print(np.outer(v[i],w[i]))
     ...:     
[[2726 1972 1740 2117 1972]
 [8178 5916 5220 6351 5916]
 [7520 5440 4800 5840 5440]
 [8648 6256 5520 6716 6256]]
[[8554 3458 8918 4186 4277]
 [1786  722 1862  874  893]
 [8084 3268 8428 3956 4042]
 [2444  988 2548 1196 1222]]
[[2945 2232 1209  372  682]
 [2565 1944 1053  324  594]
 [7125 5400 2925  900 1650]
 [6840 5184 2808  864 1584]]

2) Theano part:

# Get to theano : create the theano matrix versions
# (assumes: import theano.tensor as T; from theano import function)
In [123]: V = T.matrix('v')
     ...: W = T.matrix('w')
     ...: 

# Use proposed code
In [124]: OUT = V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)

# Create a function out of it and then use on input NumPy arrays
In [125]: f = function([V,W], OUT)

3) Verify results:

In [126]: f(v,w)    # Verify results against the earlier loopy results
Out[126]: 
array([[[ 2726.,  1972.,  1740.,  2117.,  1972.],
        [ 8178.,  5916.,  5220.,  6351.,  5916.],
        [ 7520.,  5440.,  4800.,  5840.,  5440.],
        [ 8648.,  6256.,  5520.,  6716.,  6256.]],

       [[ 8554.,  3458.,  8918.,  4186.,  4277.],
        [ 1786.,   722.,  1862.,   874.,   893.],
        [ 8084.,  3268.,  8428.,  3956.,  4042.],
        [ 2444.,   988.,  2548.,  1196.,  1222.]],

       [[ 2945.,  2232.,  1209.,   372.,   682.],
        [ 2565.,  1944.,  1053.,   324.,   594.],
        [ 7125.,  5400.,  2925.,   900.,  1650.],
        [ 6840.,  5184.,  2808.,   864.,  1584.]]])
answered Oct 23 '22 14:10 by Divakar