Code:
x = tf.constant([1.,2.,3.], shape = (3,2,4))
y = tf.constant([1.,2.,3.], shape = (3,21,4))
tf.matmul(x,y) # Doesn't work.
tf.matmul(x,y,transpose_b = True) # This works. Shape is (3,2,21)
tf.matmul(x,tf.transpose(y)) # Doesn't work.
I want to know what shape y becomes inside tf.matmul(x, y, transpose_b=True), so I can work out what is really going on inside an LSTM with attention.
Transpose is defined differently for tensors of rank > 2, and the difference here is in which axes are transposed by tf.transpose versus tf.matmul(..., transpose_b=True).
By default, tf.transpose does this: the returned tensor's dimension i will correspond to the input dimension perm[i]. If perm is not given, it is set to (n-1...0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input tensors.
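A quick check of that default, sketched with NumPy (whose np.transpose follows the same reversed-axes default as tf.transpose):

```python
import numpy as np

# With no perm/axes argument, np.transpose (like tf.transpose) reverses
# ALL axes, so a rank-3 tensor of shape (3, 21, 4) becomes (4, 21, 3).
y = np.zeros((3, 21, 4), dtype=np.float32)
print(np.transpose(y).shape)  # (4, 21, 3)
```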
So in your case, it's going to transform y into a tensor of shape (4, 21, 3), which is not compatible with x (see below).
But if you set perm=[0, 2, 1], only the last two axes are swapped, so y becomes (3, 4, 21) -- the same shape transpose_b=True produces internally -- and the result is compatible:
# Works! (3, 2, 4) * (3, 4, 21) -> (3, 2, 21).
tf.matmul(x, tf.transpose(y, [0, 2, 1]))
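You can verify this equivalence with NumPy, assuming (as holds for these shapes) that np.matmul batches over the leading dimension the same way tf.matmul does:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 2, 4))
y = rng.standard_normal((3, 21, 4))

# Swapping only the last two axes turns y into (3, 4, 21),
# which is what transpose_b=True does before multiplying.
y_t = np.transpose(y, (0, 2, 1))
out = np.matmul(x, y_t)
print(out.shape)  # (3, 2, 21)
```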
tf.matmul

You can compute the dot product: (a, b, c) * (a, c, d) -> (a, b, d). But it's not a tensor dot product -- it's a batch operation (see this question). In this case, a is considered a batch size, so tf.matmul computes a dot products of matrices (b, c) * (c, d).
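The batch behaviour can be sketched with NumPy, which follows the same convention: the leading dimension indexes a independent 2-D matrix products.

```python
import numpy as np

a, b, c, d = 3, 2, 4, 5
rng = np.random.default_rng(0)
x = rng.standard_normal((a, b, c))
y = rng.standard_normal((a, c, d))

out = np.matmul(x, y)  # shape (a, b, d)

# Same result as a separate 2-D matmuls, one per batch index.
ref = np.stack([x[i] @ y[i] for i in range(a)])
print(out.shape, np.allclose(out, ref))  # (3, 2, 5) True
```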
A batch can span more than one dimension, so this is also valid:
(a, b, c, d) * (a, b, d, e) -> (a, b, c, e)
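For example (again sketched with NumPy, which treats leading dimensions as batch dimensions the same way):

```python
import numpy as np

x = np.ones((2, 3, 4, 5))
y = np.ones((2, 3, 5, 6))
# Both leading dims (2, 3) act as batch dims; matmul applies to the last two.
print(np.matmul(x, y).shape)  # (2, 3, 4, 6)
```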