Partitioned matrix multiplication in TensorFlow or PyTorch

Assume I have a matrix P of size [4, 4] that is partitioned (blocked) into 4 smaller [2, 2] matrices. How can I efficiently multiply this block matrix element-wise by another, smaller matrix (not a partitioned matrix)?

Let's assume our original matrix is:

P = [ 1 1 2 2
      1 1 2 2
      3 3 4 4
      3 3 4 4]

which is split into the submatrices:

P_1 = [1 1   P_2 = [2 2   P_3 = [3 3   P_4 = [4 4
       1 1]         2 2]         3 3]         4 4]

Now our P is:

P = [P_1 P_2
     P_3 P_4]

In the next step, I want to do element-wise multiplication between P and a smaller matrix whose size equals the number of sub-matrices:

P * [1 0   =  [P_1 0   =  [1 1 0 0
     0 0]      0   0]      1 1 0 0
                           0 0 0 0
                           0 0 0 0]
asked Jun 14 '19 by Amir


2 Answers

You can represent your large block matrix in a more efficient way.

For instance, a block matrix

P = [ 1 1 2 2
      1 1 2 2
      3 3 4 4
      3 3 4 4]

can be represented using

a = [ 1 0    b = [ 1 1 0 0    p = [ 1 2
      1 0          0 0 1 1 ]        3 4 ]
      0 1
      0 1 ]

as

P = a @ p @ b

with @ denoting matrix multiplication. Matrices a and b encode the block structure of P, and the small p holds the value of each block.

Now, if you want to multiply p element-wise with a small (2x2) matrix q, you simply compute

a @ (p * q) @ b

A simple PyTorch example:

In [1]: import torch

In [2]: a = torch.tensor([[1., 0], [1., 0], [0., 1], [0., 1]])
In [3]: b = torch.tensor([[1., 1., 0, 0], [0, 0, 1., 1]])
In [4]: p = torch.tensor([[1., 2.], [3., 4.]])
In [5]: q = torch.tensor([[1., 0], [0., 0]])

In [6]: a @ p @ b
Out[6]:
tensor([[1., 1., 2., 2.],
        [1., 1., 2., 2.],
        [3., 3., 4., 4.],
        [3., 3., 4., 4.]])

In [7]: a @ (p * q) @ b
Out[7]:
tensor([[1., 1., 0., 0.],
        [1., 1., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

I leave it to you as an exercise to efficiently produce the "structure" matrices a and b given the sizes of the blocks.
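
For what it's worth, one possible construction for uniform block sizes is sketched below (my own addition, not part of the answer; the helper name structure_matrices is made up), using torch.eye and Tensor.repeat_interleave:

import torch

def structure_matrices(p_shape, block_shape):
    # p_shape: size of the large matrix P; block_shape: size of each block.
    # Assumes p_shape is divisible by block_shape on both axes.
    P_rows, P_cols = p_shape
    b_rows, b_cols = block_shape
    # a repeats each row of the small p b_rows times: shape (P_rows, P_rows // b_rows)
    a = torch.eye(P_rows // b_rows).repeat_interleave(b_rows, dim=0)
    # b repeats each column of the small p b_cols times: shape (P_cols // b_cols, P_cols)
    b = torch.eye(P_cols // b_cols).repeat_interleave(b_cols, dim=1)
    return a, b

a, b = structure_matrices((4, 4), (2, 2))  # reproduces the a and b used above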

answered Nov 13 '22 by Shai


The following is a general TensorFlow-based solution that works for input matrices p (large) and m (small) of arbitrary shapes, as long as each dimension of p is divisible by the corresponding dimension of m.

def block_mul(p, m):
    p_x, p_y = p.shape
    m_x, m_y = m.shape
    # Insert a singleton axis after each axis of m ...
    m_4d = tf.reshape(m, (m_x, 1, m_y, 1))
    # ... then broadcast each element of m over a (p_x // m_x, p_y // m_y) block.
    m_broadcasted = tf.broadcast_to(m_4d, (m_x, p_x // m_x, m_y, p_y // m_y))
    # Collapse back to the shape of p and multiply element-wise.
    mp = tf.reshape(m_broadcasted, (p_x, p_y))
    return p * mp

Test:

import tensorflow as tf

tf.enable_eager_execution()

p = tf.reshape(tf.constant(range(36)), (6, 6))
m = tf.reshape(tf.constant(range(9)), (3, 3))
print(f"p:\n{p}\n")
print(f"m:\n{m}\n")
print(f"block_mul(p, m):\n{block_mul(p, m)}")

Output (Python 3.7.3, TensorFlow 1.13.1):

p:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]

m:
[[0 1 2]
 [3 4 5]
 [6 7 8]]

block_mul(p, m):
[[  0   0   2   3   8  10]
 [  0   0   8   9  20  22]
 [ 36  39  56  60  80  85]
 [ 54  57  80  84 110 115]
 [144 150 182 189 224 232]
 [180 186 224 231 272 280]]

Another solution that uses implicit broadcasting is the following:

def block_mul2(p, m):
    p_x, p_y = p.shape
    m_x, m_y = m.shape
    # Reshape p so that each block spans axes 1 and 3 ...
    p_4d = tf.reshape(p, (m_x, p_x // m_x, m_y, p_y // m_y))
    # ... and let broadcasting expand m over those block axes.
    m_4d = tf.reshape(m, (m_x, 1, m_y, 1))
    return tf.reshape(p_4d * m_4d, (p_x, p_y))
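
As a quick sanity check (my addition, reusing the p and m from the test above), both implementations should agree:

p = tf.reshape(tf.constant(range(36)), (6, 6))
m = tf.reshape(tf.constant(range(9)), (3, 3))
# Element-wise comparison of the two implementations.
assert bool(tf.reduce_all(tf.equal(block_mul(p, m), block_mul2(p, m))))
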
answered Nov 13 '22 by GZ0