Tensorflow custom layer: Creating a sparse matrix with trainable parameters

A model that I am working on needs to predict a large number of variables simultaneously (>1000). I would therefore like to have a small neural network at the end of the network for each output.

In order to do this compactly, I would like to find a way to create a sparse trainable connection between two layers in the neural network within the Tensorflow framework.

Only a small portion of the connection matrix should be trainable: just the parameters on the block diagonal.


For example, the connection matrix is block diagonal:

[Image: block-diagonal connection matrix]

The trainable parameters should be in the place of the 1's.
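
For illustration (a sketch in place of the image; the block sizes here are made up), such a 0/1 block-diagonal pattern can be built with scipy.linalg.block_diag:

import numpy as np
from scipy.linalg import block_diag

#3 output heads, each with its own 2x2 block of trainable connections
mask = block_diag(*[np.ones((2, 2), dtype=int) for _ in range(3)])
print(mask)
# [[1 1 0 0 0 0]
#  [1 1 0 0 0 0]
#  [0 0 1 1 0 0]
#  [0 0 1 1 0 0]
#  [0 0 0 0 1 1]
#  [0 0 0 0 1 1]]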

asked by Marko Karbevski on Oct 16 '22

2 Answers

Edit: the comment on this answer was, "Is this a trainable object though?"

The answer: no. You currently cannot make a sparse matrix trainable. Instead, you can use a mask matrix (see the end of this answer).

But if you need to work with a sparse matrix, you just have to use tf.sparse.sparse_dense_matmul() or tf.sparse_tensor_to_dense() wherever the sparse matrix interacts with a dense one. I have taken a simple XOR example from here and replaced one of the dense weight matrices with a sparse one:

#Declaring necessary modules
import tensorflow as tf
import numpy as np
"""
A simple numpy implementation of a XOR gate to understand the backpropagation
algorithm
"""

x = tf.placeholder(tf.float32,shape = [4,2],name = "x")
#declaring a placeholder for input x
y = tf.placeholder(tf.float32,shape = [4,1],name = "y")
#declaring a placeholder for the desired output y

m = np.shape(x)[0]#number of training examples
n = np.shape(x)[1]#number of features
hidden_s = 2 #number of nodes in the hidden layer
l_r = 1#learning rate initialization

theta1 = tf.SparseTensor(indices=[[0, 0],[0, 1], [1, 1]], values=[0.1, 0.2, 0.1], dense_shape=[3, 2])
#theta1 = tf.cast(tf.Variable(tf.random_normal([3,hidden_s]),name = "theta1"),tf.float64)
theta2 = tf.cast(tf.Variable(tf.random_normal([hidden_s+1,1]),name = "theta2"),tf.float32)

#conducting forward propagation
a1 = tf.concat([np.c_[np.ones(x.shape[0])].astype(np.float32),x],1)
#prepend a bias column of ones to the input of the first layer

#z1 = tf.sparse_tensor_dense_matmul(theta1, a1)  #the sparse operand must come first, so the shapes would need transposing here

z1 = tf.matmul(a1,tf.sparse_tensor_to_dense(theta1))
#the sparse weights are densified so they can take part in an ordinary matmul

a2 = tf.concat([np.c_[np.ones(x.shape[0])].astype(np.float32),tf.sigmoid(z1)],1)
#the input of the second layer is the output of the first layer, passed through the activation, with a bias column prepended

z3 = tf.matmul(a2,theta2)
#the second layer's pre-activation

h3 = tf.sigmoid(z3)
#the output is passed through the activation function to obtain the final probability

cost_func = -tf.reduce_sum(y*tf.log(h3)+(1-y)*tf.log(1-h3),axis = 1)
#binary cross-entropy cost

#built-in tensorflow optimizer that performs gradient descent with the specified learning rate

optimiser = tf.train.GradientDescentOptimizer(learning_rate = l_r).minimize(cost_func)

#setting required X and Y values to perform XOR operation
X = [[0,0],[0,1],[1,0],[1,1]]
Y = [[0],[1],[1],[0]]

#initializing all variables and creating a tensorflow session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

#running gradient descent for each iteration
for i in range(200):
   sess.run(optimiser, feed_dict = {x:X,y:Y})#setting placeholder values using feed_dict
   if i%100==0:
      print("Epoch:",i)
      print(sess.run(theta1))

and the output is:

Epoch: 0
SparseTensorValue(indices=array([[0, 0],
       [0, 1],
       [1, 1]]), values=array([0.1, 0.2, 0.1], dtype=float32), dense_shape=array([3, 2]))
Epoch: 100
SparseTensorValue(indices=array([[0, 0],
       [0, 1],
       [1, 1]]), values=array([0.1, 0.2, 0.1], dtype=float32), dense_shape=array([3, 2]))

So the only way is to use a mask matrix. You can apply it either by element-wise multiplication or with tf.where:

1) Multiplication: create a mask matrix of the desired shape and multiply it element-wise with your weight matrix:

mask = tf.Variable([[1.,0.,0.],[0.,1.,0.],[0.,0.,1.]],name ='mask', trainable=False)
weight = tf.Variable(tf.random_normal([3,3]),name ='weight')
desired_tensor = weight * mask  #element-wise product zeroes everything outside the block diagonal
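
Because the mask is constant, the gradient reaching each zeroed position is also zero, so those entries of weight never move away from their initialization; only the positions where the mask is 1 are effectively trained.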

2) tf.where

mask = tf.Variable([[1,0,0],[0,1,0],[0,0,1]],name ='mask', trainable=False)
weight = tf.Variable(tf.random_normal([3,3]),name ='weight')
desired_tensor = tf.where(mask > 0, weight, tf.zeros_like(weight))  #keep weights on the block diagonal, zeros elsewhere
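
As a quick sanity check (a minimal sketch of my own, using the same TF1-style API as the code above), you can train against a masked weight and confirm that only the block-diagonal entries change:

import numpy as np
import tensorflow as tf

mask = tf.constant(np.eye(3, dtype=np.float32))  #0/1 connectivity pattern
weight = tf.Variable(tf.random_normal([3, 3]), name='weight')
masked = weight * mask  #only the diagonal entries reach the loss

loss = tf.reduce_sum(tf.square(masked - 5.0 * mask))  #toy objective: push masked entries toward 5
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    before = sess.run(weight)
    for _ in range(100):
        sess.run(train_op)
    after = sess.run(weight)

off_diag = 1 - np.eye(3)
print(np.allclose(before * off_diag, after * off_diag))  #True: the off-diagonal entries never trained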

Hope it helps


You can do that by using sparse tensors like so:

SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])

which corresponds to the dense matrix:

[[1, 0, 0, 0]
 [0, 0, 2, 0]
 [0, 0, 0, 0]]
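
To materialize that dense form programmatically (assuming TF 2.x eager mode), tf.sparse.to_dense does the conversion:

import tensorflow as tf

st = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
print(tf.sparse.to_dense(st).numpy())
# [[1 0 0 0]
#  [0 0 2 0]
#  [0 0 0 0]]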

You can read more in the documentation for sparse tensors here:

https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor

Hope it helps!

answered by eugen on Nov 15 '22


I have written exactly such a layer:

https://github.com/ArnovanHilten/GenNet/blob/master/GenNet_utils/LocallyDirectedConnected_tf2.py

It takes a sparse matrix as input and lets you decide how the layers are connected. The layer uses sparse tensors and matrix multiplications.
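
If you prefer to roll your own, a minimal sketch of the same idea in tf.keras (my illustration, not the GenNet implementation) is a custom layer whose trainable kernel is multiplied element-wise by a fixed 0/1 connectivity mask:

import numpy as np
import tensorflow as tf

class MaskedDense(tf.keras.layers.Layer):
    """Dense layer whose kernel is constrained by a fixed 0/1 connectivity mask."""

    def __init__(self, units, mask, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.mask = tf.constant(mask, dtype=tf.float32)  #shape: (input_dim, units)

    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[-1], self.units),
                                      initializer='glorot_uniform', trainable=True)
        self.bias = self.add_weight(name='bias', shape=(self.units,),
                                    initializer='zeros', trainable=True)

    def call(self, inputs):
        #masked-out entries also receive zero gradient, so only the
        #unmasked weights are effectively trainable
        return tf.matmul(inputs, self.kernel * self.mask) + self.bias

#example: 3 blocks of size 2x2 on the diagonal, as in the question
mask = np.kron(np.eye(3, dtype=np.float32), np.ones((2, 2), dtype=np.float32))
layer = MaskedDense(6, mask)
out = layer(tf.zeros((1, 6), tf.float32))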

answered by Arno on Nov 15 '22