 

Make input feature maps from expansion tensor in Keras

I am using a Taylor expansion in an image classification task. Basically, a pixel vector is generated from the RGB image, and each pixel value from that vector is approximated with the Taylor series expansion of sin(x). In my TensorFlow implementation, I ran into problems when trying to create the feature maps by stacking the tensor with the expansion terms. Can anyone suggest how to make my current attempt work and be more efficient? Any thoughts are welcome.

Here are the expansion terms of the Taylor series of sin(x):

[image: Taylor expansion of sin(x)]
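With two terms this is sin(x) ≈ x - x^3/6, which is where the c = [1, -1/6] and power = [1, 3] in my attempt below come from. A quick NumPy check of that approximation (just for illustration):

import numpy as np

x = 0.5
approx = 1 * x**1 + (-1/6) * x**3   # two-term Taylor series of sin(x)
print(approx, np.sin(x))            # 0.4791..., 0.4794...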

Here is my current attempt:

term = 2
c = tf.constant([1, -1/6])
power = tf.constant([1, 3])

x = tf.keras.Input(shape=(32, 32, 3))
res =[]
for x in range(term):
    expansion = c * tf.math.pow(tf.tile(x[..., None], [1, 1, 1, 1, term]),power)
    m_ij = tf.math.cumsum(expansion, axis=-1)
    res.append(m_i)

But this is not quite working, because I want to create input feature maps from each expansion term: delta_1, delta_2, etc. need to be stacked, which I didn't manage correctly in my attempt above, and my code is also not well generalized. How can I fix my implementation? Can anyone give me ideas or a canonical answer to improve my current attempt?

asked Jul 13 '20 by jyson




1 Answer

When doing the series expansion as described, if the input has C channels and the expansion has T terms, the expanded input should have C*T channels and otherwise have the same shape. Thus, the original input and the approximations of the function up to each term should be concatenated along the channel dimension. It is a bit easier to do this with a transpose and reshape than with an actual concatenate.
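For reference, a sketch of the explicit-concatenation version of the same idea (assuming x is a symbolic Keras input of shape (None, 32, 32, 3) and TensorFlow is imported; the transpose/reshape version is what the code below actually uses):

partial_sums = []
running = 0.0
for coeff, power in [(1.0, 1.0), (-1.0/6.0, 3.0)]:
    running = running + coeff * tf.math.pow(x, power)  # partial sum up to this term
    partial_sums.append(running)
expanded = tf.concat(partial_sums, axis=-1)  # shape (None, 32, 32, 3*T)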

Here is example code for a convolutional network trained on CIFAR10:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D

inputs = tf.keras.Input(shape=(32, 32, 3))

x = inputs
n_terms = 2
c = tf.constant([1, -1/6])
p = tf.constant([1, 3], dtype=tf.float32)

terms = []
for i in range(n_terms):
    m = c[i] * tf.math.pow(x, p[i])
    terms.append(m)
expansion = tf.math.cumsum(terms)
expansion_terms_last = tf.transpose(expansion, perm=[1, 2, 3, 4, 0])
x = tf.reshape(expansion_terms_last, tf.constant([-1, 32, 32, 3*n_terms])) 

x = Conv2D(32, (3, 3), input_shape=(32,32,3*n_terms))(x)

This assumes the original network (without expansion) would have a first layer that looks like this:

x = Conv2D(32, (3, 3), input_shape=(32,32,3))(inputs)

and the rest of the network is exactly the same as it would be without expansion.

terms contains a list of c_i * x ^ p_i from the original expansion; expansion contains the partial sums of the terms (the 1st, then 1st + 2nd, etc.) in a single tensor, where T is the first dimension. expansion_terms_last moves the T dimension to be last, and the reshape changes the shape from (..., C, T) to (..., C*T).

The output of model.summary() then looks like this:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_4 (InputLayer)            [(None, 32, 32, 3)]  0                                            
__________________________________________________________________________________________________
tf_op_layer_Pow_6 (TensorFlowOp [(None, 32, 32, 3)]  0           input_4[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_Pow_7 (TensorFlowOp [(None, 32, 32, 3)]  0           input_4[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_Mul_6 (TensorFlowOp [(None, 32, 32, 3)]  0           tf_op_layer_Pow_6[0][0]          
__________________________________________________________________________________________________
tf_op_layer_Mul_7 (TensorFlowOp [(None, 32, 32, 3)]  0           tf_op_layer_Pow_7[0][0]          
__________________________________________________________________________________________________
tf_op_layer_x_3 (TensorFlowOpLa [(2, None, 32, 32, 3 0           tf_op_layer_Mul_6[0][0]          
                                                                 tf_op_layer_Mul_7[0][0]          
__________________________________________________________________________________________________
tf_op_layer_Cumsum_3 (TensorFlo [(2, None, 32, 32, 3 0           tf_op_layer_x_3[0][0]            
__________________________________________________________________________________________________
tf_op_layer_Transpose_3 (Tensor [(None, 32, 32, 3, 2 0           tf_op_layer_Cumsum_3[0][0]       
__________________________________________________________________________________________________
tf_op_layer_Reshape_3 (TensorFl [(None, 32, 32, 6)]  0           tf_op_layer_Transpose_3[0][0]    
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 30, 30, 32)   1760        tf_op_layer_Reshape_3[0][0]      

On CIFAR10, this network trains slightly better with expansion - maybe a 1% accuracy gain (from 71% to 72%).
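If you want more terms, the coefficients and powers can also be generated rather than hard-coded. A sketch of that generalization for the sin(x) series (coefficients (-1)^i / (2i+1)! and odd powers; not part of the network trained above) would be:

from math import factorial

n_terms = 4
c = tf.constant([(-1.0)**i / factorial(2*i + 1) for i in range(n_terms)])
p = tf.constant([float(2*i + 1) for i in range(n_terms)])

terms = [c[i] * tf.math.pow(x, p[i]) for i in range(n_terms)]         # c_i * x ^ p_i
expansion = tf.math.cumsum(terms)                                     # partial sums, shape (T, N, H, W, C)
expansion_terms_last = tf.transpose(expansion, perm=[1, 2, 3, 4, 0])  # move T last
x = tf.reshape(expansion_terms_last, tf.constant([-1, 32, 32, 3 * n_terms]))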

Step by step explanation of the code using sample data:

# create a sample input
x = tf.convert_to_tensor([[1,2,3],[4,5,6],[7,8,9]], dtype=tf.float32) # start with H=3, W=3
x = tf.expand_dims(x, axis=0) # add batch dimension N=1
x = tf.expand_dims(x, axis=3) # add channel dimension C=1
# x is now NHWC or (1, 3, 3, 1)

n_terms = 2 # expand to T=2
c = tf.constant([1, -1/6])
p = tf.constant([1, 3], dtype=tf.float32)

terms = []
for i in range(n_terms):
    # this simply calculates m = c_i * x ^ p_i
    m = c[i] * tf.math.pow(x, p[i])
    terms.append(m)
print(terms)
# list of two tensors with shape NHWC or (1, 3, 3, 1)

# calculate each partial sum
expansion = tf.math.cumsum(terms)
print(expansion.shape)
# tensor with shape TNHWC or (2, 1, 3, 3, 1)

# move the T dimension last
expansion_terms_last = tf.transpose(expansion, perm=[1, 2, 3, 4, 0])
print(expansion_terms_last.shape)
# tensor with shape NHWCT or (1, 3, 3, 1, 2)

# stack the last two dimensions together
x = tf.reshape(expansion_terms_last, tf.constant([-1, 3, 3, 1*2])) 
print(x.shape)
# tensor with shape NHW and C*T or (1, 3, 3, 2)
# if the input had 3 channels for example, this would be (1, 3, 3, 6)
# now use this as though it was the input

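As a sanity check (not part of the original answer), the last partial sum should be close to sin of the input as long as the input values are small, since only two terms are used. Continuing with the c, p and n_terms defined above:

import numpy as np

small = tf.constant([[0.1, 0.2], [0.3, 0.5]], dtype=tf.float32)[None, ..., None]  # NHWC = (1, 2, 2, 1)
terms = [c[i] * tf.math.pow(small, p[i]) for i in range(n_terms)]
expansion = tf.math.cumsum(terms)
last_partial_sum = expansion[-1]  # the full two-term approximation
print(np.allclose(last_partial_sum.numpy(), np.sin(small.numpy()), atol=1e-3))  # True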
Key assumptions:

(1) The c_i and p_i are not learned parameters, so the "expansion neurons" are not actually neurons; they are just multiply-and-sum nodes (although "neurons" sounds cooler :).

(2) The expansion happens for each input channel independently, so C input channels expanded to T terms each produce C*T input features, but the T features from each channel are calculated completely independently of the other channels (it looks like that in the diagram).

(3) The input contains all the partial sums (i.e. c_1 * x ^ p_1, then c_1 * x ^ p_1 + c_2 * x ^ p_2, and so forth) but does not contain the raw terms (again, that is how it looks in the diagram).
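If you instead wanted the raw terms rather than the partial sums (the opposite of assumption 3), the only change needed would be to replace the cumsum with a stack, e.g.:

expansion = tf.stack(terms)  # raw terms c_i * x ^ p_i, shape (T, N, H, W, C), no partial sums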

answered Oct 19 '22 by Alex I