Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

tf.nn.depthwise_conv2d is too slow. is it normal?

I am trying out a recent arxiv work called "Factorized CNN",

which mainly argues that spatially separated convolution (depth-wise convolution), together with channel-wise linear projection(1x1conv), can speed up the convolution operation.

this is the figure for their conv layer architecture

I found out that I can implement this architecture with tf.nn.depthwise_conv2d and 1x1 convolution, or with tf.nn.separable_conv2d.

below is my implementation:

#conv filter for depthwise convolution
depthwise_filter = tf.get_variable("depth_conv_w", [3,3,64,1], initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0/9/32)))
#conv filter for linear channel projection
pointwise_filter = tf.get_variable("point_conv_w", [1,1,64,64], initializer=tf.random_normal_initializer(stddev=np.sqrt(2.0/1/64)))
conv_b = tf.get_variable("conv_b", [64], initializer=tf.constant_initializer(0))
#depthwise convolution, with multiplier 1
conv_tensor = tf.nn.relu(tf.nn.depthwise_conv2d(tensor, depthwise_filter, [1,1,1,1], padding='SAME'))
#linear channel projection with 1x1 convolution
conv_tensor = tf.nn.bias_add(tf.nn.conv2d(conv_tensor, pointwise_filter, [1,1,1,1], padding='VALID'), conv_b)
#residual
tensor = tf.add(tensor, conv_tensor)

This should be around 9 times faster than the original 3x3x64 -> 64 channel convolution.

However, I cannot experience any performance improvement.

I must assume that I am doing this wrong, or there's something wrong with tensorflow's implementation.

Since there is few example using depthwise_conv2d, I am leaving this question here.

Is this slow speed normal? or is there any mistake?

like image 210
Jong Chan Park Avatar asked Sep 07 '16 11:09

Jong Chan Park


People also ask

What is depthwiseconv2d?

Depthwise convolution is a type of convolution in which each input channel is convolved with a different kernel (called a depthwise kernel). You can understand depthwise convolution as the first step in a depthwise separable convolution.

What is separableconv2d?

Separable convolutions consist of first performing a depthwise spatial convolution (which acts on each input channel separately) followed by a pointwise convolution which mixes the resulting output channels.


2 Answers

the current implementation of depthwise conv2d is not fully utilizing the parallel power from GPU, you need to wait for a faster implementation in the future, for example, in caffe, there exists faster third-party impl of this kernel https://github.com/yonghenglh6/DepthwiseConvolution

like image 98
Zaikun Xu Avatar answered Sep 28 '22 19:09

Zaikun Xu


Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity.

https://arxiv.org/pdf/1803.09926.pdf

like image 20
mrgloom Avatar answered Sep 28 '22 18:09

mrgloom