In tensorflow/models/slim/nets, the VGG definition builds its classifier with conv2d layers; the relevant snippet of vgg is below. I'm curious why slim.nets.vgg uses conv2d instead of fully_connected layers, even though the two work out the same way here. Is it for speed? I'd appreciate some explanation. Thanks!
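The classifier part looks roughly like this (paraphrased from tensorflow/models/slim/nets/vgg.py rather than copied verbatim; the wrapper function is mine, added only to keep the excerpt self-contained):

```python
import tensorflow as tf

slim = tf.contrib.slim

def vgg_16_head(net, num_classes=1000, dropout_keep_prob=0.5, is_training=True):
    """Classifier head of VGG-16, written with conv2d as in slim's vgg.py."""
    # 'net' is the 7x7x512 output of the last pooling layer.
    # Use conv2d instead of fully_connected layers.
    net = slim.conv2d(net, 4096, [7, 7], padding='VALID', scope='fc6')
    net = slim.dropout(net, dropout_keep_prob, is_training=is_training, scope='dropout6')
    net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
    net = slim.dropout(net, dropout_keep_prob, is_training=is_training, scope='dropout7')
    net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                      normalizer_fn=None, scope='fc8')
    return net
```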
After thinking about it for a while, I can see at least one reason: it avoids mistakes when converting weights between tensor layouts.
TensorFlow/slim, like other high-level libraries, supports tensors in either the BHWC layout (batch_size, height, width, channel; same notation below), which is the default, or the BCHW layout (used for better performance).
When converting weights between these two layouts, the weights of the first fc layer (the fully connected layer right after the last conv layer), with shape [in_channel, out_channel], have to be reshaped to [last_conv_channel, height, width, out_channel], then transposed to [height, width, last_conv_channel, out_channel], and finally reshaped back to [in_channel, out_channel].
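To make that permutation concrete, here is a minimal NumPy sketch for VGG-16's fc6 weights, assuming a 7x7x512 feature map feeding 4096 units (the shapes and variable names are illustrative, not taken from any particular checkpoint):

```python
import numpy as np

height, width, last_conv_channel, out_channel = 7, 7, 512, 4096
in_channel = height * width * last_conv_channel  # 25088 inputs to fc6

# fc6 weights saved from a model whose features were flattened in
# (channel, height, width) order, i.e. the BCHW convention.
w_bchw = np.random.randn(in_channel, out_channel).astype(np.float32)

# 1) Recover the spatial structure: [last_conv_channel, height, width, out_channel].
w = w_bchw.reshape(last_conv_channel, height, width, out_channel)
# 2) Reorder the axes to the BHWC flattening order:
#    [height, width, last_conv_channel, out_channel].
w = w.transpose(1, 2, 0, 3)
# 3) Flatten back to an ordinary fc weight matrix: [in_channel, out_channel].
w_bhwc = w.reshape(in_channel, out_channel)
```

Forgetting this step produces weights that load without any error but silently give wrong predictions.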
And if you store those layers as conv weights rather than fully connected weights, the spatial structure stays explicit in the kernel shape, so no such manual reordering is needed when switching layouts, which avoids conversion mistakes.
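As a hedged sketch of that point (the call follows slim.conv2d's documented arguments, but treat the exact details as illustrative): the kernel created for the conv version of fc6 keeps its [height, width, in_channels, out_channels] shape no matter which activation layout is used, so the saved weights need no reordering.

```python
import tensorflow as tf

slim = tf.contrib.slim

def fc6_as_conv(net, data_format='NHWC'):
    # The kernel created here has shape [7, 7, 512, 4096] for either
    # data_format ('NHWC' or 'NCHW'), so a checkpoint trained under one
    # layout loads cleanly under the other without touching the weights.
    return slim.conv2d(net, 4096, [7, 7], padding='VALID',
                       data_format=data_format, scope='fc6')
```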