 

patch-wise training and fully convolutional training in FCN

In the FCN paper, the authors discuss patch-wise training and fully convolutional training. What is the difference between these two?

Please refer to section 4.4, attached below.

It seems to me that the training mechanism is as follows. Assume the original image is M*M; then iterate over the M*M pixels to extract N*N patches (where N<M). The iteration stride can be some number like N/3, to generate overlapping patches (a sketch of this extraction is shown below). Moreover, assume each single image yields 20 patches; then we can put these 20 patches, or 60 patches (if we want to use 3 images), into a single mini-batch for training. Is this understanding right? It seems to me that this so-called fully convolutional training is the same as patch-wise training.
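A minimal sketch (my own illustration, not from the paper) of the patch extraction described above, assuming a square M*M image stored as a NumPy array and a stride of roughly N/3:

import numpy as np

def extract_patches(image, patch_size, stride):
    """Return overlapping patch_size x patch_size crops taken at the given stride."""
    patches = []
    h, w = image.shape[:2]
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

# Example: a 96x96 image with 32x32 patches and stride 32 // 3 = 10.
image = np.random.rand(96, 96)
batch = extract_patches(image, patch_size=32, stride=32 // 3)
print(batch.shape)  # (number_of_patches, 32, 32)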

[screenshot of section 4.4, "Patchwise training is loss sampling", from the FCN paper]

asked Mar 06 '17 by user288609



1 Answer

The term "Fully Convolutional Training" just means replacing fully-connected layer with convolutional layers so that the whole network contains just convolutional layers (and pooling layers).

The term "Patchwise training" is intended to avoid the redundancies of full image training. In semantic segmentation, given that you are classifying each pixel in the image, by using the whole image, you are adding a lot of redundancy in the input. A standard approach to avoid this during training segmentation networks is to feed the network with batches of random patches (small image regions surrounding the objects of interest) from the training set instead of full images. This "patchwise sampling" ensures that the input has enough variance and is a valid representation of the training dataset (the mini-batch should have the same distribution as the training set). This technique also helps to converge faster and to balance the classes. In this paper, they claim that is it not necessary to use patch-wise training and if you want to balance the classes you can weight or sample the loss. In a different perspective, the problem with full image training in per-pixel segmentation is that the input image has a lot of spatial correlation. To fix this, you can either sample patches from the training set (patchwise training) or sample the loss from the whole image. That is why the subsection is called "Patchwise training is loss sampling". So by "restricting the loss to a randomly sampled subset of its spatial terms excludes patches from the gradient computation." They tried this "loss sampling" by randomly ignoring cells from the last layer so the loss is not calculated over the whole image.

answered Sep 27 '22 by Juan Terven