
Intuition behind U-net vs FCN for semantic segmentation

I don't quite understand the following:

In the FCN for semantic segmentation proposed by Shelhamer et al., the network makes pixel-to-pixel predictions to construct masks (the exact locations of objects) in an image.

In the slightly modified version of the FCN for biomedical image segmentation, the U-net, the main difference seems to be "a concatenation with the correspondingly cropped feature map from the contracting path."

Now, why does this feature make a difference, particularly for biomedical segmentation? The main differences I can point out between biomedical images and other data sets are that biomedical images do not have as rich a set of features defining an object as common everyday objects do, and that the size of the data set is limited. But is this extra feature inspired by these two facts, or by some other reason?

Jonathan asked May 08 '18 18:05



2 Answers

FCN vs U-Net:

FCN

  1. In its base form (FCN-32s) it upsamples only once, i.e. the decoder has a single layer.
  2. The original implementation (GitHub repo) uses bilinear interpolation to upsample the convolved feature map, so there is no learnable filter in this step.
  3. The FCN-16s and FCN-8s variants add skip connections from lower layers to make the output robust to scale changes.
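
To make point 2 concrete, here is a minimal NumPy sketch of upsampling with a fixed bilinear kernel applied as a transposed convolution. It is only an illustration, not the code from the FCN repo: the function names `bilinear_kernel` and `upsample_fixed` and the toy 3x3 input are my own assumptions. The key idea is that the kernel is computed from a formula and never trained.

```python
import numpy as np

def bilinear_kernel(size):
    """Return a (size, size) bilinear interpolation kernel (fixed, not learned)."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    rows, cols = np.ogrid[:size, :size]
    return (1 - abs(rows - center) / factor) * (1 - abs(cols - center) / factor)

def upsample_fixed(x, stride=2):
    """Upsample a 2-D map by `stride` with a transposed convolution
    whose kernel is the fixed bilinear kernel above."""
    k = 2 * stride - stride % 2          # kernel size for this stride
    kernel = bilinear_kernel(k)
    h, w = x.shape
    out = np.zeros(((h - 1) * stride + k, (w - 1) * stride + k))
    # Transposed convolution: each input pixel stamps a scaled kernel copy.
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * kernel
    pad = (k - stride) // 2              # crop so the output is exactly stride * input
    return out[pad:pad + h * stride, pad:pad + w * stride]

x = np.ones((3, 3))                      # hypothetical low-resolution score map
up = upsample_fixed(x)
print(up.shape)                          # (6, 6)
```

A learnable upsampling layer would have the same structure, but `kernel` would be a trainable parameter instead of the output of `bilinear_kernel`.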

U-Net

  1. It has multiple upsampling layers.
  2. It uses skip connections and concatenates the feature maps instead of adding them.
  3. It uses learnable filters (up-convolutions) instead of a fixed interpolation technique.
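
The difference between adding and concatenating skip connections is easiest to see in the tensor shapes. The following is a rough NumPy sketch under assumed shapes (64 channels, 32x32 maps, channels-first layout); the helper names are hypothetical and not taken from either paper's code.

```python
import numpy as np

def fuse_by_addition(decoder_feat, skip_feat):
    # FCN-16s/8s style: element-wise sum, so both inputs must have the same shape
    # and the channel count stays unchanged.
    return decoder_feat + skip_feat

def fuse_by_concatenation(decoder_feat, skip_feat):
    # U-Net style: stack along the channel axis (axis 0 in a C, H, W layout),
    # so the following convolution sees both sets of features separately.
    return np.concatenate([decoder_feat, skip_feat], axis=0)

rng = np.random.default_rng(0)
decoder_feat = rng.standard_normal((64, 32, 32))  # upsampled decoder features
skip_feat = rng.standard_normal((64, 32, 32))     # encoder features at the same resolution

added = fuse_by_addition(decoder_feat, skip_feat)
concatenated = fuse_by_concatenation(decoder_feat, skip_feat)
print(added.shape)         # (64, 32, 32)
print(concatenated.shape)  # (128, 32, 32)
```

With addition the two sources are merged before the next layer sees them; with concatenation the next convolution can learn how to weight each source.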
shasvat desai answered Oct 20 '22 12:10



U-Net builds on J. Long's FCN paper. One difference is that the original FCN paper used the decoder half to upsample the classification, i.e. the entire second half of the net has depth C, the number of classes.

U-Net treats the second half as operating in feature space and does the final classification at the end.
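
A shape trace may help here. This is a toy NumPy sketch, not either architecture: the channel counts, the nearest-neighbour `upsample2x` stand-in, and the single 1x1 "decoder" layer are all assumptions chosen to keep it short. It only shows where the number of channels drops to C.

```python
import numpy as np

def conv1x1(x, weight):
    # x: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)
    return np.einsum("oc,chw->ohw", weight, x)

def upsample2x(x):
    # Nearest-neighbour stand-in for whatever upsampling the network uses.
    return x.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
num_classes = 3                                    # C, hypothetical
encoder_out = rng.standard_normal((16, 8, 8))      # low-resolution encoder features

# FCN style: classify first, then upsample the C-channel score map.
w_score = rng.standard_normal((num_classes, 16))
fcn_scores = conv1x1(encoder_out, w_score)         # (3, 8, 8): depth is already C
fcn_out = upsample2x(fcn_scores)                   # (3, 16, 16)

# U-Net style: upsample and refine in feature space, classify at the very end.
w_dec = rng.standard_normal((8, 16))
unet_feat = np.maximum(conv1x1(upsample2x(encoder_out), w_dec), 0)  # (8, 16, 16)
w_cls = rng.standard_normal((num_classes, 8))
unet_out = conv1x1(unet_feat, w_cls)               # (3, 16, 16)

print(fcn_scores.shape, fcn_out.shape)   # (3, 8, 8) (3, 16, 16)
print(unet_feat.shape, unet_out.shape)   # (8, 16, 16) (3, 16, 16)
```

Both paths end with a C-channel map at the same resolution; what differs is whether the intermediate decoder tensors carry class scores or general features.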

Nothing about it is specific to biomedical images, IMO.

aivision2020 answered Oct 20 '22 14:10
