I don't quite understand the following:
In the proposed FCN for Semantic Segmentation by Shelhamer et al, they propose a pixel-to-pixel prediction to construct masks/exact locations of objects in an image.
In the slightly modified version of the FCN for biomedical image segmentation, the U-net, the main difference seems to be "a concatenation with the correspondingly cropped feature map from the contracting path."
Now, why does this feature make a difference particularly for biomedical segmentation? The main differences I can point out for biomedical images vs other data sets is that in biomedical images there are not as rich set of features defining an object as for common every day objects. Also the size of the data set is limited. But is this extra feature inspired by these two facts or some other reason?
FCN vs U-Net:
FCN
U-Net
U-Net is built upon J. Long's FCN paper. A couple of differences is that the original FCN paper used the decoder half to upsample the classification (i.e the entire second half of the net is of depth C - number of classes)
U-Net's think of the second half as being in feature space and do the final classification at the end.
Nothing about it is special to bio-medical IMO
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With