Is "semantic segmentation" just a pleonasm, or is there a difference between "semantic segmentation" and "segmentation"? Is it different from "scene labeling" or "scene parsing"?
What is the difference between pixel-level and pixelwise segmentation?
(Side-question: When you have this kind of pixel-wise annotation, do you get object detection for free or is there still something to do?)
Please give a source for your definitions.
"Semantic segmentation" seems to have been used more often recently than "scene labeling".
Semantic segmentation associates every pixel of an image with a class label such as a person, flower, car and so on. It treats multiple objects of the same class as a single entity. In contrast, instance segmentation treats multiple objects of the same class as distinct individual instances.
Semantic segmentation is a dense prediction task that aims to provide a scene representation in which each pixel of an image is assigned a semantic class label.
Semantic segmentation is different from object detection as it does not predict any bounding boxes around the objects. We do not distinguish between different instances of the same object. For example, there could be multiple cars in the scene and all of them would have the same label.
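On the side-question of whether a pixel-wise annotation gives you object detection for free, here is a minimal sketch of one naive way to turn a semantic mask into bounding boxes: take the connected regions of one class. The function name `boxes_from_semantic_mask` and the label values (0 = background, 1 = car) are made up for this example. It also shows the limitation described above: two cars that touch share the same label, so they collapse into a single box.

```python
from collections import deque

def boxes_from_semantic_mask(mask, target):
    """Return bounding boxes (row_min, col_min, row_max, col_max) of the
    4-connected regions labelled `target` in a 2D list of class labels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and mask[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes

# Hypothetical labels: 0 = background, 1 = car.
separated = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]
touching = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]

print(boxes_from_semantic_mask(separated, 1))  # two boxes, one per car
print(boxes_from_semantic_mask(touching, 1))   # one box covering both cars
```

So a semantic mask gets you part of the way to detection, but separating adjacent objects of the same class still needs instance-level information.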
To annotate images for semantic segmentation, outline each object carefully using the pen tool. Make sure the outline ends where it began, so that it encloses the object entirely; the enclosed region is then shaded with a specific color to distinguish the object from nearby ones.
"segmentation" is a partition of an image into several "coherent" parts, but without any attempt at understanding what these parts represent. One of the most famous works (but definitely not the first) is Shi and Malik "Normalized Cuts and Image Segmentation" PAMI 2000. These works attempt to define "coherence" in terms of low-level cues such as color, texture and smoothness of boundary. You can trace back these works to the Gestalt theory.
On the other hand, "semantic segmentation" attempts to partition the image into semantically meaningful parts and to classify each part into one of a set of predetermined classes. You can also achieve the same goal by classifying each pixel (rather than the entire image or segment). In that case you are doing pixel-wise classification, which leads to the same end result by a slightly different path.
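The pixel-wise route can be sketched as follows, assuming some classifier has already produced one score per class for every pixel. The scores, the class names and the helper `label_pixels` are invented for illustration; the point is only that taking the best-scoring class at each pixel yields a full semantic labeling of the image.

```python
def label_pixels(scores, class_names):
    """Pick the highest-scoring class for every pixel.

    scores[r][c] is a list holding one score per class, in the same
    order as class_names.
    """
    return [
        [class_names[max(range(len(px)), key=px.__getitem__)] for px in row]
        for row in scores
    ]

# Hypothetical classes and per-pixel scores for a 2x2 image.
class_names = ["background", "person", "car"]
scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
    [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]],
]

labels = label_pixels(scores, class_names)
for row in labels:
    print(row)
```

The segments are never computed explicitly here; they emerge as the groups of neighboring pixels that end up with the same label.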
So, I suppose you can say that "semantic segmentation", "scene labeling" and "pixelwise classification" are basically trying to achieve the same goal: semantically understanding the role of each pixel in the image. You can take many paths to reach that goal, and these paths lead to slight nuances in the terminology.