 

Segmentation with Single Point Class Annotations via Graph Cuts?

I have a dataset of images that I am trying to segment. For each image in the dataset, experts have randomly selected single pixels/points and annotated which class each pixel belongs to. In other words, each image has about 60 points labeled like so:

x, y, class

How can I best leverage the knowledge of these single pixel annotations to perform a good semantic segmentation?

A similar question was asked before and the response was to use graph cuts:

"hard" supervision in image segmentation with python

Graph cuts seem like a good candidate in theory, but would they work with single-pixel annotations? Furthermore, are there methods that work with a multiclass dataset? If so, is there a good library implementation or some good resources for this?

If not, what method would best fit this scenario? I played a bit with random walks but the resulting segmentation had poor edge localization (extremely rounded edges).

Any help, resources or examples you can give would be very much appreciated (preferably with python libraries, but I can really work with anything).

EDIT: My dataset has about 10 different classes, and each image probably has about 5 of them on average. The annotators are not guaranteed to annotate every region, but it is rare for them to miss one (a few missing or incorrectly labeled regions are tolerable). The classes each correspond to texturally uniform areas, and the textures are fairly constant (think sky, dirt, water, mountain). You can't get texture from a single point, but almost all regions should have multiple annotated points.

Andrew King asked Aug 05 '17

1 Answer

An interesting problem. Since there is no concrete example to work on, I will only outline the algorithmic approaches I would try myself.


Approach #1: use dense descriptors

  • Compute dense image descriptors (e.g., SIFT/HOG/Gabor, or, even better, features from a pre-trained deep net like VGG).
  • Take the descriptors from all images at the annotated locations only: you should have ~10K descriptors with class labels. Train a simple classifier (e.g., SVM) on this set.
  • Go back to the images: apply the classifier and output the log-probability of each pixel belonging to each of the classes. This will be the unary term (aka "data term") for your graph cut.
  • Locally modify the unary term to force the annotated points to belong to the right class.
  • Use a simple pairwise term (image gradients or some edge-based term).
  • Apply graph cut to get the semantic segmentation (see the sketch after this list).
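
A minimal python sketch of this pipeline, assuming the binary (two-class) case: the classifier is an sklearn SVM and the cut is done with PyMaxflow, which handles two labels directly; for your multiclass setting you would wrap the cut in alpha-expansion or use a multi-label wrapper such as gco. The arrays X_train, y_train, feats and annotations, and the smoothness weight lam, are placeholders for your own data:

    import numpy as np
    import maxflow                              # pip install PyMaxflow
    from sklearn.svm import SVC

    # 1. Classifier on descriptors sampled at the annotated points.
    #    X_train: (n_points, d) descriptors, y_train: (n_points,) labels.
    clf = SVC(probability=True)
    clf.fit(X_train, y_train)

    # 2. Unary term: negative log-probability per pixel per class.
    #    feats: (H, W, d) dense descriptor map of one image.
    H, W, d = feats.shape
    probs = clf.predict_proba(feats.reshape(-1, d))   # (H*W, n_classes)
    unary = -np.log(probs + 1e-9).reshape(H, W, -1)

    # 3. Force annotated pixels to their expert label.
    for x, y, c in annotations:                 # (x, y, class) triplets
        unary[y, x, :] = 1e9                    # forbid every class...
        unary[y, x, c] = 0.0                    # ...except the annotated one

    # 4. Binary graph cut with a constant Potts pairwise weight lam.
    def binary_cut(unary0, unary1, lam=10.0):
        g = maxflow.Graph[float]()
        nodes = g.add_grid_nodes(unary0.shape)
        g.add_grid_edges(nodes, lam)            # 4-connected grid smoothness
        g.add_grid_tedges(nodes, unary1, unary0)
        g.maxflow()
        return g.get_grid_segments(nodes)       # True where class 1 wins

    seg = binary_cut(unary[..., 0], unary[..., 1])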

Approach #2: train your own deep semantic segmentation model

In order to train a fully convolutional model for segmentation you don't necessarily need labels for all pixels. You may use an "ignore_label": pixels marked with that label are ignored and do not contribute to the loss.
Your case is an extreme instance of "ignore_label": you only have ~60 labeled pixels per image. Nevertheless, it may be interesting to see what you can learn from such sparse information (see the sketch below).
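
In pytorch, for example, the standard cross-entropy loss already supports this through its ignore_index argument; fill the target map with the ignore value everywhere except the annotated pixels (the value 255 and the names model/images below are just placeholders):

    import torch.nn as nn

    IGNORE = 255                        # any value outside 0..n_classes-1

    # target: (N, H, W) tensor filled with IGNORE everywhere except the
    # ~60 annotated pixels per image, which carry their true class index.
    criterion = nn.CrossEntropyLoss(ignore_index=IGNORE)

    logits = model(images)              # (N, n_classes, H, W)
    loss = criterion(logits, target)    # only labeled pixels contribute
    loss.backward()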

Come to think of it, you have more information per image than just the labeled points:

My dataset has about 10 different classes, each image probably has about 5 on average

Meaning that if an image has labels for classes 1..5, you know it does not contain classes 6..10(!). You may have a "positive term" in the loss that assigns the very few labeled pixels to their right classes, and a "negative term" for all the rest of the pixels that penalizes them if they are assigned to classes not present in the image at all (sketched below).
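
A sketch of such a loss in pytorch, assuming present is a per-image boolean mask of which classes were annotated anywhere in the image; the weight beta balancing the two terms is a made-up hyperparameter you would tune:

    import torch
    import torch.nn.functional as F

    def sparse_pos_neg_loss(logits, target, present, ignore=255, beta=0.1):
        """logits: (N, C, H, W); target: (N, H, W) with `ignore` at all
        unlabeled pixels; present: (N, C) bool, True where the class was
        annotated somewhere in the image."""
        # Positive term: cross-entropy on the few labeled pixels only.
        pos = F.cross_entropy(logits, target, ignore_index=ignore)

        # Negative term: probability mass put on classes absent from the
        # image, averaged over all pixels.
        probs = F.softmax(logits, dim=1)                  # (N, C, H, W)
        absent = (~present).float()[:, :, None, None]     # (N, C, 1, 1)
        neg = (probs * absent).sum(dim=1).mean()

        return pos + beta * neg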

Shai answered Oct 18 '22