I have a dataset of images that I am trying to segment. For each image in the dataset, experts have randomly selected single pixels/points and annotated each one with the class that pixel belongs to. In other words, each image will have about 60 points labeled like this:
x, y, class
How can I best leverage these single-pixel annotations to perform a good semantic segmentation?
A similar question was asked before and the response was to use graph cuts:
"hard" supervision in image segmentation with python
Graph cuts seem like a good candidate in theory, but would they work with single-pixel annotations? Furthermore, are there methods for them to work with a multiclass dataset? If so, is there a good library implementation or some good resources for this?
If not, what method would best fit this scenario? I played a bit with random walker segmentation (roughly as in the sketch below), but the resulting segmentation had poor edge localization (extremely rounded edges).
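For reference, my random walker attempt was roughly along these lines (a minimal sketch using scikit-image's random_walker; the seed construction from my (x, y, class) points is the relevant part):

```python
import numpy as np
from skimage.segmentation import random_walker

def random_walker_from_points(image, points, beta=130):
    """image: (H, W) grayscale array; points: iterable of (x, y, class_id)
    with class ids starting at 1 (0 means 'unlabeled' to random_walker)."""
    seeds = np.zeros(image.shape[:2], dtype=np.int32)
    for x, y, cls in points:
        seeds[int(y), int(x)] = cls  # rows are y, columns are x

    # random_walker handles multiple labels natively; for RGB images pass
    # channel_axis=-1 (scikit-image >= 0.19)
    return random_walker(image, seeds, beta=beta)
```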
Any help, resources or examples you can give would be very appreciated (preferably with python libraries but I can really work with anything).
EDIT: My dataset has about 10 different classes, and each image probably has about 5 on average. The annotators are not guaranteed to annotate every region, but it is rare for one to be missed (a few missing or incorrectly labeled regions are tolerable). The classes each correspond to texturally uniform areas and the textures are fairly constant (think sky, dirt, water, mountain). You can't get texture from a single point, but almost all regions should have multiple annotated points.
An interesting problem. Since there is no concrete example to work on, I will only outline algorithmic approaches I would have tried myself.
In order to train a fully convolutional model for segmentation, you don't necessarily have to have labels for all pixels. You may have an "ignore_label": pixels marked with that label are ignored and do not contribute to the loss.
Your case is an extreme case of "ignore_label" - you only have ~60 labeled pixels per image. Nevertheless, it may be interesting to see what you can learn from such sparse information.
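For example, with PyTorch's cross-entropy loss this is just the ignore_index argument (a minimal sketch; the sparse_target helper and the IGNORE value are my assumptions, built from your (x, y, class) format):

```python
import torch
import torch.nn as nn

IGNORE = 255  # any value outside your class range works

criterion = nn.CrossEntropyLoss(ignore_index=IGNORE)

def sparse_target(points, height, width):
    # Target is IGNORE everywhere except the ~60 annotated pixels,
    # so only those pixels contribute to the loss.
    target = torch.full((height, width), IGNORE, dtype=torch.long)
    for x, y, cls in points:
        target[y, x] = cls
    return target

# logits: (N, C, H, W) from any fully convolutional net
# targets: (N, H, W), stacked from sparse_target(...)
# loss = criterion(logits, targets)
```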
Come to think of it, you have more information per image than just the labeled points:
My dataset has about 10 different classes, each image probably has about 5 on average
Meaning that if an image has labels for classes 1..5, you know it does not contain classes 6..10 (!). You may have a "positive" term in the loss assigning the very few labeled pixels to the right classes, and a "negative" term for all the rest of the pixels that penalizes them for being assigned to classes not present in the image at all.
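A sketch of such a combined loss in PyTorch (the neg_weight hyperparameter and the present_classes mask are assumptions you would derive from each image's annotations):

```python
import torch
import torch.nn.functional as F

IGNORE = 255

def sparse_point_loss(logits, target, present_classes, neg_weight=0.1):
    """logits: (N, C, H, W); target: (N, H, W) with IGNORE at unlabeled pixels;
    present_classes: (N, C) boolean mask of classes annotated in each image."""
    # Positive term: standard cross-entropy on the few labeled pixels
    pos = F.cross_entropy(logits, target, ignore_index=IGNORE)

    # Negative term: probability mass assigned to classes absent from the
    # image, averaged over all pixels
    probs = F.softmax(logits, dim=1)                       # (N, C, H, W)
    absent = (~present_classes).float()[:, :, None, None]  # (N, C, 1, 1)
    neg = (probs * absent).sum(dim=1).mean()

    return pos + neg_weight * neg
```

How strongly to weight the negative term is an empirical question; too large a weight could dominate the handful of labeled pixels in the positive term.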